Vertica Documentation
HPE Vertica Analytics Platform
Software Version: 7.2.x
Document Release Date: 4/21/2016
Legal Notices
Warranty
The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty
statements accompanying such products and services. Nothing herein should be construed as constituting an
additional warranty. HPE shall not be liable for technical or editorial errors or omissions contained herein.
The information contained herein is subject to change without notice.
Restricted Rights Legend
Confidential computer software. Valid license from HPE required for possession, use or copying. Consistent with
FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data
for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.
Copyright Notice
© Copyright 2006 - 2013 Hewlett Packard Enterprise Development LP
Trademark Notices
Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
This product includes an interface of the 'zlib' general purpose compression library, which is Copyright © 1995-2002 Jean-loup Gailly and Mark Adler.
Contents
Supported Platforms
New Features
Installing Vertica
Getting Started
Vertica Concepts
Administrator's Guide
Analyzing Data
Using Flex Tables
Using Management Console
SQL Reference Manual
Security and Authentication
Extending Vertica
Connecting to Vertica
Integrating with Hadoop
Integrating with Apache Kafka
Vertica Place
Vertica Pulse
Best Practices for OEM Customers
Vertica Plug-In for Informatica
Glossary
Third-Party Software Acknowledgements
Supported Platforms
Vertica Server and Vertica Management Console
Supported Operating Systems and Operating System Versions
Hewlett Packard Enterprise supports Vertica Analytics Platform 7.2.x running on the
following 64-bit operating systems and versions on x86_64 architecture.
Important: This document reflects what has been tested with Vertica Analytics
Platform 7.2.x. Be aware that operating system vendors and communities release
updates and patches for all versions of their operating systems on their own
schedules. Such updates to these operating systems may or may not coincide with
the release schedule for Vertica. While other versions of the operating systems listed
may have been successfully deployed by customers in their environments, stability
and performance of these configurations may vary. If you choose to run Vertica on an
operating system version not listed in this document and experience an issue, the
Vertica Support team may ask you to reproduce the issue using one of the
configurations described in this document to aid in troubleshooting. Depending on
the details of the case, the Support team may also ask you to enter a support ticket
with your operating system vendor.
Red Hat Enterprise Linux
- All versions starting at 6.0 up to and including 6.7
- Version 7.0
Important: You cannot perform an in-place upgrade of your current Vertica Analytics Platform from Red Hat Enterprise Linux 6.0 through 6.7 to Red Hat Enterprise Linux 7.0. For information on how to upgrade to Red Hat Enterprise Linux 7.0, see the Migration Guide for Red Hat 7/CentOS 7.
CentOS
- All versions starting at 6.0 up to and including 6.7
- Version 7.0
Important: You cannot perform an in-place upgrade of your current Vertica Analytics Platform from CentOS 6.0 through 6.7 to CentOS 7.0. For information on how to upgrade to CentOS 7.0, see the Migration Guide for Red Hat 7/CentOS 7.
SUSE Linux Enterprise Server
All versions starting at 11.0 up to and including 11.0 SP3.
Oracle Enterprise Linux 6
Red Hat Compatible Kernel only. Vertica is not supported on the Unbreakable kernel (kernels with a uek suffix).
Debian Linux
All versions starting at 7.0 up to and including 7.7.
Ubuntu
- Version 12.04 LTS
- Version 14.04 LTS
When there are multiple minor versions supported for a major operating system release,
Hewlett Packard Enterprise recommends that you run Vertica on the latest minor version
listed in the supported versions list. For example, if you run Vertica on the Debian Linux
7.0 major release, Hewlett Packard Enterprise recommends you use version 7.7.
However, for both Red Hat Enterprise Linux and CentOS, Hewlett Packard Enterprise recommends using any of the following versions for the 6.0 major release: 6.5, 6.6, or 6.7.
Supported File Systems
Vertica Analytics Platform Enterprise Edition has been tested on all supported Linux
platforms running ext3 or ext4 file systems. For the Vertica Analytics Platform I/O profile,
the ext4 file system is considerably faster than ext3.
While other file systems have been successfully deployed by some customers, Vertica
Analytics Platform cannot guarantee performance or stability of the product on these file
systems. In certain support situations, you may be asked to migrate off of these
unsupported file systems to help you troubleshoot or fix an issue. In particular, several
file corruption issues have been linked to the use of XFS with Vertica; Hewlett Packard
Enterprise strongly recommends not using it in production.
Important: Vertica Analytics Platform 7.2.x does not support Linux Logical Volume Manager (LVM).
Supported Browsers for Vertica Management Console
Vertica Analytics Platform 7.2.x Management Console is supported on the following
web browsers:
- Internet Explorer 10 and later
- Firefox 31 and later
- Google Chrome 38 and later
Vertica Server and Management Console Compatibility
Each version of Vertica Analytics Platform Management Console is compatible only
with the matching version of the Vertica Analytics Platform server. For example, the
Vertica Analytics Platform 7.2 server is supported with Vertica Analytics Platform 7.2
Management Console only.
Vertica 7.2.x Client Drivers
Vertica provides JDBC, ODBC, OLE DB, and ADO.NET client drivers. You can choose
to download:
- Linux and UNIX-like platforms: ODBC client driver, client RPM, and vsql client. See Installing the Client Drivers on Linux and UNIX-Like Platforms.
- Windows platforms: ODBC, ADO.NET, and OLE DB client drivers, the vsql client, the Microsoft Connectivity Pack, and the Visual Studio plug-in. See Installing the Client Drivers and Tools on Windows.
- Mac OS X platforms: ODBC client driver and vsql client. See Installing the Client Drivers on Mac OS X.
- All platforms: the cross-platform JDBC client driver .jar file, available for installation on all platforms.
See Vertica Analytics Platform Driver/Server Compatibility to determine which client driver versions are compatible with which server versions.
ADO.NET Driver
The ADO.NET driver is supported on the following platforms:
All platforms require Microsoft .NET Framework 3.5 SP1 or later:
- Microsoft Windows, x86 (32-bit): Windows 7, Windows 8, Windows 10
- Microsoft Windows, x64 (64-bit): Windows 7, Windows 8, Windows 10
- Microsoft Windows Server, x86 (32-bit): 2008, 2008 R2
- Microsoft Windows Server, x64 (64-bit): 2008, 2008 R2, 2012
JDBC Driver
All JDBC drivers are supported on any Java 5-compliant platform. (Java 5 is the
minimum.)
ODBC Driver
Vertica Analytics Platform provides both 32-bit and 64-bit ODBC drivers. Vertica 7.2.x
ODBC drivers are supported on the following platforms:
Microsoft Windows platforms (driver manager: Microsoft ODBC MDAC 2.8):
- Microsoft Windows, x86 (32-bit): Windows 7, Windows 8, Windows 10
- Microsoft Windows, x64 (64-bit): Windows 7, Windows 8, Windows 10
- Microsoft Windows Server, x86 (32-bit): 2008, 2008 R2
- Microsoft Windows Server, x64 (64-bit): 2008, 2008 R2, 2012
Linux and UNIX platforms (driver managers: iODBC 3.52.6 or later, unixODBC 2.3.0 or later, or DataDirect 5.3 and 6.1 or later):
- Red Hat Enterprise Linux, x86_64: 6 and 7
- SUSE Linux Enterprise, x86_64: 11
- Oracle Enterprise Linux (Red Hat Compatible Kernel only), x86_64: 6
- CentOS, x86_64: 6 and 7
- Ubuntu, x86_64: 12.04 LTS and 14.04 LTS
- AIX, PowerPC: 5.3 and 6.1
- HP-UX, IA-64: 11i V3
- Solaris, SPARC: 10
- Mac OS X, x86_64: 10.7, 10.8, and 10.9
Vertica Analytics Platform Driver/Server Compatibility
The following table indicates the Vertica Analytics Platform driver versions that are
supported by different Vertica Analytics Platform server versions.
Note: SHA password security is supported on client driver and server versions 7.1.x
and later.
Client Driver Version Compatible Server Versions
6.1.x 6.1.x, 7.0.x, 7.1.x, 7.2.x
7.0.x 7.0.x, 7.1.x, 7.2.x
7.1.x 7.1.x, 7.2.x
7.2.x 7.2.x
vsql Client
The Vertica vsql client is included in all client packages; it is not available for download
separately. The vsql client is supported on the following platforms:
- Microsoft Windows (x86, x64): Windows 2008 & 2008 R2 (all variants), Windows 2012 (all variants), Windows 7 (all variants), Windows 8.x (all variants), Windows 10
- Red Hat Enterprise Linux 6 (x86, x64)
- Red Hat Enterprise Linux 7 (x86, x64)
- SUSE Linux Enterprise 11 (x86, x64)
- Oracle Enterprise Linux 6, Red Hat Compatible Kernel only (x86, x64)
- CentOS 6 (x86, x64)
- CentOS 7 (x86, x64)
- Ubuntu 12.04 LTS (x86, x64)
- Solaris 10 (x86, x64, SPARC)
- AIX 5.3 and 6.1 (PowerPC)
- HP-UX 11i V3 (IA32, IA64)
- Mac OS X 10.7, 10.8, and 10.9 (x86, x64)
Perl and Python Requirements
You can use Vertica's ODBC driver to connect applications written in Perl or Python to
the Vertica Analytics Platform.
Perl
To use Perl with Vertica, you must install the Perl driver modules (DBI and DBD::ODBC)
and a Vertica ODBC driver on the machine where Perl is installed. The following table
lists the Perl versions supported with Vertica 7.2.x.
- Perl versions: 5.8 and 5.10
- Perl driver modules: DBI driver version 1.609 and DBD::ODBC version 1.22
- ODBC requirements: see Vertica 7.2.x Client Drivers.
Python
To use Python with Vertica, you must install the Vertica Python Client or the pyodbc
module and a Vertica ODBC driver on the machine where Python is installed. The
following table lists the Python versions supported with Vertica 7.2.x:
- Python 2.4.6: pyodbc 2.1.6
- Python 2.7.x: Vertica Python Client (Linux only)
- Python 2.7.3: pyodbc 3.0.6
- Python 3.3.4: pyodbc 3.0.7
For ODBC requirements, see Vertica 7.2.x Client Drivers.
Vertica SDKs
This section details software requirements for running User Defined Extensions (UDxs)
developed using the Vertica SDKs.
C++ SDK
The Vertica cluster does not have any special requirements for running UDxs written in C++.
Java SDK
Your Vertica cluster must have a Java runtime installed to run UDxs developed using
the Vertica Java SDK. HPE has tested the following Java Runtime Environments
(JREs) with this version of the Vertica Java SDK:
- Oracle Java Platform Standard Edition 6 (version number 1.6)
- Oracle Java Platform Standard Edition 7 (version number 1.7)
- OpenJDK 6 (version number 1.6)
- OpenJDK 7 (version number 1.7)
R Language Pack
The Vertica R Language Pack provides version 3.0.0 of the R runtime and associated
libraries for interfacing with Vertica. You install the R Language Pack on the Vertica
server.
Vertica Integrations for Hadoop
Vertica 7.2.x is supported with these Hadoop distributions:
- Cloudera (CDH): 5.2, 5.3, and 5.4
- HortonWorks Data Platform (HDP): 2.2 and 2.3
- MapR: 3.1.1 and 4.0
Vertica Connector for Hadoop MapReduce
HPE provides a module specific to Hadoop for Vertica client machines.
Vertica provides a Connector for Apache Hadoop 2.0.0. The table below details
supported software versions.
- Description: Vertica Connector for Apache Hadoop MapReduce for Hadoop 2.0.0
- Apache Hadoop and Pig combinations: Apache Hadoop 2.0.0 and Pig 0.10.0
- Cloudera distribution versions: Cloudera Distribution Including Apache Hadoop (CDH) 4
Packs, Plug-Ins, and Connectors for Vertica Client Machines
HPE provides the following optional module for Vertica client machines.
Informatica PowerCenter Plug-In
The Vertica plug-in for Informatica PowerCenter is supported on the following platforms:
- Plug-in version: 7.x
- Operating systems: Microsoft Windows (Windows 2003 & 2003 R2, all variants; Windows 2008 & 2008 R2, all variants; Windows 7, all variants), Red Hat Enterprise Linux 5 (32 and 64 bit), Solaris, AIX, HP-UX
- Informatica PowerCenter versions: 9.x
- Vertica versions: 6.x (limited functionality), 7.x (all enhancements)
Vertica on Amazon Web Services
HPE provides a preconfigured AMI for users who want to run Vertica Analytics Platform on Amazon Web Services (AWS). This HPE-supplied AMI allows users to configure their own storage and has been configured for and tested on AWS. This AMI is the officially supported version of Vertica Analytics Platform for AWS.
Note that HPE develops AMIs for Vertica on a slightly different schedule than our
product release schedule. Therefore, AMIs will be available for Vertica releases
sometime following the initial release of Vertica software.
Vertica in a Virtualized Environment
Vertica runs in the following virtualization environment:
Important: Vertica does not support suspending a virtual machine while Vertica is
running on it.
Host
- VMware version 5.5
- The number of virtual machines per host does not exceed the number of physical processors
- CPU frequency scaling turned off at the host level and for each virtual machine
- VMware parameters for hugepages set at version 5.5 defaults
Input/Output
- Measured by vioperf concurrently on all Vertica nodes. When running vioperf, provide the --duration=2min option and start it on all nodes concurrently, as shown in the example after this list.
- 25 megabytes per second per core of write
- 20+20 megabytes per second per core of rewrite
- 40 megabytes per second per core of read
- 150 seeks per second of latency (SkipRead)
- Thick provisioned disk, or pass-through storage
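For example, assuming a default installation that places vioperf in /opt/vertica/bin and a test directory on the database data volume (both paths are placeholders), you might run the following on every node at the same time:
$ /opt/vertica/bin/vioperf --duration=2min /home/dbadmin/io_test
Compare the reported write, rewrite, read, and seek figures against the minimums listed above.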
Network
- Dedicated 10G NIC for each virtual machine
- No oversubscription at the switch layer, verified with vnetperf
Processor
- Sandy Bridge architecture (HP Gen8 or higher)
- 8 or more virtual cores per virtual machine
- No oversubscription
- vcpuperf time of no more than 12 seconds (approximately a 2.2 GHz clock speed)
Memory
- Pre-allocate and reserve memory for the VM
- 4 GB per virtual core of the virtual machines
HPE has tested the configuration above. While other virtualization configurations may have been successfully deployed by customers in development environments, performance of these configurations may vary. If you choose to run Vertica on a different virtualization configuration and experience an issue, the Vertica Support team may ask you to reproduce the issue using the configuration described above, or in a bare-metal environment, to aid in troubleshooting. Depending on the details of the case, the Support team may also ask you to enter a support ticket with your virtualization vendor.
Vertica Integration for Apache Kafka
You can use Vertica with the Apache Kafka message broker. Vertica supports the
following Kafka distributions:
Apache Kafka Versions Vertica Versions
0.8.x 7.2 and later
0.9.0 7.2 SP2
For more information on Kafka integration, refer to How Vertica and Apache Kafka Work
Together.
New Features
New Features and Changes in Vertica 7.2.2
Read the topics in this section for information about new and changed functionality in
Vertica 7.2.2.
Upgrade and Installation
This section contains information on updates to upgrading or installing Vertica Analytics Platform 7.2.2.
More Details
For information see Installing Vertica.
Using the admintools Option --force-reinstall to Reinstall Packages
When you upgrade Vertica and restart your database, Vertica automatically upgrades packages such as the flextable package for flex tables. Vertica issues a warning if it fails to reinstall one or more packages. Reinstallation can fail if, for example, a user enters an incorrect password when restarting the database after the upgrade. You can force installation of packages by issuing the new admintools install_package option, --force-reinstall.
For information on the --force-reinstall option, see Upgrading and Reinstalling
Packages in Installing Vertica.
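For example, the following invocation (the database name and password are placeholders, and the flags assume the usual admintools -t tool pattern) forces reinstallation of the flextable package:
$ admintools -t install_package -d mydb -p 'dbadmin_password' --package flextable --force-reinstall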
Setting the pid_max Before Installation
Vertica now requires that pid_max be set to at least 524288 before installing or
upgrading the database platform. This new requirement ensures that there are enough
pids for all the necessary system and Vertica processes.
For more information, see pid_max Setting
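As a sketch, you can set this kernel parameter on each node as root with sysctl, persisting the value in /etc/sysctl.conf so it survives reboots:
$ sysctl -w kernel.pid_max=524288
$ echo 'kernel.pid_max = 524288' >> /etc/sysctl.conf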
Client Connectivity
This section contains information on updates to connection information for Vertica Analytics Platform 7.2.2.
More Details
For more information see Connecting to Vertica.
Last-Line Record Separator in vsql
You can add a record separator to the last line of vsql output. The -Q command-line option and the \pset option trailingrecordsep enable trailing record separators.
For more information about adding a record separator, please see Command-Line
Options and pset NAME [ VALUE ].
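A minimal sketch (the query is arbitrary; -A and -t are the standard vsql unaligned and tuples-only flags):
$ vsql -Q -A -t -c "SELECT node_name FROM nodes;"
Within an interactive session, \pset trailingrecordsep toggles the same behavior.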
Setting a Client Connection Label After Connecting to Vertica
You can change the client connection label after connecting to Vertica. For more
information about this feature, see Setting a Client Connection Label.
Support for Square-Bracket Query Identifiers with ODBC and
OLE DB Connections
When connecting to a Vertica database with an ODBC or OLE DB connection, some
third-party clients require you to use square-bracket identifiers when writing queries.
Previously, Vertica did not support this type of identifier. Vertica now supports square
bracket identifiers and allows queries from those third-party platforms to run
successfully.
For more information, see:
- Data Source Name (DSN) Connection Parameters
- OLE DB Connection Properties
Security and Authentication
This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.2.2.
More Details
For more information see Security and Authentication.
Row-Level Access Policy
You can create access policies at the row level to restrict access to sensitive information
to only those users authorized to view it.
Along with the existing column access policy, the new row-level access policy provides
improved data security.
For more information see Access Policies.
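For example, the following sketch (table and role names are hypothetical) restricts the rows of a customer table to members of a sales_manager role:
=> CREATE ACCESS POLICY ON public.customer_dim
   FOR ROWS WHERE ENABLED_ROLE('sales_manager')
   ENABLE;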
Data Analysis
This section contains information on updates to data analysis for Vertica Analytics Platform 7.2.2.
More Details
For more information see Analyzing Data.
Machine Learning for Predictive Analytics
Machine Learning for Predictive Analytics is an analytics package that allows you to use
machine learning algorithms on existing data in your Vertica database. Machine
Learning for Predictive Analytics is included with the Vertica server and does not need
to be downloaded separately.
Machine Learning for Predictive Analytics features include:
l In-database predictive modeling for regression problems, using linear regression
l In-database predictive modeling for classification problems, using logistic regression
l In-database data clustering, using k-means
l Evaluation functions to determine the accuracy of your predictive models
l Normalization functions to use for data preparation
For more information see Machine Learning for Predictive Analytics: An Overview.
Query Optimization
This section contains information on updates to query optimization for Vertica Analytics Platform 7.2.2.
More Details
For more information see Optimizing Query Performance.
Controlling Inner and Outer Join Inputs
ALTER TABLE supports the FORCE OUTER option, which lets you control join inputs for
specific tables. You enable this option at the database and session scopes through the
configuration parameter EnableForceOuter.
For details, see Controlling Join Inputs.
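As a sketch (the table name is hypothetical), you might enable the parameter database-wide and then assign a table a FORCE OUTER value of 5:
=> SELECT SET_CONFIG_PARAMETER('EnableForceOuter', 1);
=> ALTER TABLE inventory_fact FORCE OUTER 5;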
DISTINCT Support in Multilevel Aggregation
Multilevel aggregation supports SELECT..DISTINCT and aggregate functions that
specify the DISTINCT option (AVG, COUNT, MIN, MAX, and SUM).
For example, Vertica supports statements such as the following:
SELECT DISTINCT catid, year FROM sales
   GROUP BY ROLLUP(catid, year) HAVING GROUPING(year) = 0;
SELECT catid, SUM(DISTINCT year), SUM(DISTINCT month), SUM(price) FROM sales
   GROUP BY ROLLUP(catid) ORDER BY catid;
Guaranteed Uniqueness Optimization
Vertica can identify columns in a query that are guaranteed to contain unique values, and use them to optimize various query operations. Columns that are defined with any of the following attributes contain unique values:
- Columns that are defined with AUTO_INCREMENT or IDENTITY constraints
- Primary key columns where key constraints are enforced
- Columns that are constrained to unique values, either individually or as a set
- A column that is output from one of the following SELECT statement options: GROUP BY, SELECT..DISTINCT, UNION, INTERSECT, and EXCEPT
Vertica can optimize queries when these columns are included in the following
operations:
- Left or right outer join
- GROUP BY clause
- ORDER BY clause
For example, a view might contain tables that are joined on columns where key
constraints are enforced, and the join is a left outer join. If you query on this view, and
the query contains only columns from a subset of the joined tables, Vertica can optimize
view materialization by omitting the unused tables.
You enable guaranteed uniqueness optimization through the configuration parameter
EnableUniquenessOptimization. You can set this parameter at database and
session scopes. By default, this parameter is set to 1 (enabled).
Tables
This section contains information on updates to tables for Vertica Analytics Platform 7.2.2.
More Details
For more information see Managing Tables.
Changing External Table Column Types
You can change the data types of columns in external tables without deleting and re-creating the table. Because external tables do not store their data in Vertica ROS files, data does not need to be transformed. For details, see Changing Column Data Type.
You can use this feature when working with Hive tables accessed through the HCatalog
Connector, which are external tables. See Synchronizing an HCatalog Schema With a
Local Schema.
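A sketch (table and column names are hypothetical) of widening a VARCHAR column on an external table:
=> ALTER TABLE ext_sales ALTER COLUMN order_id SET DATA TYPE varchar(32);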
Loading Data
This section contains information about loading data in Vertica Analytics Platform 7.2.2.
More Details
For more information see:
Vertica Library for Amazon Web Services
Using Flex Tables
COPY
Vertica Library for Amazon Web Services
The Vertica library for Amazon Web Services (AWS) is a set of functions and
configurable session parameters. These parameters allow you to directly exchange data
between Vertica and Amazon S3 storage without any third-party scripts or programs.
For more information, see the Vertica Library for Amazon Web Services topic.
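A sketch, assuming the library exposes the AWS_SET_CONFIG and S3EXPORT functions as in later Vertica releases (bucket, key values, and table name are placeholders):
=> SELECT AWS_SET_CONFIG('aws_id', '<access key id>');
=> SELECT AWS_SET_CONFIG('aws_secret', '<secret access key>');
=> SELECT S3EXPORT(* USING PARAMETERS url='s3://mybucket/sales.csv') OVER() FROM sales;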
Performance Enhancements for favroparser
This release includes significant performance improvements when loading Avro files
into columnar tables with the favroparser. These improvements do not affect loading
Avro files into flex tables.
For more information about using this parser, see Loading Avro Data and
FAVROPARSER in the Using Flex Tables guide.
Using COPY ERROR TOLERANCE to Treat Each Source Independently
Vertica treats each source independently when loading data using the ERROR TOLERANCE parameter. Thus, if a source has multiple files and one of them is invalid, the load continues, but the invalid file does not load.
For more information about this feature, see ERROR TOLERANCE in the topic
Parameters.
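For example, a sketch (table and file paths are hypothetical) that loads two files and tolerates one of them failing:
=> COPY sales FROM '/data/day1.csv', '/data/day2.csv' DELIMITER ',' ERROR TOLERANCE;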
Management Console
This section contains information on updates to the Management Console for Vertica Analytics Platform 7.2.2.
More Details
For more information see Using Management Console.
Email Alerts in Management Console
Management Console can generate email alerts when high-priority thresholds are
exceeded. When you set a threshold to Priority 1, you can subscribe users to receive
alerts through email when that threshold is exceeded.
Use the Email Gateway tab in MC Settings to enable MC to send email alerts. See Set Up Email.
You can subscribe to email alerts using the Thresholds tab, which appears on your
database's Settings page. See Customizing Message Thresholds.
MC Message Center Notification Menu
In Management Console 7.2.2, you can view a preview of your most recent messages
without navigating away from your current page. Click the Message Center icon in the
top-right corner of Management Console to view the new notification menu. From this
menu, you can delete, archive, or mark your messages as read.
To visit the Message Center, click the Message Center link in the top-right corner of the
notification menu.
For more information, see Monitoring Database Messages in MC.
Syntax Error Highlighting During Query Planning
Management Console can show you query plans in an easy-to-read format on the Query
Plan page. If the query you submit is invalid, Management Console highlights the parts
of your query that might have caused a syntax error. You can immediately identify errors
and correct an invalid query.
For more information about query plans in Management Console, see Managing
Queries in MC.
MC Time Information
Using the MC REST API, you can retrieve the following information about the MC
server:
- The MC server's current time
- The timezone where the MC server is located
For more information, see GET mcTimeInfo.
Timestamp Range Parameters with MC Alerts
You can specify that calls to the MC API for MC alerts, their current status, and database
properties return only those alerts within specific timestamp ranges.
For more information, see GET alerts.
System Table Updates
This section contains information on updates to system tables for Vertica Analytics Platform 7.2.2.
More Details
For more information see Vertica System Tables.
OS User Name Column
The following system tables have been updated to include the CLIENT_OS_USER_
NAME column:
l CURRENT_SESSION
l LOGIN_FAILURES
l SESSION_PROFILES
l SESSIONS
l SYSTEM_SESSIONS
l USER_SESSIONS
This column logs the username of any user who logged into, or attempted to log into, the
database.
KEYWORDS System Table
The V_CATALOG schema includes a new system table, KEYWORDS. You can query
this table to obtain a list of Vertica reserved and non-reserved keywords.
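For example, a sketch of listing reserved keywords (assuming the table's RESERVED column flags them with 'R'):
=> SELECT * FROM v_catalog.keywords WHERE reserved = 'R';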
REMOTE_REPLICATION_STATUS System Table
The V_MONITOR schema includes a new system table, REMOTE_REPLICATION_STATUS. You can query this table to view the status of objects being replicated to an alternate cluster.
TRUNCATED_SCHEMATA System Table
The V_MONITOR schema includes a new system table, TRUNCATED_SCHEMATA.
You can query this table to view the original names of restored schemas that were
truncated due to length.
SQL Functions and Statements
This section contains information on updates to SQL functions and statements for Vertica Analytics Platform 7.2.2.
More Details
For more information see the SQL Reference Manual.
Regular Expression Functions
All of the Vertica regular expression functions support LONG VARCHAR strings. This
capability allows you to use the Vertica regex functions on __raw__ columns in flex or
columnar tables. Using any of the regex functions to pattern match in __raw__ columns
requires casting.
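For example, a sketch (the flex table name is hypothetical) of pattern matching against a __raw__ column with a cast:
=> SELECT REGEXP_COUNT(__raw__::LONG VARCHAR, 'error') FROM mobile_events;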
These are the current functions:
- REGEXP_COUNT
- REGEXP_ILIKE
- REGEXP_INSTR
- REGEXP_NOT_ILIKE
- REGEXP_NOT_LIKE
- REGEXP_LIKE
- REGEXP_REPLACE
- REGEXP_SUBSTR
This Vertica version adds these functions to the documentation:
- REGEXP_ILIKE
- REGEXP_NOT_ILIKE
- REGEXP_NOT_LIKE
Backup, Restore, and Recovery
This section contains information on updates to backup and restore operations for Vertica Analytics Platform 7.2.2.
More Details
For more information see Backing Up and Restoring the Database.
Backup Integrity Check
Vertica can confirm the integrity of your database backups. For more information about
this feature, see Checking Backup Integrity.
Backup Repair
Vertica can reconstruct backup manifests and remove unneeded backup objects. For more information about this feature, see Repairing Backups.
Recover Tables in Parallel
During a recovery, Vertica recovers multiple tables in parallel. For more information on
this feature, see Recovery By Table.
Replicate Tables and Schemas to an Alternate Database
You can replicate Vertica tables and schemas from one database to alternate clusters in
your organization. Using this strategy helps you:
- Replicate objects to a secondary site.
- Move objects between test, staging, and production clusters.
For more information on this feature, see Replicating Tables and Schemas to an Alternate Database.
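Replication runs through the vbr backup utility; a sketch (the configuration file name is a placeholder, and its contents must define the target cluster):
$ vbr.py --task replicate --config-file replicate.ini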
Place
This section contains information on updates to Vertica Place for Vertica Analytics Platform 7.2.2.
More Details
For more information see Vertica Place.
Additional WGS84 Supported Functions
WGS84 support has been added to the following functions:
- ST_Contains
- ST_Disjoint
- ST_Intersects
- ST_Touches
- ST_Within
For information about which data types and data type combinations are supported,
please view the function's reference page.
Exporting Spatial Data to a Shapefile
Vertica supports exporting spatial data as a shapefile. For more information about this
feature, see Exporting Spatial Data from a Table.
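A sketch of the export flow, assuming the STV_SetExportShapefileDirectory and STV_Export2Shapefile functions described in that topic (directory, file, and table names are placeholders):
=> SELECT STV_SetExportShapefileDirectory(USING PARAMETERS path = '/home/dbadmin/shp');
=> SELECT STV_Export2Shapefile(* USING PARAMETERS shapefile = 'cities.shp') OVER() FROM spatial_cities;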
Use STV_MemSize to Optimize Tables for Geospatial Queries
STV_MemSize returns the amount of memory used by a spatial object. When you are
optimizing your table for performance, you can use this function to get the optimal
column width of your data.
For more information about this feature, see STV_MemSize.
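For example, a sketch (table and column names are hypothetical) that finds the widest geometry so you can size the column accordingly:
=> SELECT MAX(STV_MemSize(geom)) FROM spatial_cities;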
Kafka Integration
This section contains information on updates to Kafka integration for Vertica Analytics Platform 7.2.2.
More Details
For more information see Integrating with Apache Kafka.
Apache Kafka 0.9 Support
Vertica supports Apache Kafka 0.9 integration.
Multiple Kafka Cluster Support
Vertica can accept data from multiple Kafka clusters streaming into a single Vertica
instance. For more information, refer to Kafka Cluster Utility Options in Kafka Utility
Options.
Parse Custom Formats
Vertica supports the use of user-defined filters to manipulate data arriving from Kafka. You can apply these filters to data before you parse it. For more information about this feature, see Parsing Custom Formats.
Parse Kafka Messages Without Schema and Metadata
The KafkaAVROParser includes the parameter with_metadata. When set to TRUE, the
KafkaAVROParser parses messages without including object and schema metadata.
For more information about this feature, see Using COPY with Kafka.
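A sketch of a manual load (topic, broker, and table names are placeholders; the KafkaSource stream parameter is assumed to follow the 'topic|partition|offset' convention):
=> COPY avro_events SOURCE KafkaSource(stream='web_hits|0|-2', brokers='kafka01.example.com:9092')
   PARSER KafkaAVROParser(with_metadata=TRUE);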
kafka_clusters System Table
The kafka_config schema includes a new system table, kafka_clusters. You can query
this table to view your Kafka clusters and their constituent brokers.
New Features and Changes in Vertica 7.2.1
Read the topics in this section for information about new and changed functionality in
Vertica 7.2.1.
Supported Platforms
This section contains information on updates to supported platforms for Vertica Analytics Platform 7.2.1.
More Details
For complete information on platform support see Vertica 7.2.x Supported Platforms.
Software Download
For more information on changes to the operating system in the Red Hat 7 release, see
the Red Hat Enterprise Linux 7 documentation.
Windows 10 Support for Client Drivers
Vertica has added Windows 10 support for the following client drivers:
- ADO.NET, both 32 and 64-bit
- JDBC
- ODBC, both 32 and 64-bit
- Vertica vsql
For more information see Vertica 7.2.x Client Drivers.
Security and Authentication
This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.2.1.
More Details
For more information see Security and Authentication.
Authentication Support for Chained Users and Roles
You can now enable authentication for a chain of users and roles, rather than enabling
authentication for each user and role separately.
For more information see Implementing Client Authentication.
Restrict System Tables
The new security parameter RestrictSystemTables prohibits users from accessing
sensitive information from some system tables.
For more information see System Table Restriction.
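Assuming the parameter is set like other configuration parameters (a sketch; consult the linked topic for the exact scope):
=> SELECT SET_CONFIG_PARAMETER('RestrictSystemTables', 1);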
Query Optimization
This section contains information on updates to query optimization for Vertica Analytics Platform 7.2.1.
More Details
For more information see Optimizing Query Performance.
Batch Export of Directed Queries
This release provides two new meta-functions that let you batch export directed queries from one database to another. These tools are useful for saving query plans before a scheduled version upgrade:
- EXPORT_DIRECTED_QUERIES batch exports query plans as directed queries to an external SQL file.
- IMPORT_DIRECTED_QUERIES lets you selectively import query plans that were exported by EXPORT_DIRECTED_QUERIES from another database.
For more information on using these tools, see Batch Query Plan Export.
UTYPE hint
The UTYPE hint specifies how to combine UNION ALL input.
SQL Functions and Statements
This section contains information on updates to SQL functions and statements for Vertica Analytics Platform 7.2.1.
More Details
For more information see the SQL Reference Manual.
THROW_ERROR Function
This release adds the THROW_ERROR function, which allows you to generate arbitrary errors. For more information, see THROW_ERROR.
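For example, a sketch of raising a custom error from SQL:
=> SELECT THROW_ERROR('Input table must not be empty');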
Management Console
This section contains information on updates to the Management Console for Vertica Analytics Platform 7.2.1.
More Details
For more information see Management Console.
Management Console Message Center Redesign
Management Console now brings focus to time-sensitive and high priority messages
with a redesigned Message Center.
The new Recent Messages inbox displays your messages from the past week, while
messages you have read are now archived in the Archived Messages inbox. You can
also click Threshold Messages to view only alerts about exceeded thresholds in your
database.
To help you prioritize the messages you view, Message Center displays the number of
messages categorized as High Priority, Needs Attention, and Informational. You can
click any of these values to filter by that priority.
For more information about Message Center, see Monitoring Database Messages in
MC.
Password Requirements in Management Console
Starting with Vertica 7.2.1, Management Console (MC) passwords must contain 5 to 25 characters. Passwords created in previous versions of MC continue to work regardless of length. If you change a password after updating to 7.2.1, MC enforces the new length requirement.
System Table Updates
This section contains information on updates to system tables for Vertica Analytics Platform 7.2.1.
More Details
For more information see Vertica System Tables.
License Audits System Table
The LICENSE_AUDITS system table column LICENSE_NAME has been renamed to AUDITED_DATA.
Refer to the section LICENSE_AUDITS in the SQL Reference Manual for more information.
Licenses System Table
The LICENSES system table column IS_COMMUNITY_EDITION has been renamed to IS_SIZE_LIMIT_ENFORCED.
Refer to the section LICENSES in the SQL Reference Manual for more information.
SDK Updates
This section contains information on updates to the SDK for Vertica Analytics Platform 7.2.1.
More Details
For more information see the Java SDK Documentation.
Enhancements for C++ User-Defined Function Parameters
This release adds new functionality for C++ user-defined function parameters. For more
information, see:
- Defining the Parameters Your UDx Accepts
- Specifying the Behavior of Passing Unregistered Parameters
- USER_FUNCTION_PARAMETERS
Management Console API Alert Filters
New filters are available when you use the Vertica REST API to retrieve information on alerts configured in Management Console. For information on applying these category and sub-category filters, see:
- Thresholds Category Filter
- Combining Sub-Category Filters with Category Filters
- Database Name Category Filter
Text Search Updates
This section contains information on updates to Text Search for Vertica Analytics Platform 7.2.1.
More Details
For more information see Using Text Search.
Text Search Features
This release adds the following functionality for text search:
- Tokenizers are now polymorphic and can accept any number and type of columns.
- Text indices can now contain multiple columns from their source table.
For more information see Using Text Search.
New Features and Changes in Vertica 7.2.0
Read the topics in this section for information about new and changed functionality in
Vertica 7.2.0.
Supported Platforms
This section contains information on updates to supported platforms for Vertica Analytics Platform 7.2.0.
More Details
For complete information on platform support see Vertica 7.2.x Supported Platforms.
Software Download
For more information on changes to the operating system in the Red Hat 7 release, see
the Red Hat Enterprise Linux 7 documentation.
Support for Red Hat Enterprise Linux 7 and CentOS 7
This release adds support for Red Hat Enterprise Linux 7 and CentOS 7 on HPE Vertica
Analytics Platform and Management Console.
You cannot perform a direct upgrade from Red Hat 6.x to 7 or CentOS 6.x to 7 on a cluster running Vertica Analytics Platform. For information on how to upgrade to Red Hat 7 or CentOS 7, see Migration Guide for Red Hat 7/CentOS 7.
Requirements Before Upgrading or Installing
This section contains updated information on the tasks you must complete before you
install or upgrade Vertica.
Dialog Package
Vertica now requires the dialog package to be installed on all nodes in your cluster
before installing or upgrading the database platform.
See Package Dependencies for more information.
Licensing and Auditing
Vertica 7.2.0 has changed its licensing scheme and the way it audits licenses.
More Details
For complete license information see Managing Licenses.
Vertica Enterprise Edition License Is Now Premium Edition
Enterprise Edition is now Premium Edition.
- If you have a current Enterprise Edition license, it is still valid for use with Vertica 7.2.x.
- The new Premium Edition includes Flex Tables. You no longer need a separate license for Flex Zone. (The new Premium Edition does not include a Flex Table data limit; you can add Flex Table data up to your general license limit. Flex data counts as 1/10th the cost of regular data towards your license limits.)
- Premium Edition includes all Vertica functionality. Hadoop requires a separate license.
For more information see Understanding Vertica Licenses.
Vertica Database Audits
IMPORTANT: The changes in this section became effective with Vertica Release
7.1.2, except where noted.
Vertica has made storage changes related to licensing. As a result, the Vertica database
audit size is calculated differently and is reduced from audit sizes calculated prior to
Vertica Version 7.1.2.
Vertica now computes the effective size of the database based on the export size of the
data.
- Vertica no longer counts a 1-byte delimiter value in the effective size of the database. Instead, the Vertica audit license size is now based solely on the data width.
- Vertica no longer adds a 1-byte value to account for each delimiter. Under the new sizing rules, null values are free. Thus, Vertica audit size may be greatly reduced from the previous version audit size.
- As of Vertica Release 7.0.x, Flex data counts as only 1/10th the cost of non-Flex data.
As a result of these changes, compression ratios show less compression than in previous versions. You can find detailed information on how Vertica calculates database size in the Calculating the Database Size section of the Administrator's Guide.
Client Connectivity
This section contains information on updates to connection information for Vertica Analytics Platform 7.2.0.
More Details
For more information see Connecting to Vertica.
Vertica Client Drivers and Tools for Windows
This release adds a new installer, Vertica Client Drivers and Tools for Windows, for
connecting to Vertica. The installer is packaged as an .exe file. You can run the installer
as a regular Windows installer or silently. The installer is compatible with both 32-bit and
64-bit machines.
The installer contains the following client drivers and tools:
- The ODBC Client Driver for Windows
- The OLE DB Client Driver for Windows
- The vsql Client for Windows
- The ADO.NET Driver for Windows
- The Microsoft Connectivity Pack for Windows
- The Visual Studio Plug-in for Windows
For more information on installing the Client Drivers and Tools for Windows, see The
Vertica Client Drivers and Tools for Windows.
Download the latest Client Drivers and Tools for Windows from my.vertica.com (logon
required).
Multiple Active Result Sets (MARS) Support
Vertica now supports multiple active result sets when you use a JDBC connection. For
more information, see Multiple Active Result Sets (MARS).
VHash Class for JDBC
You can use the VHash class as an implementation of the Vertica built-in hash function
when connecting to your database with a JDBC connection. For more information, see
Pre-Segmenting Data Using VHash.
Binary Transfer with ADO.NET Connections
When connecting between your ADO.NET client application and your Vertica database,
you can now use binary transfer instead of string transfer. See ADO.NET Connection
Properties for more information.
Python Client
Vertica offers a Python client that allows you to interface with your database. For more
information, see Vertica Python Client.
Security and Authentication
This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.2.0.
More Details
For more information see Security and Authentication.
Admintools Remote Calls
Prior to Vertica Analytics Platform 7.2.0, you performed admintools remote calls by using SSH to connect to a remote cluster and then running shell commands. This configuration allowed system database administration users to perform any system action without limitation.
This approach prevented organizations from auditing the complete set of actions that admintools can perform.
Vertica Analytics Platform 7.2.0 addresses this with the following remote Python module:
python -m <command>
Using this module allows organizations to limit the system database administration user to executing Python modules under the .../vertica/engine/api directory.
CFS Security
You can now use the Connector Framework Service (CFS) to ingest indexed HPE IDOL
data securely into the Vertica Analytics Platform. This option allows you to use the
Vertica Analytics Platform to perform analytics on data indexed by HPE IDOL.
The new security features control access to specific documents using:
- Access Control Lists (ACL) indicating which users and groups can access a document.
- Security Information Strings that associate a user/group with a specific ACL.
For more information see Connector Framework Service.
Inherited Privileges
The new inherited privileges feature allows you to grant privileges at the schema level. This approach automatically grants privileges to a new or existing table in the schema.
By using inherited privileges, you can:
- Eliminate the need to apply the same privileges to each individual table in the schema.
- Quickly create new tables that have the necessary privileges for users to perform required tasks.
For more information see Inherited Privileges Overview in the Administrator's Guide.
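A sketch using the syntax from later Vertica releases (treat it as an assumption for this version; schema and role names are placeholders):
=> ALTER SCHEMA analytics DEFAULT INCLUDE PRIVILEGES;
=> GRANT USAGE, CREATE ON SCHEMA analytics TO analyst_role;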
LDAP Link
The new LDAP Link service allows the Vertica server to tightly couple with an existing directory service such as Microsoft Active Directory or OpenLDAP. Using the LDAP Link service, you can specify that the Vertica server synchronize:
- LDAP users to Vertica database users
- LDAP groups to Vertica groups
- LDAP user and group membership to Vertica users and roles membership
Any changes to the LDAP Link directory service are reflected in the Vertica database in
near real time. For example, if you create a new user in LDAP, and LDAP Link is active,
that user identity is sent to the Vertica database upon the next synchronization.
For more information see LDAP Link Service.
SYSMONITOR Role
The new System Monitor (SYSMONITOR) role grants access to specific monitoring
utilities without granting full DBADMIN access. This role allows the DBADMIN user to
delegate administrative tasks without compromising security or exposing sensitive
information. A DBADMIN user has the System Monitor role by default.
For more information see SYSMONITOR Role in the Administrator's Guide.
Database Management
This section contains information on updates to database operations for Vertica Analytics Platform 7.2.0.
More Details
For more information see Managing the Database.
Configuration Parameter ARCCommitPercentage
This parameter sets a threshold percentage of WOS to ROS rows. The specified
percentage determines when to aggregate projection row counts and commit the result
to the Vertica catalog. The default value is 3 (percent).
ARCCommitPercentage provides better control over expensive commit operations, and
helps reduce extended catalog locks.
For more information see General Parameters in the Administrator's Guide.
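For example, a sketch of raising the threshold database-wide:
=> SELECT SET_CONFIG_PARAMETER('ARCCommitPercentage', 5);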
Admintools Debug Option
Vertica has added a --debug option to the admintools command to assist customers
and customer support. When you enable the debug option, Vertica adds additional
information to associated log files.
Note: Vertica often changes the format or content of log files in subsequent releases
to benefit both customers and customer support.
For information on admintools command options, refer to Writing Administration Tools
Scripts in the Administrator's Guide.
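A sketch (assuming the view_cluster tool and a placeholder database name):
$ admintools --debug -t view_cluster -d mydb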
Automatic Eviction of Unresponsive Nodes
This release adds a new capability to detect and respond to unhealthy nodes in a Vertica database cluster.
See Automatic Eviction of Unhealthy Nodes to learn more.
Reduced Catalog Size
Vertica 7.2.0 reduces catalog size by consolidating statistics storage, removing unused statistics, and storing unsegmented projection metadata once per database, instead of once per node. More efficient catalog storage provides the following benefits:
- Smaller catalog footprint
- Better scalability with large clusters, facilitating faster analytics, backup, and recovery
- Reduced overhead and fewer bottlenecks associated with catalog size and usage
For more information see Prepare Disk Storage Locations in Installing Vertica.
Unsegmented Projection Buddies Map to Single Name
Before Vertica 7.2.0, all instances (buddies) of unsegmented projections that were created by CREATE PROJECTION UNSEGMENTED ALL NODES had unique identifiers:
unseg-proj-name_nodeID
In this identifier, nodeID indicated the node that hosted a given projection buddy.
With the redesign of the database catalog in Vertica 7.2.0, a single name now maps to all buddies of an unsegmented projection.
Name Conversion Utility
When you upgrade your database to Vertica 7.2.0, all existing projection names remain unchanged and Vertica continues to support them. You can use the function merge_projections_with_same_basename() to consolidate unsegmented projection names so they conform to the new naming convention. This function takes a single argument, one of the following:
- An empty string specifies to consolidate all unsegmented projection buddy names under their respective projection base names. For example:
=> select merge_projections_with_same_basename('');
merge_projections_with_same_basename
--------------------------------------
0
(1 row)
- The base name of the projection whose buddy names you want to convert.
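For example, using a hypothetical base name:
=> select merge_projections_with_same_basename('clicks_proj');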
Query Optimization
This section contains information on updates to query optimization for Vertica Analytics Platform 7.2.0.
More Details
For more information see Optimizing Query Performance.
Database Designer
Database Designer performance and scalability on large (>100 nodes) clusters has
been significantly improved.
For more information see About Database Designer in the Administrator's Guide.
Optimizer Memory Usage
Memory usage by the database optimizer has been reduced. For more information on the database optimizer, see Optimize Query Performance.
Directed Queries and New Query Hints
Directed queries encapsulate information that the optimizer can use to create a query
plan. Directed queries serve two goals:
- Preserve current query plans before a scheduled upgrade.
- Enable you to create query plans that improve optimizer performance.
For more information about directed queries, see Directed Queries in the Administrator's
Guide.
Directed queries rely on a set of new optimizer hints, which you can also use in vsql
queries. For information about these and other hints, see Hints in the SQL Reference
Manual.
JOIN Performance
Vertica 7.2.0 improves the performance of hash join queries through parallel construction of the hash table.
For more information see Joins in Analyzing Data.
Terrace Routing Reduces Buffering Requirements
Terrace routing is a feature that can reduce the buffer requirements of large queries. Use terrace routing in situations where you have large queries and clusters with a large number of nodes. Without terrace routing, these situations would otherwise require excessive buffer space.
For more information see Terrace Routing in the Administrator's Guide.
Changed Behavior for Live Aggregate/Top-K Projections
Live aggregate and Top-K projections now conform to the behavior of other Vertica
projections, as follows:
- Vertica's database optimizer automatically directs queries that specify aggregation to the appropriate live aggregate or Top-K projection. In previous releases, you could access pre-aggregated data only by querying live aggregate/Top-K projections; now you can query the anchor tables.
- Live aggregate/Top-K projections no longer require anchor projections. Vertica now loads new data directly into live aggregate/Top-K projections.
- Live aggregate/Top-K projections must now explicitly specify a level of K-safety equal to or greater than that configured for the Vertica database. Otherwise, Vertica does not create buddy projections and cannot update projection data.
For more information see Creating Live Aggregate Projections in Analyzing Data.
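For example, a sketch of a live aggregate projection (table and column names are hypothetical) that satisfies a K-safety of 1:
=> CREATE PROJECTION clicks_agg AS
   SELECT page_id, click_time::DATE AS click_date, COUNT(*) AS num_clicks
   FROM clicks GROUP BY page_id, click_time::DATE KSAFE 1;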
Pre-Aggregation of UDx Function Results
You can create live aggregate projections that invoke user-defined transform functions
(UDTFs). Vertica processes these functions in the background and stores their results
on disk, to minimize overhead when you query those projections. For more information,
see Pre-Aggregating UDTF Results in Analyzing Data.
A projection can also specify a user-defined scalar function like any other expression.
When you load data into this projection, Vertica stores the function result set for faster
access. See Support for User-Defined Scalar Functions in Analyzing Data.
Improved Queries in Flex Views
This release supports flex query rewriting more broadly: rewriting occurs anytime a __raw__ column is present. With this new functionality, a view is considered a flex view if it includes a __raw__ column, and thus supports rewriting.
For more information see Querying Flex Views.
JIT Support for Regular Expression Matching
Vertica has upgraded the PCRE library to version 8.37. This upgrade includes Just in
Time (JIT) compilation for the functions used in regular expression matching in SQL
queries. For more information, see the parameter PatternMatchingUseJIT in General Parameters.
Loading Data
This section contains information about loading data in flex tables, and using Kafka integration for Vertica Analytics Platform 7.2.0.
Vertica 7.2.0 includes two new parsers for flex tables:
- Avro file parser, favroparser
- CSV (comma-separated values) parser, fcsvparser
This release also extends flex view and __raw__ column queries. Whenever you query
a flex table, Vertica calls maplookup() internally to include any virtual columns. In this
release, querying flex views, or any table with a __raw__ column, also invokes
maplookup() internally to improve query results.
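For example, selecting a virtual column from a hypothetical flex table invokes maplookup() behind the scenes:
=> SELECT "user.lang" FROM flex_web_events LIMIT 5;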
More Details
For more information see Integrating with Apache Kafka and Understanding Flex Tables.
Flex Parsers
Vertica 7.2.0 introduces two new parsers.
Avro Parser
Use favroparser to load Avro files into flex and columnar tables.
For flex tables, this parser supports files with primitive and complex data types, as
described in the Apache Avro 1.3 specification.
CSV Parser
Use the flex table CSV parser, fcsvparser, to load standard and modified CSV files into flex and columnar tables.
For more information, see Flex Parsers Reference.
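For example, a sketch (table name and file path are hypothetical):
=> CREATE FLEX TABLE csv_basic();
=> COPY csv_basic FROM '/home/dbadmin/sample.csv' PARSER fcsvparser();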
Vertica and Apache Kafka Integration
Vertica 7.2.0 introduces the ability to integrate with Apache Kafka. This feature allows
you to stream data from a Kafka message bus directly into your Vertica database. It uses
the job scheduler feature included in the Vertica rpm package.
For more information, see Kafka Integration Guide.
Management Console
This section contains information on updates to the Management Console for Vertica Analytics Platform 7.2.0.
More Details
For more information see Management Console.
Connect to HPE IDOL Dashboard from Management Console
Vertica 7.2.0 Management Console introduces the ability to create a mutual link between
your Vertica Management Console and HPE IDOL dashboards. The new HPE IDOL
button in MC displays the number of alerts you have in HPE IDOL. The button provides
a clickable shortcut to your HPE IDOL dashboard. Point MC to your HPE IDOL
dashboard in MC Settings.
For more information see Connecting to HPE IDOL Dashboard in Using Management
Console.
External Data Sources for Management Console Monitoring
You can now use Management Console (MC) to monitor Data Collector information
copied into Vertica tables, locally or remotely.
In the MC Settings page, provide mappings to local schemas or to an external database
that contains the corresponding DC data. MC can then render its charts and graphs from
the new repository instead of from local DC tables. This offers the benefit of loading
larger sets of data faster in MC, and retaining historical data long term.
Administrators can configure MC to monitor an external data source using the new Data
Source tab on the MC Settings page.
See Monitoring External Data Sources in Management Console.
Configure Resource Pools Using Management Console
In this release, Management Console introduces more ways to configure your resource
pools. With Management Console you can now create and remove resource pools,
assign resource pool users, and assign cascading pools.
Database administrators can make changes to a database's resource pools in the
Resource Pools Configuration page, accessible through the database's Settings page.
See Configuring Resource Pools in Management Console
Threshold Monitoring Enhancements in Management Console
Vertica Management Console 7.2.x introduces more detailed, configurable notifications
about the health of monitored databases.
Prioritize and customize your notifications and alert thresholds on the new Thresholds
tab, which appears on the Database Settings page. The new Threshold Settings widget
on the Overview Page now displays your prioritized alerts.
For more information, see Monitoring Database Messages in MC and Customizing
Message Thresholds.
Management Console Message Center Enhancements
In this release, Management Console introduces filtering and performance
enhancements to the Message Center that allow you to:
l View up to 10,000 messages by default
l Retrieve additional alerts from the past
l Use console.properties to increase the number of messages you can view in
Message Center
l Delete all your alerts at once
In addition, improvements to filtering now allow you to sort messages by severity,
database name, message description, and date.
For more information about messages in Management Console, see Monitoring
Database Messages in MC.
Management Console REST API
Vertica now provides API calls to interact with Management Console.
l GET alerts
l GET alertSummary
System Table Updates
This section contains information on updates to System Tables for Vertica Analytics
Platform 7.2.x.
More Details
For more information see Vertica System Tables.
System Tables for Constraint Enforcement
These system tables, under the V_CATALOG schema, include the new column
IS_ENABLED:
l CONSTRAINT_COLUMNS
l TABLE_CONSTRAINTS
l PRIMARY_KEYS
Under the V_CATALOG schema, the PROJECTIONS system table includes the new
column IS_KEY_CONSTRAINT_PROJECTION.
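For example, the following queries inspect the new columns:
=> SELECT constraint_name, is_enabled FROM v_catalog.table_constraints;
=> SELECT projection_name, is_key_constraint_projection
   FROM v_catalog.projections;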
RESOURCE_POOL_MOVE System Table
Vertica 7.2.x includes the following changes to the RESOURCE_POOL_MOVE system
table:
l The table includes the new MOVE_CAUSE column. This column displays the reason
why the query attempted to move.
l The CAP_EXCEEDED column was removed.
l The REASON column is now called RESULT_REASON.
For more information see RESOURCE_POOL_MOVE in the SQL Reference Manual.
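For example, assuming the table resides in the V_MONITOR schema:
=> SELECT move_cause, result_reason FROM v_monitor.resource_pool_move;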
LICENSES System Table
The system table LICENSES, under the V_CATALOG schema, includes new columns:
l LICENSETYPE
l PARENT
l CONFIGURED_ID
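For example:
=> SELECT licensetype, parent, configured_id FROM v_catalog.licenses;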
TABLE_RECOVERIES System Table
Vertica 7.2.x now includes the TABLE_RECOVERIES system table. You can query this
table to view detailed progress on specific tables during a Recovery By Table.
TABLE_RECOVERY_STATUS System Table
Vertica 7.2.x now includes the TABLE_RECOVERY_STATUS system table. You can
query this table to view the progress of a Recovery By Table.
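For example, assuming both tables reside in the V_MONITOR schema:
=> SELECT * FROM v_monitor.table_recovery_status;
=> SELECT * FROM v_monitor.table_recoveries;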
SQL Functions and Statements
This section contains information on updates to SQL Functions and Statements for
Vertica Analytics Platform 7.2.x.
More Details
For more information see the SQL Reference Manual.
Analytic Functions
Vertica now includes the NTH_VALUE analytic function.
NTH_VALUE is an analytic function that returns the value evaluated at the row that is
the nth row of the window (counting from 1).
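For example, the following sketch (table and column names are hypothetical) returns
the second-highest sale per region:
=> SELECT region, sales,
          NTH_VALUE(sales, 2) OVER (PARTITION BY region ORDER BY sales DESC)
          AS second_highest
   FROM regional_sales;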
For more information see Analytic Functions in SQL Reference Manual.
Math Functions
Vertica now includes the following mathematical functions:
l COSH—Calculates the hyperbolic cosine.
l LOG10—Calculates the base 10 logarithm.
l SINH—Calculates the hyperbolic sine.
l TANH—Calculates the hyperbolic tangent.
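For example:
=> SELECT COSH(1.0), SINH(1.0), TANH(1.0), LOG10(1000);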
Options for Routing Queries
Vertica 7.2.x introduces new functionality for moving queries to different resource pools.
Now, as the database administrator, you can use the MOVE_STATEMENT_TO_
RESOURCE_POOL meta-function to specify that queries move to different resource
pools mid-execution.
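For example, the following sketch (the identifier values are placeholders) finds a
running statement in the SESSIONS system table and moves it to another pool:
=> SELECT session_id, transaction_id, statement_id FROM v_monitor.sessions;
=> SELECT MOVE_STATEMENT_TO_RESOURCE_POOL('v_mydb_node0001-103:0x24a',
          45035996273705748, 1, 'my_target_pool');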
For more information see Manually Moving Queries to Different Resource Pools.
Session Resource Functions
Vertica now includes new resource management functions:
l RESERVE_SESSION_RESOURCE
l RELEASE_SESSION_RESOURCE
Automatic Enforcement of Primary and Unique Key Constraints
Vertica can now automatically enforce primary and unique key constraints. Additionally,
you can enable individual constraints using CREATE TABLE or ALTER TABLE.
You also have the option of setting parameters so that new constraints you create are,
by default, disabled or enabled when you create them. If you have not specifically
enabled or disabled constraints using CREATE TABLE or ALTER TABLE, the
parameter default settings apply.
For information on automatic enforcement of PRIMARY and UNIQUE key constraints,
refer to Enforcing Primary and Unique Key Constraints Automatically in the
Administrator's Guide.
When you upgrade to Vertica 7.2.x, the primary and unique key constraints in any tables
you carry over are disabled. Existing constraints are not automatically enforced. To
enable existing constraints and make them automatically enforceable, manually enable
each constraint using the ALTER TABLE ALTER CONSTRAINT statement. This
statement triggers constraint enforcement for the existing table contents. Statements roll
back if one or more violations occur.
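For example, the following sketch enables a constraint at creation time and enables an
existing constraint. It assumes the auto-generated constraint name C_PRIMARY; the
table and column names are hypothetical:
=> CREATE TABLE customers (id INT PRIMARY KEY ENABLED, name VARCHAR(50));
=> ALTER TABLE customers ALTER CONSTRAINT C_PRIMARY ENABLED;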
Enabling and Disabling Individual Constraints
Two new modifiers allow you to set enforcement of individual constraints: ENABLED
and DISABLED.
To enable or disable individual constraints, use the CREATE TABLE or ALTER TABLE
statements. These syntaxes now include ENABLED and DISABLED options for
PRIMARY and UNIQUE keys:
l Column-Constraint (as part of the CREATE TABLE statement)
l Table-Constraint (as part of CREATE TABLE or ALTER TABLE statements)
The ALTER TABLE statement also includes an ALTER CONSTRAINT option for
enabling or disabling existing constraints.
Choosing Default Enforcement for Newly Declared or Modified Constraints
Two new parameters allow you to set the default for enabling or disabling newly created
constraints. You set these parameters using the ALTER DATABASE statement. Setting
a constraint as enabled or disabled when you create or alter it using CREATE TABLE
or ALTER TABLE overrides the parameter setting. The default value for both of these
new parameters is false (disabled).
l EnableNewPrimaryKeysByDefault lets you enable or disable constraints for
primary keys.
l EnableNewUniqueKeysByDefault lets you enable or disable constraints for unique
keys.
For general information about configuration parameters, refer to Configuration
Parameters in the Administrator's Guide. For information about these new parameters
and how to set them, refer to Constraint Enforcement Parameters in the Administrator's
Guide and ALTER DATABASE in the SQL Reference Manual.
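For example, assuming a database named mydb:
=> ALTER DATABASE mydb SET EnableNewPrimaryKeysByDefault = 1;
=> ALTER DATABASE mydb SET EnableNewUniqueKeysByDefault = 1;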
Behavior Not Changed from Previous Releases
NOT NULL constraints are always automatically enforced for primary keys. When you
create a primary key, Vertica implicitly creates a NOT NULL constraint on the key set. It
does so regardless of whether you enable or disable the key.
You can manually validate constraints using the ANALYZE_CONSTRAINTS meta-function.
ANALYZE_CONSTRAINTS does not depend upon, nor does it consider, the automatic
enforcement settings of primary or unique keys. Thus, you can run ANALYZE_
CONSTRAINTS on a table or schema that includes:
l Disabled key constraints
l A mixture of enabled and disabled key constraints
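For example, assuming a hypothetical table public.customers:
=> SELECT ANALYZE_CONSTRAINTS('public.customers');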
Recover by Table Functions
Vertica now includes new functions to configure and perform recovery on a per-table
basis:
l SET_RECOVER_BY_TABLE
l SET_TABLE_RECOVER_PRIORITY
Backup, Restore, and Recovery
This section contains information on updates to backup and restore operations for
Vertica Analytics Platform 7.2.x.
More Details
For more information see Backing Up and Restoring the Database.
Backup to Local Host
Vertica does not support using the hostname localhost in backup mappings. However,
you can direct backups to the local host without using an IP address.
Direct a backup to a location on the localhost by including square brackets and a path in
the following form:
[Mapping]
NodeName = []:/backup/path
This example shows typical localhost mapping:
[Mapping]
v_node0001 = []:/scratch_drive/archive/backupdir
v_node0002 = []:/scratch_drive/archive/backupdir
v_node0003 = []:/scratch_drive/archive/backupdir
For more information see Types of Backups.
Restoring Individual Objects from a Full or Object-Level Backup
You can now restore individual tables or schemas from any backup that contains those
objects without restoring the entire backup. This option is useful if you only need to
restore a few objects and want to avoid the overhead of a larger scale restore. Your
database must be running and your nodes must be UP to restore individual objects.
For more information see Restoring Individual Objects from a Full or Object-Level
Backup.
Lightweight Partition Copy
Vertica now includes the COPY_PARTITIONS_TO_TABLE function.
Lightweight partition copy increases performance by sharing the same storage between
two tables. The storage footprint does not increase as a result of shared storage. After
the partition copy is complete, the tables are independent of each other. Users can
perform operations on each table without impacting the other. As the tables diverge, the
storage footprint may increase as a result of operations performed on these tables.
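For example, the following sketch (the table names and the partition key range are
hypothetical) copies the 2015 partition of a sales table into a snapshot table:
=> SELECT COPY_PARTITIONS_TO_TABLE('sales', '2015', '2015', 'sales_snapshot');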
Object Restore Mode
You can now specify how Vertica should handle restored objects by setting the object
restore mode (coexist, createOrReplace, or create); the default is createOrReplace.
Vertica supports the following object restore modes:
l createOrReplace (default) — Vertica creates any objects that do not exist. If the object
does exist, vbr overwrites it with the version from the archive.
l create — Vertica creates any objects that do not exist. If an object being restored
does exist, Vertica displays an error message and skips that object.
l coexist — Vertica creates all restored objects with the form <backup>_
<timestamp>_<object_name>. This approach allows existing and restored objects
to exist simultaneously.
In all modes, Vertica restores data with the current epoch. Object restore mode settings
do not apply to backups and full restores.
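For example, you might set the mode in your vbr configuration file as follows (a
sketch; placing the setting in the [Misc] section is an assumption):
[Misc]
objectRestoreMode = coexist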
For more information see Restoring Object-Level Backups.
Recovery by Table
Vertica now supports node recovery on a per-table basis. Unlike a node-based recovery,
recovering by table makes tables available as they recover, before the node itself is
completely restored. You can prioritize your most important tables so that they become
available as soon as possible. Recovered tables support all DDL and DML operations.
After a node fully recovers, full Vertica functionality is restored.
Recovery by table is enabled by default.
For more information see Recovery By Table and Prioritizing Table Recovery.
Hadoop Integration
This section contains information on updates to Hadoop-integration information for
Vertica Analytics Platform 7.2.x.
More Details
For more information see Integrating with Hadoop.
Hadoop HDFS Connector
The HDFS Connector is now installed with Vertica; you no longer need to download and
install it separately. If you have previously downloaded and installed this connector,
uninstall it before you upgrade to this release of Vertica to get the newest version.
For more information see Using the HDFS Connector in Integrating with Hadoop.
Place
This section contains information on updates to Vertica Place for Vertica Analytics
Platform 7.2.x.
More Details
For more information see Vertica Place.
WGS84 Support
WGS84 support has been added to the following functions:
l STV_Intersect Transform Function
l STV_Intersect Scalar Function
l ST_Distance
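For example, the following sketch computes a WGS84 distance between two points
(the coordinates are arbitrary, and the availability of ST_GeographyFromText in your
Place installation is an assumption):
=> SELECT ST_Distance(
          ST_GeographyFromText('POINT(-71.06 42.36)'),
          ST_GeographyFromText('POINT(-73.99 40.71)'));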
Vertica Place Functions
The following new functions have been added to Vertica Place:
l STV_AsGeoJSON
l STV_ForceLHR
l STV_Reverse
STV_Refresh_Index Removes Polygons From Spatial Indexes
STV_Refresh_Index can now remove deleted polygons from spatial indexes. For more
information, see STV_Refresh_Index.
Vertica Pulse
This section contains information on updates to Pulse for Vertica Analytics Platform
7.2.x.
More Details
For more information see Vertica Pulse.
Action Patterns
Vertica Pulse now supports the use of action patterns in white_list dictionaries. An
action pattern enables Pulse to recognize phrases that denote action, intention, or
interest, such as going to buy, waiting to see, and so on. Action patterns can identify
behaviors associated with your sentiment analysis terms.
Action patterns can:
l Connect Word Forms to a Root Word — Vertica Pulse lemmatizes all words.
Lemmatization recognizes different word forms and maps them to the root word. For
example, Pulse would map bought and buying to buy. This ability extends to
misspellings. For example, tryiiiing and seeeeeing taaablets would map to trying and
seeing tablets.
l Create Object-Specific Queries — To identify only the attributes that are objects of
action patterns, create a whitelist dictionary that contains only action patterns of
interest. In your sentiment analysis query, set the actionPattern and whiteListOnly
parameters to true.
Concurrent User-Defined Dictionaries
In version 7.2.x and later, users can apply dictionaries on a per-user basis. Any number
of Pulse users can concurrently apply different sets of dictionaries without conflicts and
without disrupting the sessions of other users. Each user can have one dictionary of
each type loaded at any given time. If a user does not specify a dictionary of a given
type, Pulse uses the default dictionary for that type.
For more information see Dictionary and Mapping Labels in Vertica Pulse.
Case-Sensitive Sentiment Analysis
By default, Pulse is case insensitive. ERROR produces the same results as error. You
can now specify a case setting for a single word using the $Case parameter. For
example, to identify Apple, rather than apple, you would add the following:
=> INSERT INTO pulse.white_list_en VALUES('$Case(Apple)');
=> COMMIT;
For more information see Sentiment Analysis Levels in Vertica Pulse.
Dictionary and Mapping Labels
You can apply a label to any user-defined dictionary or mapping when you load that
object. Labels enable you to perform sentiment analysis against a predetermined set of
dictionaries and mappings without having to specify a list of dictionaries. For example,
you might have a set of dictionaries labeled "music" and a set labeled "movies." The
default user dictionaries automatically have a label of "default."
A single dictionary or mapping can have multiple labels. For example, you might label a
white list of artists as both "painters" and "renaissance." You could load the dictionary by
loading either label. A label can only apply to one dictionary of each type. For example,
you cannot have two dictionaries of stop words that share the same label. If you apply a
label to multiple dictionaries of the same type, Pulse uses the most recently applied
label.
You can view the labels associated with your current dictionaries using the
GetAllLoadedDictionaries() function. You can also view the label associated with your
current mapping using the GetLoadedMapping() function.
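For example, assuming both functions are called without arguments:
=> SELECT GetAllLoadedDictionaries();
=> SELECT GetLoadedMapping();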
For more information, see Dictionaries and Mappings in Vertica Pulse.
Pulse Functions
Vertica 7.2.x includes the following new functions:
l UnloadLabeledDictionary()
l UnloadLabeledDictionarySet()
l UnloadLabeledMapping()
SDK Updates
This section contains information on updates to the SDK for Vertica Analytics Platform
7.2.x.
More Details
For more information see the Java SDK Documentation.
SDK Enhancements
Vertica 7.2.x introduces these enhancements to the Java SDK:
l A new class named StringUtils helps you manipulate string data. See The
StringUtils Class for more information.
l The new PartitionWriter.setStringBytes method lets you set the value of
BINARY, VARBINARY, and LONG VARBINARY columns using a ByteBuffer
object. See the Java UDx API documentation for more details.
l The PartitionWriter class has a new set of methods for writing output including
setLongValue, setStringValue, and setBooleanValue. These methods set the
output column value to NULL when passed a Java null reference. When you pass
these methods a value, they save the value in the column. These methods save you
the steps of checking for null references and calling the separate methods to store
nulls or values in the columns. For more information, see the entry for
PartitionWriter in the Java API documentation.
l The StreamWriter class used with a User-Defined Parser has a new method,
setRowFromMap. You can use this method to write a map of column-name/value pairs
as a single operation with automatic type coercion. The JSON Parser example
demonstrates this method. For more information, see UDParser and ParserFactory
Java Interface, particularly the section titled "Writing Data".
l User-Defined Analytic Functions can now be written in Java in addition to C++. See
Developing a User-Defined Analytic Function in Java.
l Multi-phase transform functions can now be written in Java in addition to C++. See
Creating Multi-Phase UDTFs.
For more information see the Java SDK Documentation.
UDx Wildcards
Vertica now supports the wildcard character * in place of column names in user-defined
functions.
You can use wildcards when:
l Your query contains a table in the FROM clause
l You are using a Vertica-supported development language
l Your UDx is running in fenced or unfenced mode
For more information see Using User-Defined Extensions.
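For example, the following sketch passes all columns of a table to a hypothetical
UDSF named add2ints:
=> SELECT add2ints(*) FROM two_int_columns;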
User-Defined Session Parameters
This release adds support for user-defined session parameters. Vertica now supports
passing session parameters to a Java or C++ UDx at construction time.
For more information, see User-Defined Session Parameters and User-Defined Session
Parameters in Java.
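For example, assuming a UDx library named mylib and a parameter named rowcount
(the exact ALTER SESSION syntax shown here is an assumption):
=> ALTER SESSION SET UDPARAMETER FOR mylib rowcount = 25;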
Documentation Updates
This section contains information on updates to the product documentation for Vertica
Analytics Platform 7.2.x.
More Details
For complete product documentation, see the HPE Vertica Documentation.
Documentation Changes
The following changes are effective for Vertica 7.2.x.
Document Additions and Revisions
The following additions and revisions have been made to the Vertica 7.2.x product
documentation.
l Extending Vertica has been reorganized to reduce redundancy and make information
easier to find. Specifically:
n The documentation for each type of User-Defined Extension first presents
information that is true for all implementation languages and then presents
language-specific information for C++ and Java.
n All information about developing UDxs, whether general (such as packaging
libraries) or about specific APIs, is now found under Developing User-Defined
Extensions (UDxs).
n The documentation for each type of UDx follows the same structure: requirements,
class overview, any implementation topics related specifically to that type of UDx,
deploying, and examples.
n Developing UDSFs and UDTFs in R remains a separate section.
l Vertica Concepts has been moved in the Table of Contents. It now appears after New
Features and before Installing Vertica. It also has been reorganized and updated to
reflect improvements to Vertica.
l The contents of the book Vertica for SQL on Hadoop have been folded into
Integrating with Hadoop, which now explains both co-located and separate clusters.
The book has been reorganized to make information easier to find and eliminate
redundancies. The ORC Reader has been made more prominent.
For more information, see Integrating with Hadoop, and in particular Cluster Layout
and Choosing Which Hadoop Interface to Use.
l A new Security and Authentication guide consolidates all client/server authentication
and security topics. Authentication information was removed from the Administrator's
Guide and placed in the new document.
l A standalone Management Console guide has been created. Management Console
topics have been removed from the Administrator's Guide. Management Console topics
remain in Installing Vertica (Installing and Configuring Management Console) and
Getting Started (Using Management Console).
l A new Hints section in the SQL Reference Manual describes all supported query
hints.
l A new Best Practices for DC Table Queries section in Machine Learning for
Predictive Analytics describes how to optimize query performance when querying
Data Collector Tables.
l A new Integrating with Apache Kafka document describes how to integrate Apache
Kafka with Vertica.
l Documentation for the Microsoft Connectivity Pack now resides in the Connecting to
Vertica document. See The Microsoft Connectivity Pack for Windows.
Removed from Documentation
The following documentation elements were removed from the Vertica 7.2.x product
documentation.
l Documentation on partially sorted GROUP BY has been removed.
l The VSQL environment variables page (vsql Environment Variables) listed VSQL_
DATABASE and SHELL. These environment variables are no longer in use and have
been removed from the documentation.
l The previously deprecated function MERGE_PARTITIONS was removed from the SQL
Reference Manual.
Deprecated and Retired Functionality
This section describes the two phases HPE follows to retire Vertica functionality:
l Deprecated. Vertica announces deprecated features and functionality in a major or
minor release. Deprecated features remain in the product and are functional.
Documentation is included in the published release documentation. Accessing the
feature can result in informational messages noting that the feature will be removed in
the following major or minor release. Vertica identifies deprecated features in this
document.
l Removed. HPE removes a feature in the major or minor release immediately
following the deprecation announcement. Users can no longer access the
functionality. Vertica announces all feature removal in this document. Documentation
describing the retired functionality is removed, but remains in previous documentation
versions.
Deprecated Functionality in This Release
In version 7.2.x, the following Vertica functionality has been deprecated:
l System-level parameter ConstraintsEnforcedExternally, and related SQL
statement SET SESSION CONSTRAINTS_ENFORCED_EXTERNALLY
l The backup and restore configuration parameter overwrite is replaced by the
objectRestoreMode setting.
l Support for any Vertica Analytics Platform running on the ext3 file system
l Prejoin projections
l Buddy projections with different sort order
l The --compat21 option of the admintools command
See Also
For a description of how Vertica deprecates features and functionality, see Deprecated
and Retired Functionality.
Retired Functionality History
The following functionality has been deprecated or removed in the indicated versions:
l Version 6.0 vbr configuration mapping (Server): deprecated in 7.2
l Backup and restore overwrite configuration parameter (Server): deprecated in 7.2
l Prejoin projections (Server): deprecated in 7.2
l Buddy projections with different sort order (Server): deprecated in 7.2
l verticaConfig vbr configuration option (Server): deprecated in 7.1
l JavaClassPathForUDx configuration parameter (Server): deprecated in 7.1
l ADD_LOCATION() (Server): deprecated in 7.1
l bwlimit (Server): deprecated in 7.1
l Geospatial Package SQL Functions (Server): deprecated in 7.1, removed in 7.2. See
the Notes section below. These functions include:
n BB_WITHIN
n BEARING
n CHORD_TO_ARC
n DWITHIN
n ECEF_CHORD
n ECEF_x
n ECEF_y
n ECEF_z
n ISLEFT
n KM2MILES
n LAT_WITHIN
n LL_WITHIN
n LLD_WITHIN
n LON_WITHIN
n MILES2KM
n RADIUS_LON
n RADIUS_M
n RADIUS_N
n RADIUS_R
n RADIUS_Ra
n RADIUS_Rc
n RADIUS_Rv
n RADIUS_SI
n RAYCROSSING
n WGS84_a
n WGS84_b
n WGS84_e2
n WGS84_f
n WGS84_if
n WGS84_r1
l EXECUTION_ENGINE_PROFILES counters file handles, memory allocated, and
memory reserved (Server): deprecated in 7.0
l MERGE_PARTITIONS() (Server): deprecated in 7.0
l Administration Tools option check_spread (Server, clients): deprecated in 7.0
l krb5 client authentication method (All clients): deprecated in 7.0
l Pload Library (Server): deprecated in 7.0
l USE SINGLE TARGET (Server): deprecated in 7.0, removed in 7.1
l scope parameter of CLEAR_PROFILING (Server): deprecated in 6.1
l IMPLEMENT_TEMP_DESIGN() (Server, clients): deprecated in 6.1
l USER_TRANSFORMS user table (Server): deprecated in 6.0
l UPDATE privileges on sequences (Server): deprecated in 6.0
l Query Repository (Server): deprecated in 6.0. See the Notes section below. Includes:
n The SYS_DBA.QUERY_REPO table
n Functions: CLEAR_QUERY_REPOSITORY() and SAVE_QUERY_REPOSITORY()
n Configuration parameters: CleanQueryRepoInterval, QueryRepoMemoryLimit,
QueryRepoRetentionTime, QueryRepositoryEnabled, SaveQueryRepoInterval,
QueryRepoSchemaName, and QueryRepoTableName
l RESOURCE_ACQUISITIONS_HISTORY system table (Server): deprecated in 6.0
l Volatility and NULL behavior parameters of CREATE FUNCTION (Server):
deprecated in 6.1
l Ganglia on Red Hat 4 (Server): deprecated in 6.0
l copy_vertica_database.sh (Server)
l restore.sh (Server)
l backup.sh (Server)
l LCOPY (Server, clients): deprecated in 4.1 (client) and 5.1 (server), removed in 5.1
(client). See the Notes section below.
l MergeOutPolicySizeList (Server): deprecated in 4.1, removed in 5.0
l EnableStrataBasedMrgOutPolicy (Server): deprecated in 4.1, removed in 5.0
l ReportParamSuccess (All clients): deprecated in 4.1, removed in 5.0
l BatchAutoComplete (All clients): deprecated in 4.1, removed in 5.0
l use35CopyParameters (ODBC, JDBC clients): deprecated in 4.1, removed in 5.0
l getNumAcceptedRows and getNumRejectedRows (ODBC, JDBC clients): deprecated
in 5.0
l MANAGED load, server keyword and related client parameter (Server, clients):
deprecated in 5.0
l EpochAdvancementMode (Server): deprecated in 4.1, removed in 5.0
l VT_ tables (Server): deprecated in 4.1, removed in 5.0
l RefreshHistoryDuration (Server): deprecated in 4.1, removed in 5.0
Notes
l While the Vertica Geospatial package has been deprecated, it has been replaced by
Vertica Place. This analytics package is available on my.vertica.com/downloads.
l LCOPY: Supported by the 5.1 server to maintain backwards compatibility with the 4.1
client drivers.
l Query Repository: You can still monitor query workloads with the following system
tables:
n QUERY_PROFILES
n SESSION_PROFILES
n EXECUTION_ENGINE_PROFILES
In addition, Vertica 6.0 introduced the following robust, stable workload-related
system tables:
n QUERY_REQUESTS
n QUERY_EVENTS
n RESOURCE_ACQUISITIONS
l The RESOURCE_ACQUISITIONS system table captures historical information.
l Use the Kerberos gss method for client authentication, instead of krb5. See
Configuring Kerberos Authentication.
Installing Vertica
Installation Overview and Checklist
This page provides an overview of installation tasks. Carefully review and follow the
instructions in all sections in this topic.
Important Notes
l Vertica supports only one running database per cluster.
l Vertica supports installation on one, two, or multiple nodes. The steps for Installing
Vertica are the same, no matter how many nodes are in the cluster.
l Prerequisites listed in Before You Install Vertica are required for all Vertica
configurations.
l Only one instance of Vertica can be running on a host at any time.
l To run the install_vertica script, as well as to add, update, or delete nodes, you
must be logged in as root, or use sudo as a user with all privileges. You must run the
script for all installations, including upgrades and single-node installations.
Installation Scenarios
The four main scenarios for installing Vertica on hosts are:
l A single node install, where Vertica is installed on a single host as a localhost
process. This form of install cannot be expanded to more hosts later on and is
typically used for development or evaluation purposes.
l Installing to a cluster of physical host hardware. This is the most common scenario
when deploying Vertica in a testing or production environment.
l Installing on Amazon Web Services (AWS). When you choose the recommended
Amazon Machine Image (AMI), Vertica is installed when you create your instances.
For the AWS specific installation procedure, see Installing and Running Vertica on
AWS: The Detailed Procedure rather than using the steps for installation and
upgrade that appear in this guide.
l Installing to a local cluster of virtual host hardware. This is similar to installing on
physical hosts, but with network configuration differences.
Before You Install
Before You Install Vertica describes how to construct a hardware platform and prepare
Linux for Vertica installation.
These preliminary steps are broken into two categories:
l Configuring Hardware and Installing Linux
l Configuring the Network
Install or Upgrade Vertica
Once you have completed the steps in the Before You Install Vertica section, you are
ready to run the install script.
Installing Vertica describes how to:
l Back up any existing databases.
l Download and install the Vertica RPM package.
l Install a cluster using the install_vertica script.
l [Optional] Create a properties file that lets you install Vertica silently.
Note: This guide provides additional manual procedures in case you encounter
installation problems.
l Upgrading Vertica to a New Version describes the steps for upgrading to a more
recent version of the software.
Note: If you are upgrading your Vertica license, refer to Managing Licenses in the
Administrator's Guide.
Post-Installation Tasks
After You Install Vertica describes subsequent steps to take after you've run the
installation script. Some of the steps can be skipped based on your needs:
l Install the license key.
l Verify that kernel and user parameters are correctly set.
l Install the vsql client application on non-cluster hosts.
l Resolve any SLES 11.3 issues during spread configuration.
l Use the Vertica documentation online, or download and install Vertica
documentation. Find the online documentation and documentation packages to
download at http://my.vertica.com/docs.
l Install client drivers.
l Extend your installation with Vertica packages.
l Install or upgrade the Management Console.
Get started!
l Read the Concepts Guide for a high-level overview of the HPE Vertica Analytics
Platform.
l Proceed to the Installing and Connecting to the VMart Example Database in Getting
Started, where you will be guided through setting up a database, loading sample
data, and running sample queries.
About Linux Users Created by Vertica and Their
Privileges
This topic describes the Linux accounts that the installer creates and configures so
Vertica can run. When you install Vertica, the installation script optionally creates the
following Linux user and group:
l dbadmin—Administrative user
l verticadba—Group for DBA users
dbadmin and verticadba are the default names. If you want to change what these Linux
accounts are called, you can do so using the installation script. See Installing Vertica
with the install_vertica Script for details.
Before You Install Vertica
See the following topics for more information:
l Installation Overview and Checklist
l General Hardware and OS Requirements and Recommendations
When You Install Vertica
The Linux dbadmin user owns the database catalog and data storage on disk. When
you run the install script, Vertica creates this user on each node in the database cluster.
It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the
account as follows:
l Configures and authorizes dbadmin for passwordless SSH between all cluster
nodes. SSH must be installed and configured to allow passwordless logins. See
Enable Secure Shell (SSH) Logins.
l Sets the dbadmin user's BASH shell to /bin/bash, required to run scripts, such as
install_vertica and the Administration Tools.
l Provides read-write-execute permissions on the following directories:
n /opt/vertica/*
n /home/dbadmin—the default directory for database data and catalog files
(configurable through the install script)
Note: The Vertica installation script also creates a Vertica database superuser
named dbadmin. They share the same name, but they are not the same; one is a
Linux user and the other is a Vertica user. See Database Administration User in the
Administrator's Guide for information about the database superuser.
After You Install Vertica
Root or sudo privileges are not required to start or run Vertica after the installation
process completes.
The dbadmin user can log in and perform Vertica tasks, such as creating a database,
installing/changing the license key, or installing drivers. If dbadmin wants database
directories in a location that differs from the default, the root user (or a user with sudo
privileges) must create the requested directories and change ownership to the dbadmin
user.
Vertica prevents administration from users other than the dbadmin user (or the user
name you specified during the installation process if not dbadmin). Only this user can
run Administration Tools.
See Also
l Installation Overview and Checklist
l Before You Install Vertica
l Platform Requirements and Recommendations
l Enable Secure Shell (SSH) Logins
Before You Install Vertica
Complete all of the tasks in this section before you install Vertica. When you have
completed this section, proceed to Installing Vertica.
Platform Requirements and Recommendations
You must verify that your servers meet the platform requirements described in
Supported Platforms. The Supported Platforms topics detail supported versions for the
following:
l OS for Server and Management Console (MC)
l Supported Browsers for MC
l Vertica driver compatibility
l R
l Hadoop
l Various plug-ins
BASH Shell
All shell scripts included in Vertica must run under the BASH shell. If you are on a
Debian system, then the default shell can be DASH. DASH is not supported. Change
the shell for root and for the dbadmin user to BASH with the chsh command.
For example:
# getent passwd | grep root
root:x:0:0:root:/root:/bin/dash
# chsh
Changing shell for root.
New shell [/bin/dash]: /bin/bash
Shell changed.
Then, as root, change the symbolic link for /bin/sh from /bin/dash to /bin/bash:
# rm /bin/sh
# ln -s /bin/bash /bin/sh
Log out and back in for the change to take effect.
Install the Latest Vendor Specific System Software
Install the latest vendor drivers for your hardware. For HPE Servers, update to the latest
versions for:
l HP ProLiant Smart Array Controller Driver (cciss)
l Smart Array Controller Firmware
l HP Array Configuration Utility (HP ACU CLI)
Data Storage Recommendations
l All internal drives connect to a single RAID controller.
l The RAID array should form one hardware RAID device as a contiguous /data
volume.
Validation Utilities
Vertica provides several validation utilities that validate the performance on prospective
hosts. The utilities are installed when you install the Vertica RPM, but you can use them
before you run the install_vertica script. See Validation Scripts for more details on
running the utilities and verifying that your hosts meet the recommended requirements.
General Hardware and OS Requirements and Recommendations
Hardware Recommendations
The HPE Vertica Analytics Platform is based on a massively parallel processing (MPP),
shared-nothing architecture, in which the query processing workload is divided among
all nodes of the Vertica database. HPE highly recommends using a homogeneous
hardware configuration for your Vertica cluster; that is, each node of the cluster should
be similar in CPU, clock speed, number of cores, memory, and operating system
version.
Note that HPE has not tested Vertica on clusters made up of nodes with disparate
hardware specifications. While a Vertica database is expected to work functionally in a
mixed hardware configuration, performance will almost certainly be limited to that of the
slowest node in the cluster.
Detailed hardware recommendations are available in the Vertica Hardware Planning
Guide.
Platform OS Requirements
Important! Deploy Vertica as the only active process on each host—other than Linux
processes or software explicitly approved by Vertica. Vertica cannot be colocated with
other software. Remove or disable all non-essential applications from cluster hosts.
You must verify that your servers meet the platform requirements described in Vertica
Server and Vertica Management Console.
Verify Sudo
Vertica uses the sudo command during installation and some administrative tasks.
Ensure that sudo is available on all hosts with the following command:
# which sudo
/usr/bin/sudo
If sudo is not installed, browse to the Sudo Main Page and install sudo on all hosts.
When you use sudo to install Vertica, the user that performs the installation must have
privileges on all nodes in the cluster.
Configuring sudo with privileges for the individual commands can be a tedious and
error-prone process; thus, the Vertica documentation does not include every possible
sudo command that you can include in the sudoers file. Instead, HPE recommends that
you temporarily elevate the sudo user to have all privileges for the duration of the install.
Note: See the sudoers and visudo man pages for the details on how to write/modify
a sudoers file.
To allow root sudo access on all commands as any user on any machine, use visudo as
root to edit the /etc/sudoers file and add this line:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
After the installation completes, remove (or reset) sudo privileges to the pre-installation
settings.
Prepare Disk Storage Locations
You must create and specify directories in which to store your catalog and data files
(physical schema). You can specify these locations when you install or configure the
database, or later during database operations. Both the catalog and data directories
must be owned by the database administrator.
The directory you specify for database catalog files (the catalog path) is used across all
nodes in the cluster. For example, if you specify /home/catalog as the catalog directory,
Vertica uses that catalog path on all nodes. The catalog directory should always be
separate from any data file directories.
Note: Do not use a shared directory for more than one node. Data and catalog
directories must be distinct for each node. Multiple nodes must not be allowed to
write to the same data or catalog directory.
The data path you designate is also used across all nodes in the cluster. For example, if
you specify that data should be stored in /home/data, Vertica uses this path on all
database nodes.
Do not use a single directory to contain both catalog and data files. You can store the
catalog and data directories on different drives, which can be either on drives local to
the host (recommended for the catalog directory) or on a shared storage location, such
as an external disk enclosure or a SAN.
Before you specify a catalog or data path, be sure the parent directory exists on all
nodes of your database. Creating a database in admintools also creates the catalog and
data directories, but the parent directory must exist on each node.
You do not need to specify a disk storage location during installation. However, you can
do so by using the --data-dir parameter to the install_vertica script. See
Specifying Disk Storage Location During Installation
See Also
l Specifying Disk Storage Location on MC
l Specifying Disk Storage Location During Database Creation
l Configuring Disk Usage to Optimize Performance
l Using Shared Storage With Vertica
Disk Space Requirements for Vertica
In addition to actual data stored in the database, Vertica requires disk space for several
data reorganization operations, such as mergeout and managing nodes in the cluster.
For best results, HPE recommends that disk utilization per node be no more than sixty
percent (60%) for a K-Safe=1 database to allow such operations to proceed.
In addition, disk space is temporarily required by certain query execution operators,
such as hash joins and sorts, in the case when they cannot be completed in memory
(RAM). Such operators might be encountered during queries, recovery, refreshing
projections, and so on. The amount of disk space needed (known as temp space)
depends on the nature of the queries, amount of data on the node and number of
concurrent users on the system. By default, any unused disk space on the data disk can
be used as temp space. However, HPE recommends provisioning temp space separate
from data disk space. See Configuring Disk Usage to Optimize Performance.
Configuring the Network
This group of steps involves configuring the network. These steps differ depending on
your installation scenario. A single-node installation requires little network configuration,
because the single instance of the Vertica server does not need to communicate with
other nodes in a cluster. For cluster and cloud installation scenarios, you must make
several decisions regarding your configuration.
Vertica supports server configuration with multiple network interfaces. For example, you
might want to use one as a private network interface for internal communication among
cluster hosts (the ones supplied via the --hosts option to install_vertica) and a
separate one for client connections.
Important: Vertica performs best when all nodes are on the same subnet and have
the same broadcast address for one or more interfaces. A cluster that has nodes on
more than one subnet can experience lower performance due to the network latency
associated with a multi-subnet system at high network utilization levels.
Important Notes
l Network configuration is exactly the same for single nodes as for multi-node clusters,
with one special exception. If you install Vertica on a single host machine that is to
remain a permanent single-node configuration (such as for development or Proof of
Concept), you can install Vertica using localhost or the loopback IP (typically
127.0.0.1) as the value for --hosts. Do not use the hostname localhost in a node
definition if you are likely to add nodes to the configuration later.
l If you are using a host with multiple network interfaces, configure Vertica to use the
address which is assigned to the NIC that is connected to the other cluster hosts.
l Use a dedicated gigabit switch. If you do not, performance could be severely affected.
l Do not use DHCP dynamically-assigned IP addresses for the private network. Use
only static addresses or permanently-leased DHCP addresses.
Optionally Run Spread on Separate Control Network
If your query workloads are network intensive, you can use the --control-network
parameter with the install_vertica script (see Installing Vertica with the install_
vertica Script) to allow spread communications to be configured on a subnet that is
different from other Vertica data communications.
The --control-network parameter accepts either the default value or a broadcast
network IP address (for example, 192.168.10.255).
Configure SSH
l Verify that root can use Secure Shell (SSH) to log in (ssh) to all hosts that are
included in the cluster. SSH (SSH client) is a program for logging into a remote
machine and for running commands on a remote machine.
l If you do not already have SSH installed on all hosts, log in as root on each host and
install it before installing Vertica. You can download a free version of the SSH
connectivity tools from OpenSSH.
l Make sure that /dev/pts is mounted. Installing Vertica on a host that is missing the
mount point /dev/pts could result in the following error when you create a database:
TIMEOUT ERROR: Could not login with SSH. Here is what SSH said:Last login: Sat Dec 15 18:05:35 2007
from node01
Allow Passwordless SSH Access for the Dbadmin User
The dbadmin user must be authorized for passwordless ssh. In typical installs, you won't
need to change anything; however, if you set up your system to disallow passwordless
login, you'll need to enable it for the dbadmin user. See Enable Secure Shell (SSH)
Logins.
Ensure Ports Are Available
Verify that ports required by Vertica are not in use by running the following command as
the root user and comparing it with the ports required in Firewall Considerations
below:
netstat -atupn
If you are using a Red Hat 7/CentOS 7 system, use the following command instead:
ss -atupn
Firewall Considerations
Vertica requires several ports to be open on the local network. Vertica does not
recommend placing a firewall between nodes (all nodes should be behind a firewall),
but if you must use a firewall between nodes, ensure the following ports are available:
l Port 22, TCP (sshd): Required by the Administration Tools and the Management
Console Cluster Installation wizard.
l Port 5433, TCP (Vertica): Vertica client (vsql, ODBC, JDBC, and so on) port.
l Port 5434, TCP (Vertica): Intra- and inter-cluster communication. Vertica opens the
client port +1 (5434 by default) for intra-cluster communication, such as during a plan.
If the port +1 from the default client port is not available, Vertica opens a random port
for intra-cluster communication.
l Port 5433, UDP (Vertica): Vertica spread monitoring.
l Port 5444, TCP (Vertica Management Console): MC-to-node and node-to-node (agent)
communications port. See Changing MC or Agent Ports.
l Port 5450, TCP (Vertica Management Console): Port used to connect to MC from a
web browser; allows communication from nodes to the MC application/web server.
See Connecting to Management Console.
l Port 4803, TCP (Spread): Client connections.
l Port 4803, UDP (Spread): Daemon-to-daemon connections.
l Port 4804, UDP (Spread): Daemon-to-daemon connections.
l Port 6543, UDP (Spread): Monitor-to-daemon connection.
Operating System Configuration Task Overview
This topic provides a high-level overview of the OS settings required for Vertica. Each
item provides a link to additional details about the setting and detailed steps on making
the configuration change. The installer tests for all of these settings and provides hints,
warnings, and failures if the current configuration does not meet Vertica requirements.
Before You Install the Operating System
l Supported Platforms: Verify that your servers meet the platform requirements
described in the Vertica 7.2.x Supported Platforms document. The installer detects
unsupported operating systems.
l LVM: Linux Logical Volume Manager (LVM) is not supported on partitions that contain
Vertica files.
l Filesystem: The filesystem for the Vertica data and catalog directories must be
formatted as ext3 or ext4.
l Swap Space: A 2 GB swap partition is required. Partition the remaining disk space in a
single partition under "/".
l Disk Block Size: The disk block size for the Vertica data and catalog directories should
be 4096 bytes (the default for ext3 and ext4 filesystems).
l Memory: For more information on sizing your hardware, see the Vertica Hardware
Planning Guide.
Firewall Considerations
l Firewall/Ports: Firewalls, if present, must be configured so as not to interfere with
Vertica.
General Operating System Configuration - Automatically
Configured by Installer
These general OS settings are automatically made by the installer if they do not meet
Vertica requirements. You can prevent the installer from automatically making these
configuration changes by using the --no-system-configuration parameter for the
install_vertica script.
l Nice Limits: The database administration user must be able to nice processes back to
the default level of 0.
l min_free_kbytes: The vm.min_free_kbytes setting in /etc/sysctl.conf must be
configured sufficiently high. The specific value depends on your hardware
configuration.
l User Open Files Limit: The open file limit for the dbadmin user should be at least 1 file
open per MB of RAM, 65536, or the amount of RAM in MB, whichever is greater.
l System Open File Limits: The maximum number of files open on the system must be
at least the amount of memory in MB, but not less than 65536.
l Pam Limits: /etc/pam.d/su must contain the line "session required pam_limits.so".
This allows limits to be conveyed to commands run with "su -".
l Address Space Limits: The address space limits (as setting) defined in
/etc/security/limits.conf must be unlimited for the database administrator.
l File Size Limits: The file size limits (fsize setting) defined in /etc/security/limits.conf
must be unlimited for the database administrator.
l User Process Limits: The nproc setting defined in /etc/security/limits.conf must be
1024 or the amount of memory in MB, whichever is greater.
l Maximum Memory Maps: The vm.max_map_count setting in /etc/sysctl.conf must be
65536 or the amount of memory in KB divided by 16, whichever is greater.
General Operating System Configuration - Manual Configuration
The following general OS settings must be configured manually.
l Disk Readahead: The disk readahead must be at least 2048. The specific value
depends on your hardware configuration.
l NTP Services: The NTP daemon must be enabled and running.
l chrony: For Red Hat 7 and CentOS 7 systems, chrony must be enabled and running.
l SELinux: SELinux must be disabled or run in permissive mode.
l CPU Frequency Scaling: Vertica recommends that you disable CPU frequency
scaling. Important: Your systems may use significantly more energy when CPU
frequency scaling is disabled.
l Transparent Hugepages: For Red Hat 7 and CentOS 7 systems, Transparent
Hugepages must be set to always. For all other operating systems, Transparent
Hugepages must be disabled or set to madvise.
l I/O Scheduler: The I/O scheduler for disks used by Vertica must be set to deadline or
noop.
l Support Tools: Several optional packages can be installed to assist Vertica support
when troubleshooting your system.
System User Requirements
The following tasks pertain to the configuration of the system user required by Vertica.
l System User Requirements: The installer automatically creates a user with the correct
settings. If you specify a user with --dba-user, then the user must conform to the
requirements for the Vertica system user.
l LANG Environment Settings: The LANG environment variable must be set and valid
for the database administration user.
l TZ Environment Settings: The TZ environment variable must be set and valid for the
database administration user.
Before You Install The Operating System
The topics in this section detail system settings that must be configured when you install
the operating system. These settings cannot be easily changed after the operating
system is installed.
Supported Platforms
The Vertica installer checks the type of operating system that is installed. If the
operating system is not one of the supported operating systems (see Vertica Server and
Vertica Management Console), or if the operating system cannot be determined, the
installer halts.
The installer generates one of the following issue identifiers if it detects an unsupported
operating system:
l [S0320] - Fedora OS is not supported.
l [S0321] - The version of Red Hat/CentOS is not supported.
l [S0322] - The version of Ubuntu/Debian is not supported.
l [S0323] - The operating system could not be determined. The unknown operating
system is not supported because it does not match the list of supported operating
systems.
LVM Warning
Vertica does not support LVM (Logical Volume Manager) on any drive where database
(catalog and data) files are stored. The installer reports this issue with the identifier:
S0170.
On Red Hat 7/CentOS 7 systems, you are given four partitioning scheme
options: Standard Partition, BTRFS, LVM, and LVM Thin Provisioning. Set the
partitioning scheme to Standard Partition.
Filesystem Requirement
Vertica requires that your Linux filesystem be either ext3 or ext4. All other filesystem
types are unsupported. The installer reports this issue with the identifier S0160.
Swap Space Requirements
Vertica requires at least a 2 GB swap partition, regardless of the amount of RAM installed
on your system. The installer reports this issue with identifier S0180.
For typical installations Vertica recommends that you partition your system with a 2GB
primary partition for swap regardless of the amount of installed RAM. Larger swap space
is acceptable, but unnecessary.
Note: Do not place a swap file on a disk containing the Vertica data files. If a host
has only two disks (boot and data), put the swap file on the boot disk.
If you do not have at least a 2 GB swap partition then you may experience performance
issues when running Vertica.
You typically define the swap partition when you install Linux. See your platform’s
documentation for details on configuring the swap partition.
Disk Block Size Requirements
Vertica recommends that the disk block size be 4096 bytes, which is generally the
default on ext3 and ext4 filesystems. The installer reports this issue with the identifier
S0165.
The disk block size is set when you format your file system. Changing the block size
requires a re-format.
Memory Requirements
Vertica requires, at a minimum, 1GB of RAM per logical processor. The installer reports
this issue with the identifier S0190.
However, for performance reasons, you typically require more RAM than the minimum.
For more information on sizing your hardware, see the Vertica Hardware Planning
Guide.
Firewall Considerations
Vertica requires multiple ports to be open between nodes. You may use a firewall
(iptables) on Red Hat/CentOS and Ubuntu/Debian based systems. Firewall use is not
supported on SuSE systems; SuSE systems must disable the firewall. The installer
reports issues found with your iptables configuration with the identifiers N0010 (for
systems that use iptables) and N0011 (for SuSE systems).
The installer checks the IP tables configuration and issues a warning if there are any
configured rules or chains. The installer does not detect if the configuration may conflict
with Vertica. It is your responsibility to verify that your firewall allows traffic for Vertica as
described in Ensure Ports Are Available.
Note: The installer does not check NAT entries in iptables.
You can modify your firewall to allow for Vertica network traffic, or you can disable the
firewall if your network is secure. Note that firewalls are not supported for Vertica
systems running on SuSE.
Red Hat 6 and CentOS 6 Systems
For details on how to configure iptables and allow specific ports to be open, see the
platform-specific documentation for your platform:
l RedHat: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_
Linux/6/html/Security_Guide/sect-Security_Guide-IPTables.html
l CentOS: http://wiki.centos.org/HowTos/Network/IPTables
To disable iptables, run the following commands as root or sudo:
# service iptables save
# service iptables stop
# chkconfig iptables off
To disable iptables if you are using the ipv6 versions of iptables, run the following
commands as root or sudo:
# service ip6tables save
# service ip6tables stop
# chkconfig ip6tables off
Red Hat 7 and CentOS 7 Systems
To disable the system firewall, run the following commands as root or sudo:
# systemctl mask firewalld
# systemctl disable firewalld
# systemctl stop firewalld
Ubuntu and Debian Systems
For details on how to configure iptables and allow specific ports to be open, see the
documentation for your platform:
l Debian: https://wiki.debian.org/iptables
l Ubuntu: https://help.ubuntu.com/12.04/serverguide/firewall.html.
Note: Ubuntu uses the ufw program to manage iptables.
To disable iptables on Debian, run the following commands as root or sudo:
/etc/init.d/iptables stop
update-rc.d -f iptables remove
To disable iptables on Ubuntu, run the following command:
sudo ufw disable
SuSE Systems
The firewall must be disabled on SuSE systems. To disable it, run the following
command:
/sbin/SuSEfirewall2 off
Port Availability
The install_vertica script checks that required ports are open and available to Vertica.
The installer reports any issues with the identifier: N0020.
Port Requirements
The following table lists the ports required by Vertica.
Port   Protocol   Service                      Notes
22     TCP        sshd                         Required by the Administration Tools and the
                                               Management Console Cluster Installation wizard.
5433   TCP        Vertica                      Vertica client (vsql, ODBC, JDBC, and so on) port.
5434   TCP        Vertica                      Intra- and inter-cluster communication. Vertica
                                               opens the Vertica client port +1 (5434 by default)
                                               for intra-cluster communication, such as during a
                                               plan. If the port +1 from the default client port
                                               is not available, then Vertica opens a random port
                                               for intra-cluster communication.
5433   UDP        Vertica                      Vertica spread monitoring.
5444   TCP        Vertica Management Console   MC-to-node and node-to-node (agent)
                                               communications port. See Changing MC or Agent
                                               Ports.
5450   TCP        Vertica Management Console   Port used to connect to MC from a web browser and
                                               allows communication from nodes to the MC
                                               application/web server. See Connecting to
                                               Management Console.
4803   TCP        Spread                       Client connections.
4803   UDP        Spread                       Daemon-to-daemon connections.
4804   UDP        Spread                       Daemon-to-daemon connections.
6543   UDP        Spread                       Monitor-to-daemon connection.
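If you keep iptables enabled on Red Hat 6/CentOS 6 or Ubuntu/Debian systems, rules
along the following lines open the Vertica and spread ports listed above. This is a
minimal sketch rather than a complete firewall policy: 10.0.0.0/24 is a placeholder for
your cluster subnet, and you must still allow port 22 and any client networks that need
to reach port 5433.
# run as root on each node; 10.0.0.0/24 is a placeholder for your cluster subnet
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 5433:5434 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p udp --dport 5433 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 5444 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 5450 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 4803 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p udp --dport 4803:4804 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p udp --dport 6543 -j ACCEPT
# on Red Hat 6/CentOS 6, persist the rules across reboots
service iptables save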
General Operating System Configuration - Automatically
Configured by the Installer
The installer automatically adjusts the following general operating system settings if
they do not meet Vertica requirements. You can prevent the installer from automatically
making these configuration changes by using the --no-system-configuration
parameter for the install_vertica script.
sysctl
During installation, Vertica attempts to automatically change various OS-level settings.
The installer does not change values on your system that already meet or exceed the
required threshold. You can prevent the installer from automatically making these
configuration changes by using the --no-system-configuration parameter for the
install_vertica script.
To permanently edit certain settings and prevent them from reverting on reboot, use
sysctl.
The sysctl settings relevant to the installation of Vertica include:
l vm.min_free_kbytes
l fs.file-max
l vm.max_map_count
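Before making any changes, you can check the current values of these parameters in a
single command; the values shown in the output below are illustrative:
$ /sbin/sysctl vm.min_free_kbytes fs.file-max vm.max_map_count
vm.min_free_kbytes = 4096
fs.file-max = 65536
vm.max_map_count = 65536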
Permanently Changing Settings with sysctl:
1. As the root user, open the /etc/sysctl.conf file:
# vi /etc/sysctl.conf
2. Enter a parameter and value:
parameter = value
For example, to set the parameter and value for fs.file-max to meet Vertica
requirements, enter:
fs.file-max = 65536
3. Save your changes, and close the /etc/sysctl.conf file.
4. As the root user, reload the config file:
# sysctl -p
Identifying Settings Added by the Installer
You can see whether the installer has added a setting by opening the /etc/sysctl.conf
file:
# vi /etc/sysctl.conf
If the installer has added a setting, the following line appears:
# The following 1 line added by Vertica tools. 2015-02-23 13:20:29
parameter = value
Nice Limits Configuration
The Vertica system user (dbadmin by default) must be able to raise and lower the
priority of Vertica processes. To allow this, the /etc/security/limits.conf file must
include a nice entry for the dbadmin user. The installer reports this issue with the
identifier: S0010.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
Note: Vertica never raises priority above the default level of 0. However, Vertica
does lower the priority of certain Vertica threads and needs to be able to raise the
priority of these threads back up to the default level. This setting allows Vertica to
do so.
All Systems
To set the Nice Limit configuration for the dbadmin user, edit
/etc/security/limits.conf and add the following line. Replace dbadmin with the
name of your system user.
dbadmin - nice 0
min_free_kbytes Setting
This topic details how to update the min_free_kbytes setting so that it is within the range
supported by Vertica. The installer reports this issue with the identifier: S0050 if the
setting is too low, or S0051 if the setting is too high.
The vm.min_free_kbytes setting configures the page reclaim thresholds. When this
number is increased, the system starts reclaiming memory earlier; when it is lowered,
the system starts reclaiming memory later. The default min_free_kbytes value is
calculated at boot time based on the number of pages of physical RAM available on the
system.
The setting must be the greatest of:
l the default value configured by the system,
l 4096, or
l the value determined by running the command shown in step 2 below.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
All Systems
To manually set min_free_kbytes:
1. Determine the current/default setting with the following command:
/sbin/sysctl vm.min_free_kbytes
2. If the result of the previous command is No such file or directory or the
default value is less than 4096, then run the command below:
memtot=`grep MemTotal /proc/meminfo | awk '{printf "%.0f",$2}'`
echo "scale=0;sqrt ($memtot*16)" | bc
3. Edit or add the vm.min_free_kbytes entry in /etc/sysctl.conf, using the value
from the output of the previous command.
# The min_free_kbytes setting
vm.min_free_kbytes=5572
4. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note: These steps must be repeated on each node in the cluster.
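The steps above can be combined into a short script. This is a sketch only, assuming
bash and root privileges; it appends to /etc/sysctl.conf, so remove any existing
vm.min_free_kbytes entry first, and run it on every node:
# compute the recommended value from total RAM, then persist and apply it
memtot=$(grep MemTotal /proc/meminfo | awk '{printf "%.0f",$2}')
mfk=$(echo "scale=0;sqrt ($memtot*16)" | bc)
echo "vm.min_free_kbytes = $mfk" >> /etc/sysctl.conf
/sbin/sysctl -p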
User Max Open Files Limit
This topic details how to change the user max open-files limit setting to meet Vertica
requirements. The installer reports this issue with the identifier: S0060.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
Vertica requires that the dbadmin user not be limited when opening files. The open file
limit should be at least 65536 or the amount of RAM in MB (one open file per MB of
RAM), whichever is greater. The installer sets this limit to the greater of 65536 and the
amount of RAM in MB.
All Systems
The open file limit can be determined by running ulimit -n as the dbadmin user. For
example:
dbadmin@localhost:$ ulimit -n
65536
To manually set the limit, edit /etc/security/limits.conf and edit/add the line for
the nofile setting for the user you configured as the database admin (default dbadmin).
The setting must be at least 65536.
dbadmin - nofile 65536
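If you want the limit to track the amount of RAM rather than the 65536 floor, a sketch
such as the following derives the value. It assumes bash, root privileges, and the
default dbadmin user name; it appends to limits.conf, so remove any existing nofile
entry first:
# RAM in MB, with 65536 as the minimum recommended value
ram_mb=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)
nofile=$(( ram_mb > 65536 ? ram_mb : 65536 ))
echo "dbadmin - nofile $nofile" >> /etc/security/limits.conf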
Note: There is also an open file limit on the system. See System Max Open Files
Limit.
System Max Open Files Limit
This topic details how to modify the limit for the number of open files on your system so
that it meets Vertica requirements. The installer reports this issue with the identifier:
S0120.
Vertica opens many files. Some platforms have global limits on the number of open files.
The open file limit must be set sufficiently high so as not to interfere with database
operations.
The recommended value is at least the amount of memory in MB, but not less than
65536.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
All Systems
To manually set the open file limit:
1. Run /sbin/sysctl fs.file-max to determine the current limit.
2. If the limit is not 65536 or the amount of system memory in MB (whichever is higher),
then edit or add fs.file-max=max number of files to /etc/sysctl.conf.
# Controls the maximum number of open files
fs.file-max=65536
3. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note: These steps must be repeated on each node in the cluster.
Pam Limits
This topic details how to enable the pam_limits.so module for su, which Vertica
requires. The installer reports issues with this setting with the identifier: S0070.
On some systems the pam module called pam_limits.so is not set in the file
/etc/pam.d/su. When it is not set, limits (such as open file descriptors) are not
conveyed to commands started with su -.
In particular, the Vertica init script would fail to start Vertica because it calls the
Administration Tools to start a database with the su - command. This problem was first
noticed on Debian systems, but the configuration could be missing on other Linux
distributions. See the pam_limits man page for more details.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
All Systems
To manually configure this setting, append the following line to the /etc/pam.d/su file:
session required pam_limits.so
See the pam_limits man page for more details: man pam_limits.
pid_max Setting
This topic details how to change pid_max to a supported value. Vertica requires that
pid_max be set to at least 524288. The installer reports this issue with the identifier:
S0111.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
All Systems
To change the pid_max value:
# sysctl -w kernel.pid_max=524288
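Note that sysctl -w changes the value only until the next reboot. To make the setting
persistent, also record it in /etc/sysctl.conf, following the same pattern as the other
sysctl settings in this section:
# persist the setting across reboots
echo 'kernel.pid_max = 524288' >> /etc/sysctl.conf
/sbin/sysctl -p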
User Address Space Limits
This topic details how to modify the Linux address space limit for the dbadmin user so
that it meets Vertica requirements. The address space setting controls the maximum
number of threads and processes for each user. If this setting does not meet the
requirements then the installer reports this issue with the identifier: S0090.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
The address space available to the dbadmin user must not be reduced via user limits
and must be set to unlimited.
All Systems
To manually set the address space limit:
1. Run ulimit -v as the dbadmin user to determine the current limit.
2. If the limit is not unlimited, then add the following line to
/etc/security/limits.conf. Replace dbadmin with your database admin user.
dbadmin - as unlimited
User File Size Limit
This topic details how to modify the file size limit for files on your system so that it meets
Vertica requirements. The installer reports this issue with the identifier: S0100.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
The file size limit for the dbadmin user must not be reduced via user limits and must be
set to unlimited.
All Systems
To manually set the file size limit:
1. Run ulimit -f as the dbadmin user to determine the current limit.
2. If the limit is not unlimited, then edit/add the following line to
/etc/security/limits.conf. Replace dbadmin with your database admin user.
dbadmin - fsize unlimited
User Process Limit
This topic details how to change the user process limit so that it meets Vertica
requirements. The installer reports this issue with the identifier: S0110.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
The user process limit must be high enough to allow for the many threads opened by
Vertica. The recommended limit is the amount of RAM in MB and must be at least 1024.
All Systems
To manually set the user process limit:
1. Run ulimit -u as the dbadmin user to determine the current limit.
2. If the limit is not the amount of memory in MB on the server, then edit/add the
following line to /etc/security/limits.conf. Replace 4096 with the amount of
system memory, in MB, on the server.
dbadmin - nproc 4096
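After editing /etc/security/limits.conf, you can confirm all of the user limits
discussed in this section in one step by starting a fresh login shell for the dbadmin
user; the output below is illustrative:
# su - dbadmin -c 'ulimit -n -u -v -f'
open files                      (-n) 65536
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file size               (blocks, -f) unlimited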
Maximum Memory Maps Configuration
This topic details how to modify the limit for the number of memory maps a process can
have on your system so that it meets Vertica requirements. The installer reports this
issue with the identifier: S0130.
The installer automatically configures the correct setting if the default value does not
meet system requirements. If there is an issue setting this value, or you have used the
--no-system-configuration argument to the installer and the current setting is
incorrect, then the installer reports this as an issue.
Vertica uses a lot of memory while processing and can approach the default limit for
memory maps per process.
The recommended value is at least the amount of memory on the system in KB / 16, but
not less than 65536.
All Systems
To manually set the memory map limit:
1. Run /sbin/sysctl vm.max_map_count to determine the current limit.
2. If the limit is not 65536 or the amount of system memory in KB / 16 (whichever is
higher), then edit/add the following line to /etc/sysctl.conf. Replace 65536 with
the value for your system.
# The following 1 line added by Vertica tools. 2014-03-07 13:20:31
vm.max_map_count=65536
3. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note: These steps must be repeated on each node in the cluster.
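As with min_free_kbytes, you can derive and persist the recommended value with a short
sketch, assuming bash and root privileges; it appends to /etc/sysctl.conf, so remove
any existing vm.max_map_count entry first:
# recommended value: RAM in KB / 16, but not less than 65536
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
maps=$(( mem_kb / 16 > 65536 ? mem_kb / 16 : 65536 ))
echo "vm.max_map_count=$maps" >> /etc/sysctl.conf
/sbin/sysctl -p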
General Operating System Configuration - Manual Configuration
The following general operating system settings must be configured manually.
Manually Configuring Operating System Settings
Vertica requires that you manually configure some general operating system settings.
Hewlett Packard Enterprise recommends that you configure these settings in the
/etc/rc.local script to prevent them from reverting on reboot. The /etc/rc.local startup script
contains scripts and commands that run each time the system is booted.
Note: SUSE systems use the /etc/init.d/after.local file rather than the /etc/rc.local
file. For purposes of using Vertica, the functionality of both files is the same.
Settings to Configure Manually
The /etc/rc.local settings relevant to the installation of Vertica include:
l Disk Readahead
l I/O Scheduling
l Enabling or Disabling Transparent Hugepages
Permanently Changing Settings with /etc/rc.local
1. As the root user, open the /etc/rc.local file:
# vi /etc/rc.local
2. Enter a script or command. For example, to set the transparent hugepages setting to
meet Vertica requirements, enter:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Important: On some Ubuntu/Debian systems, the last line in /etc/rc.local must
be "exit 0". Any additions to /etc/rc.local must come before "exit 0".
3. Save your changes, and close the /etc/rc.local file.
On the next reboot, the command runs during startup. You can also run the command
manually, as the root user, if you want it to take effect immediately.
Disk Readahead
This topic details how to change Disk Readahead to a supported value. Vertica requires
that Disk Readahead be set to at least 2048. The installer reports this issue with the
identifier: S0020.
Note:
l These commands must be executed with root privileges and assume that the
blockdev program is in /sbin.
l The blockdev program operates on whole devices, and not individual partitions.
You cannot set the readahead value to different settings on the same device. If
you run blockdev against a partition, for example: /dev/sda1, then the setting is
still applied to the entire /dev/sda device. For instance, running /sbin/blockdev
--setra 2048 /dev/sda1 also causes /dev/sda2 through /dev/sdaN to use a
readahead value of 2048.
RedHat and SuSE Based Systems
For each drive in the Vertica system, Vertica recommends that you set the readahead
value to at least 2048 for most deployments. The first command below immediately
changes the readahead value for the specified disk. The second adds the command to
/etc/rc.local so that the setting is applied each time the system is booted. Note that
some deployments may require a higher value; the setting can be raised as high as
8192 under the guidance of support.
Note: For systems that do not support /etc/rc.local, use the equivalent startup
script that is run after the destination runlevel has been reached. For example SuSE
uses /etc/init.d/after.local.
/sbin/blockdev --setra 2048 /dev/sda
echo '/sbin/blockdev --setra 2048 /dev/sda' >> /etc/rc.local
Ubuntu and Debian Systems
For each drive in the Vertica system, set the readahead value to 2048. Run the
command once in your shell, then add the command to /etc/rc.local so that the
setting is applied each time the system is booted. Note that on Ubuntu systems, the last
line in rc.local must be "exit 0", so you must manually add the following line to
/etc/rc.local before the last line with exit 0.
Note: For systems that do not support /etc/rc.local, use the equivalent startup
script that is run after the destination runlevel has been reached. For example SuSE
uses /etc/init.d/after.local.
/sbin/blockdev --setra 2048 /dev/sda
Enabling Network Time Protocol (NTP)
Before you can install Vertica, you must enable Network Time Protocol (NTP) on your
system for clock synchronization. NTP must be both enabled and active at the time of
installation. If NTP is not enabled and active at the time of installation, the installer
reports this issue with the identifier S0030.
On Red Hat 7 and CentOS 7, ntpd has been deprecated in favor of chrony. To see how
to enable chrony, see Enabling chrony for Red Hat 7/CentOS 7 Systems.
Verify That NTP Is Running
The network time protocol (NTP) daemon must be running on all of the hosts in the
cluster so that their clocks are synchronized. The spread daemon relies on all of the
nodes to have their clocks synchronized for timing purposes. If your nodes do not have
NTP running, the installation can fail with a spread configuration error or other errors.
Note: Different Linux distributions refer to the NTP daemon in different ways. For
example, SUSE and Debian/Ubuntu refer to it as ntp, while CentOS and Red Hat
refer to it as ntpd. If the following commands produce errors, try using ntp in place
of ntpd.
To verify that your hosts are configured to run the NTP daemon on startup, enter the
following command:
$ chkconfig --list ntpd
Debian and Ubuntu do not support chkconfig, but they do offer an optional package.
You can install this package with the command sudo apt-get install sysv-rc-conf.
To verify that your hosts are configured to run the NTP daemon on startup with the
sysv-rc-conf utility, enter the following command:
The chkconfig command can produce an error similar to ntpd: unknown service. If
you get this error, verify that your Linux distribution refers to the NTP daemon as ntpd
rather than ntp. If it does not, you need to install the NTP daemon package before you
can configure it. Consult your Linux documentation for instructions on how to locate and
install packages.
If the NTP daemon is installed, your output should resemble the following:
ntp 0:off 1:off 2:on 3:on 4:off 5:on 6:off
The output indicates the runlevels where the daemon runs. Verify that the current
runlevel of the system (usually 3 or 5) has the NTP daemon set to on. If you do not know
the current runlevel, you can find it using the runlevel command:
$ runlevel
N 3
Configure NTP for Red Hat 6/CentOS 6 and SUSE
If your system is based on Red Hat 6/CentOS 6 or SUSE, use the service and
chkconfig utilities to start NTP and configure it to start at boot.
/sbin/service ntpd restart
/sbin/chkconfig ntpd on
l Red Hat 6/CentOS 6—NTP uses the default time servers at ntp.org. You can
change the default NTP servers by editing /etc/ntp.conf.
l SUSE—By default, no time servers are configured. You must edit /etc/ntp.conf
after the install completes and add time servers.
Configure NTP for Ubuntu and Debian
By default, the NTP daemon is not installed on some Ubuntu and Debian systems. First
install NTP, then start the NTP process, as shown below. You can change the default
NTP servers by editing /etc/ntp.conf.
sudo apt-get install ntp
sudo /etc/init.d/ntp reload
Verify That NTP Is Operating Correctly
To verify that the Network Time Protocol Daemon (NTPD) is operating correctly, issue
the following command on all nodes in the cluster.
For Red Hat 6/CentOS 6 and SUSE:
/usr/sbin/ntpq -c rv | grep stratum
For Ubuntu and Debian:
ntpq -c rv | grep stratum
A stratum level of 16 indicates that NTP is not synchronizing correctly.
If a stratum level of 16 is detected, wait 15 minutes and issue the command again. It may
take this long for the NTP server to stabilize.
If NTP continues to detect a stratum level of 16, verify that the NTP port (UDP Port 123)
is open on all firewalls between the cluster and the remote machine you are attempting
to synchronize to.
Red Hat Documentation Related to NTP
These links were current as of the last publication of the Vertica documentation and
could change between releases:
l http://kbase.redhat.com/faq/docs/DOC-6731
l http://kbase.redhat.com/faq/docs/DOC-6902
l http://kbase.redhat.com/faq/docs/DOC-6991
Enabling chrony for Red Hat 7/CentOS 7 Systems
Before you can install Vertica, you must enable chrony on your system for clock
synchronization. This implementation of the Network Time Protocol (NTP) must be both
enabled and active at the time of installation. If chrony is not enabled and active at the
time of installation, the installer reports this issue with the identifier S0030.
The chrony suite consists of two parts. The daemon, chronyd, is used for clock
synchronization. The command-line utility, chronyc, is used to configure the settings in
chronyd.
Install chrony
chrony is installed by default on some versions of Red Hat/CentOS 7. However, if
chrony is not installed on your system, you must install it. To install chrony, run
the following command as sudo or root:
# yum install chrony
Verify That chrony Is Running
To view the status of the chronyd daemon, run the following command:
$ systemctl status chronyd
If chrony is running, output similar to the following appears:
chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled)
Active: active (running) since Mon 2015-07-06 16:29:54 EDT; 15s ago
Main PID: 2530 (chronyd)
CGroup: /system.slice/chronyd.service
└─2530 /usr/sbin/chronyd -u chrony
If chrony is not running, execute the following commands as sudo or root. These
commands start chrony and cause it to run at boot time:
# systemctl enable chronyd
# systemctl start chronyd
Verify That chrony Is Operating Correctly
To verify that the chrony daemon is operating correctly, issue the following command on
all nodes in the cluster:
$ chronyc tracking
Output similar to the following appears:
Reference ID : 198.247.63.98 (time01.website.org)
Stratum : 3
Ref time (UTC) : Thu Jul 9 14:58:01 2015
System time : 0.000035685 seconds slow of NTP time
Last offset : -0.000151098 seconds
RMS offset : 0.000279871 seconds
Frequency : 2.085 ppm slow
Residual freq : -0.013 ppm
Skew : 0.185 ppm
Root delay : 0.042370 seconds
Root dispersion : 0.022658 seconds
Update interval : 1031.0 seconds
Leap status : Normal
A stratum level of 16 indicates that chrony is not synchronizing correctly. If chrony
continues to report a stratum level of 16, verify that the NTP port (UDP port 123) is
open. This port must be open on all firewalls between the cluster and the remote
machine to which you are attempting to synchronize.
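To see which time sources chronyd is currently using, and whether one has been
selected for synchronization (marked with *), you can also run:
$ chronyc sources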
Red Hat Documentation Related to chrony
These links to Red Hat documentation were current as of the last publication of the
Vertica documentation. Be aware that they could change between releases:
l Configuring NTP Using the chrony Suite
l Using chrony
SELinux Configuration
Vertica does not support SELinux except when SELinux is running in permissive mode.
If the installer detects that SELinux is installed and the mode cannot be determined, it
reports this issue with the identifier: S0080. If the mode can be determined, and the
mode is not permissive, then the issue is reported with the identifier: S0081.
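You can check the current SELinux mode at any time with the getenforce command; the
output is Enforcing, Permissive, or Disabled:
$ getenforce
Permissive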
Red Hat and SUSE Systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
1. Edit /etc/selinux/config and change the SELINUX setting to disabled
(SELINUX=disabled). This disables SELinux at boot time.
2. As root/sudo, type setenforce 0 to disable SELinux immediately.
To change SELinux to use permissive mode:
1. Edit /etc/selinux/config and change the SELINUX setting to permissive
(SELINUX=permissive).
2. As root/sudo, type setenforce Permissive to switch to permissive mode
immediately.
Ubuntu and Debian Systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
1. Edit /etc/selinux/config and change the SELINUX setting to disabled
(SELINUX=disabled). This disables SELinux at boot time.
2. As root/sudo, type setenforce 0 to disable SELinux immediately.
To change SELinux to use permissive mode:
1. Edit /etc/selinux/config and change the SELINUX setting to permissive
(SELINUX=permissive).
2. As root/sudo, type setenforce Permissive to switch to permissive mode
immediately.
CPU Frequency Scaling
This topic details the various CPU frequency scaling methods supported by Vertica. In
general, if you do not require CPU frequency scaling, then disable it so as not to impact
system performance.
Important: Your systems may use significantly more energy when frequency scaling
is disabled.
The installer allows CPU frequency scaling to be enabled when the cpufreq scaling
governor is set to performance. If the CPU scaling governor is set to ondemand, and
ignore_nice_load is 1 (true), then the installer fails with the error S0140. If the CPU
scaling governor is set to ondemand and ignore_nice_load is 0 (false), then the
installer warns with the identifier S0141.
CPU frequency scaling is a hardware and software feature that helps computers
conserve energy by slowing the processor when the system load is low, and speeding it
up again when the system load increases. This feature can impact system performance,
since raising the CPU frequency in response to higher system load does not occur
instantly. Always disable this feature on the Vertica database hosts to prevent it from
interfering with performance.
You disable CPU scaling in your host's system BIOS. There may be multiple settings in
your host's BIOS that you need to adjust in order to completely disable CPU frequency
scaling. Consult your host hardware's documentation for details on entering the system
BIOS and disabling CPU frequency scaling.
If you cannot disable CPU scaling through the system BIOS, you can limit the impact of
CPU scaling by disabling the scaling through the Linux kernel or setting the CPU
frequency governor to always run the CPU at full speed.
Caution: This method is not reliable, as some hardware platforms may ignore the
kernel settings. The only reliable method is to disable CPU scaling in BIOS.
The method you use to disable frequency scaling depends on the CPU scaling method
being used in the Linux kernel. See your Linux distribution's documentation for
instructions on disabling scaling in the kernel or changing the CPU governor.
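On kernels that expose cpufreq through sysfs, you can check the active governor for
each CPU and, as root, switch it to performance. This is a sketch only; the sysfs paths
may differ on your platform, and setting the governor is not a substitute for disabling
frequency scaling in the BIOS:
# show the current governor for the first CPU
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# as root, set all CPUs to the performance governor
for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$gov"
done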
Enabling or Disabling Transparent Hugepages
You can modify transparent hugepages so that the configuration meets Vertica
requirements.
l For Red Hat 7/CentOS 7 systems, you must enable transparent hugepages. The
installer reports this issue with the identifier: S0312.
l For all other systems, you must disable transparent hugepages or set them to
madvise. The installer reports this issue with the identifier: S0310.
Disable Transparent Hugepages on Red Hat 6/CentOS 6 Systems
Important: If you are using Red Hat 7/CentOS 7, you must enable, rather than
disable transparent hugepages. See: Enable Transparent Hugepages on Red Hat
7/CentOS 7 Systems.
Determine if transparent hugepages is enabled. To do so, run the following command.
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting.
If you are not using madvise or never as your transparent hugepage setting, then you
can disable transparent hugepages in one of two ways:
l Edit your boot loader (for example /etc/grub.conf). Typically, you add the
following to the end of the kernel line. However, consult the documentation for your
system before editing your boot loader configuration.
transparent_hugepage=never
l Edit /etc/rc.local and add the following script.
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
For systems that do not support /etc/rc.local, use the equivalent startup script that is
run after the destination runlevel has been reached. For example SuSE uses
/etc/init.d/after.local.
Regardless of which approach you choose, you must reboot your system for the setting
to take effect, or run the following echo line to proceed with the install without rebooting:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Enable Transparent Hugepages on Red Hat 7/CentOS 7 Systems
Determine if transparent hugepages is enabled. To do so, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting.
For systems that do not support /etc/rc.local, use the equivalent startup script that is
run after the destination runlevel has been reached. For example SuSE uses
/etc/init.d/after.local.
You can enable transparent hugepages by editing /etc/rc.local and adding the
following script:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo always > /sys/kernel/mm/transparent_hugepage/enabled
fi
You must reboot your system for the setting to take effect, or, as root, run the following
echo line to proceed with the install without rebooting:
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
Disable Transparent Hugepages on Other Systems
Note: SuSE did not offer transparent hugepage support in its initial 11.0 release.
However, subsequent SuSE service packs do include support for transparent
hugepages.
To determine if transparent hugepages is enabled, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting. Depending on your platform OS,
the madvise setting may not be displayed.
You can disable transparent hugepages in one of two ways:
l Edit your boot loader (for example /etc/grub.conf). Typically, you add the
following to the end of the kernel line. However, consult the documentation for your
system before editing your bootloader configuration.
transparent_hugepage=never
l Edit /etc/rc.local (on systems that support rc.local) and add the following script.
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
For systems that do not support /etc/rc.local, use the equivalent startup script that is
run after the destination runlevel has been reached. For example SuSE uses
/etc/init.d/after.local.
Regardless of which approach you choose, you must reboot your system for the setting
to take effect, or run the following echo line to proceed with the install without
rebooting:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Disabling Defrag for Red Hat and CentOS Systems
On all Red Hat and CentOS systems, you must disable transparent hugepage
defragmentation (defrag) to meet Vertica configuration requirements. The steps
necessary to disable defrag on Red Hat 6/CentOS 6 systems differ from those used to
disable defrag on Red Hat 7/CentOS 7 systems.
Disable Defrag on Red Hat 6/CentOS 6 Systems
1. Determine if defrag is enabled by running the following command:
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
[always] madvise never
The setting returned in brackets is your current setting. If you are not using madvise
or never as your defrag setting, then you must disable defrag.
2. Edit /etc/rc.local, and add the following script:
if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi
You must reboot your system for the setting to take effect, or run the following echo
line to proceed with the install without rebooting:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
Disable Defrag on Red Hat 7/CentOS 7 Systems
1. Determine if defrag is enabled by running the following command:
cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never
The setting returned in brackets is your current setting. If you are not using madvise
or never as your defrag setting, then you must disable defrag.
2. Edit /etc/rc.local, and add the following script:
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
You must reboot your system for the setting to take effect, or run the following echo
line to proceed with the install without rebooting:
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
I/O Scheduling
This topic details how to change I/O Scheduling to a supported scheduler. Vertica
requires that I/O Scheduling be set to deadline or noop. If the installer detects that the
system is using an unsupported scheduler, then it reports this issue with the identifier:
S0150. If the installer cannot detect the type of scheduler that the system uses (typically
if your system is using a RAID array) then it reports the issue with identifier: S0151.
If your system is not using a RAID array, then complete the steps below to change your
I/O scheduler to a supported scheduler. If you are using a RAID array, consult the
documentation from your RAID vendor for the best-performing scheduler for your
hardware.
Configure the I/O Scheduler
The Linux kernel can use several different I/O schedulers to prioritize disk input and
output. Most Linux distributions use the Completely Fair Queuing (CFQ) scheme by
default, which gives input and output requests equal priority. This scheduler is efficient
on systems running multiple tasks that need equal access to I/O resources. However, it
can create a bottleneck when used on Vertica drives containing the catalog and data
directories, since it gives write requests equal priority to read requests, and its per-
process I/O queues can penalize processes making more requests than other
processes.
Instead of the CFQ scheduler, configure your hosts to use either the Deadline or NOOP
I/O scheduler for the drives containing the catalog and data directories:
l The Deadline scheduler gives priority to read requests over write requests. It also
imposes a deadline on all requests. After reaching the deadline, such requests gain
priority over all other requests. This scheduling method helps prevent processes
from becoming starved for I/O access. The Deadline scheduler is best used on
physical media drives (disks using spinning platters), since it attempts to group
requests for adjacent sectors on a disk, lowering the time the drive spends seeking.
l The NOOP scheduler uses a simple FIFO approach, placing all input and output
requests into a single queue. This scheduler is best used on solid state drives
(SSDs). Since SSDs do not have a physical read head, no performance penalty
exists when accessing non-adjacent sectors.
Failure to use one of these schedulers for the Vertica drives containing the catalog and
data directories can result in slower database performance. Other drives on the system
(such as the drive containing swap space, log files, or the Linux system files) can still
use the default CFQ scheduler (although you should always use the NOOP scheduler
for SSDs).
There are two ways to set the scheduler used by your disk devices:
l Write the name of the scheduler to a file in the /sys directory.
l Use a kernel boot parameter.
Configure the I/O Scheduler - Changing the Scheduler Through the /sys Directory
You can view and change the scheduler Linux uses for I/O requests to a single drive
using a virtual file under the /sys directory. The name of the file that controls the
scheduler a block device uses is:
/sys/block/deviceName/queue/scheduler
Where deviceName is the name of the disk device, such as sda or cciss!c0d1 (the
first disk on an HPE RAID array). Viewing the contents of this file shows you all of the
possible settings for the scheduler, with the currently-selected scheduler surrounded by
square brackets:
# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
To change the scheduler, write the name of the scheduler you want the device to use to
its scheduler file. You must have root privileges to write to this file. For example, to set
the sda drive to use the deadline scheduler, run the following command as root:
# echo deadline > /sys/block/sda/queue/scheduler
# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
Changing the scheduler immediately affects the I/O requests for the device. The Linux
kernel starts using the new scheduler for all of the drive's input and output requests.
Note: While tests have shown no problems are caused by changing the scheduler
settings while Vertica is running, you should strongly consider shutting down any
running Vertica database before changing the I/O scheduler or making any other
changes to the system configuration.
Changes to the I/O scheduler made through the /sys directory only last until the system
is rebooted, so you need to add the commands that change the I/O scheduler to a
startup script (such as those stored in /etc/init.d, or through a command in
/etc/rc.local). You also need to use a separate command for each drive on the
system whose scheduler you want to change.
For example, the following commands make the configuration take effect immediately
and add it to rc.local so that it is used on subsequent reboots.
Note: For systems that do not support /etc/rc.local, use the equivalent startup
script that is run after the destination runlevel has been reached. For example SuSE
uses /etc/init.d/after.local.
echo deadline > /sys/block/sda/queue/scheduler
echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local
Note: On some Ubuntu/Debian systems, the last line in rc.local must be "exit 0".
In that case, you must manually add the preceding lines to /etc/rc.local before the
last line with exit 0.
You may prefer to use this method of setting the I/O scheduler over using a boot
parameter if your system has a mix of solid-state and physical media drives, or has
many drives that do not store Vertica catalog and data directories.
Configure the I/O Scheduler - Changing the Scheduler with a Boot Parameter
Use the elevator kernel boot parameter to change the default scheduler used by all
disks on your system. This is the best method to use if most or all of the drives on your
hosts are of the same type (physical media or SSD) and will contain catalog or data
files. You can also use the boot parameter to change the default to the scheduler the
majority of the drives on the system need, then use the /sys files to change individual
drives to another I/O scheduler. The format of the elevator boot parameter is:
elevator=schedulerName
Where schedulerName is deadline, noop, or cfq. You set the boot parameter using
your bootloader (grub or grub2 on most recent Linux distributions). See your
distribution's documentation for details on how to add a kernel boot parameter.
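For example, on a Red Hat 7/CentOS 7 system using grub2, the change amounts to
appending the parameter to GRUB_CMDLINE_LINUX and regenerating the grub
configuration. This is a sketch only; file locations vary by distribution:
# /etc/default/grub: add elevator=deadline to the existing GRUB_CMDLINE_LINUX line
GRUB_CMDLINE_LINUX="... elevator=deadline"
# then, as root, regenerate the configuration and reboot
grub2-mkconfig -o /boot/grub2/grub.cfg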
Support Tools
Vertica recommends that you install the following tools so that support can assist in
troubleshooting your system if any issues arise:
l pstack (or gstack) package. Identified by issue S0040 when not installed.
n On Red Hat 7 and CentOS 7 systems, the pstack package is installed as part of
the gdb package.
l mcelog package. Identified by issue S0041 when not installed.
l sysstat package. Identified by issue S0045 when not installed.
Red Hat 6 and CentOS 6 Systems
To install the required tools on Red Hat 6 and CentOS 6 systems, run the following
commands as sudo or root:
yum install pstack
yum install mcelog
yum install sysstat
Red Hat 7 and CentOS 7 Systems
To install the required tools on Red Hat 7/CentOS 7 systems, run the following
commands as sudo or root:
yum install gdb
yum install mcelog
yum install sysstat
Ubuntu and Debian Systems
To install the required tools on Ubuntu and Debian systems, run the following
commands as sudo or root:
apt-get install pstack
apt-get install mcelog
apt-get install sysstat
SuSE Systems
To install the required tools on SuSE systems, run the following commands as sudo or
root.
zypper install sysstat
zypper install mcelog
There is no individual SuSE package for pstack/gstack. However, the gdb package
contains gstack, so you could optionally install gdb instead, or build pstack/gstack from
source. To install the gdb package:
zypper install gdb
System User Configuration
The following tasks pertain to the configuration of the system user required by Vertica.
System User Requirements
Vertica has specific requirements for the system user that runs and manages Vertica. If
you specify a user during install, but the user does not exist, then the installer reports
this issue with the identifier: S0200.
System User Requirement Details
Vertica requires a system user to own database files and run database processes and
administration scripts. By default, the install script automatically configures and creates
this user for you with the username dbadmin. See About Linux Users Created by Vertica
and Their Privileges for details on the default user created by the install script. If you
decide to manually create your own system user, then you must create the user before
you run the install script. If you manually create the user:
Note: Instances of dbadmin and verticadba are placeholders for the names you
choose if you do not use the default values.
l the user must have the same username and password on all nodes
l the user must use the BASH shell as the user's default shell. If not, then the installer
reports this issue with identifier [S0240].
l the user must be in the verticadba group (for example: usermod -a -G verticadba
userNameHere). If not, the installer reports this issue with identifier [S0220].
Note: You must create a verticadba group on all nodes. If you do not, then the
installer reports the issue with identifier [S0210].
l the user's login group must be either verticadba or a group with the same name as
the user (for example, the home group for dbadmin is dbadmin). You can check the
groups for a user with the id command. For example: id dbadmin. The "gid" group is
the user's primary group. If this is not configured correctly then the installer reports
this issue with the identifier [S0230]. Vertica recommends that you use verticadba as
the user's primary login group. For example: usermod -g verticadba
userNameHere. If the user's primary group is not verticadba as suggested, then the
installer reports this with HINT [S0231].
l the user must have a home directory. If not, then the installer reports this issue with
identifier [S0260].
l the user's home directory must be owned by the user. If not, then the installer reports
the issue with identifier [S0270].
l the system must be aware of the user's home directory (you can set it with the
usermod command: usermod -m -d /path/to/new/home/dir userNameHere). If
this is not configured correctly then the installer reports the issue with [S0250].
l the user's home directory must be owned by the dbadmin's primary group (use the
chown and chgrp commands if necessary). If this is not configured correctly, then the
installer reports the issue with identifier [S0280].
l the user's home directory should have secure permissions. Specifically, it should not
be writable by anyone or by the group. Ideally the permissions should be, when
viewing with ls, "---" (nothing), or "r-x" (read and execute). If this is not configured
as suggested then the installer reports this with HINT [S0290].
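Taken together, these requirements can be met with a few standard commands. The
following is a minimal sketch using the default dbadmin and verticadba names; run it
as root on every node, using the same password everywhere:
# create the group, then the user with verticadba as the primary group,
# a home directory, and the BASH shell
groupadd verticadba
useradd -g verticadba -m -s /bin/bash dbadmin
passwd dbadmin
# the home directory must not be writable by the group or by others
chmod go-w /home/dbadmin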
TZ Environment Variable
This topic details how to set or change the TZ environment variable and update your
tzdata package. If this variable is not set, then the installer reports this issue with the
identifier: S0305.
Before installing Vertica, update the tzdata package for your system and set the default
time zone for your database administrator account by specifying the TZ environment
variable. If your database administrator account is created by the install_vertica
script, then set the TZ variable after you have installed Vertica.
Update Tzdata Package
The tzdata package is a public-domain time zone database that is pre-installed on most
Linux systems. The tzdata package is updated periodically for time-zone changes across
the world. HPE recommends that you update to the latest tzdata package before
installing or updating Vertica.
Update your tzdata package with the appropriate command for your platform:
l For RedHat based systems: yum update tzdata
l For Debian and Ubuntu systems: apt-get install tzdata
Setting the Default Time Zone
When a client receives the result set of a SQL query, all rows contain data adjusted, if
necessary, to the same time zone. That time zone is the default time zone of the initiator
node unless the client explicitly overrides it using the SQL SET TIME ZONE command
described in the SQL Reference Manual. The default time zone of any node is
controlled by the TZ environment variable. If TZ is undefined, the operating system time
zone is used.
Important: The TZ variable must be set to the same value on all nodes in the cluster.
If your operating system time zone is not set to the desired time zone of the database,
make sure that the Linux environment variable TZ is set to the desired value on all
cluster hosts.
The installer returns a warning if the TZ variable is not set. If your operating system
timezone is appropriate for your database, then the operating system timezone is used
and the warning can be safely ignored.
Setting the Time Zone on a Host
Important: If you explicitly set the TZ environment variable at a command line
before you start the Administration Tools, the current setting will not take effect. The
Administration Tools uses SSH to start copies on the other nodes, so each time SSH
is used, the TZ variable for the startup command is reset. TZ must be set in the
.profile or .bashrc files on all nodes in the cluster to take effect properly.
You can set the time zone several different ways, depending on the Linux distribution or
the system administrator’s preferences.
l To set the system time zone on Red Hat and SUSE Linux systems, edit:
/etc/sysconfig/clock
l To set the TZ variable, edit /etc/profile, /home/dbadmin/.bashrc, or
/home/dbadmin/.bash_profile and add the following line (for example, for the US
Eastern Time Zone):
export TZ="America/New_York"
For details on which time zone names are recognized by Vertica, see the appendix:
Using Time Zones With Vertica.
LANG Environment Variable Settings
This topic details how to set or change the LANG environment variable. The
LANG environment variable controls the locale of the host. If this variable is not set, then
the installer reports this issue with the identifier: S0300. If this variable is not set to a
valid value, then the installer reports this issue with the identifier: S0301.
Set the Host Locale
Each host has a system setting for the Linux environment variable LANG. LANG
determines the locale category for native language, local customs, and coded character
set in the absence of the LC_ALL and other LC_ environment variables. LANG can be
used by applications to determine which language to use for error messages and
instructions, collating sequences, date formats, and so forth.
To change the LANG setting for the database administrator, edit /etc/profile,
/home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile on all cluster hosts and
set the environment variable; for example:
export LANG=en_US.UTF-8
The LANG setting controls the following in Vertica:
l OS-level errors and warnings, for example, "file not found" during COPY operations.
l Some formatting functions, such as TO_CHAR and TO_NUMBER. See also
Template Patterns for Numeric Formatting.
The LANG setting does not control the following:
l Vertica-specific error and warning messages. These are always in English at this
time.
l Collation of results returned by SQL issued to Vertica. This must be set using a
database parameter instead. See the Implement Locales for International Data Sets
section in the Administrator's Guide for details.
Note: If the LC_ALL environment variable is set, it supersedes the setting of LANG.
Package Dependencies
For a successful Vertica installation, you must install three packages on all nodes in
your cluster before installing the database platform.
The required packages are:
l openssh—Required for Administration Tools connectivity between nodes.
l which—Required for Vertica operating system integration and for validating
installations.
l dialog—Required for interactivity with Administration Tools.
Installing the Required Packages
The procedure you follow to install the required packages depends on the operating
system on which your node or cluster is running. See your operating system's
documentation for detailed information on installing packages.
l For CentOS/Red Hat Systems—Typically, you manage packages on Red Hat and
CentOS systems using the yum utility.
Run the following yum commands to install each of the package dependencies. The
yum utility guides you through the installation:
# yum install openssh
# yum install which
# yum install dialog
l For Debian/Ubuntu Systems—Typically, you use the apt-get utility to manage
packages on Debian and Ubuntu systems.
Run the following apt-get commands to install each of the package dependencies. The
apt-get utility guides you through the installation:
# apt-get install openssh-server
# apt-get install debianutils
# apt-get install dialog
Installing Vertica
There are different paths you can take when installing Vertica. You can:
l Install Vertica on one or more hosts using the command line, and not use the
Management Console.
l Install the Management Console, and from the Management Console install Vertica
on one or more hosts by using the Management Console cluster creation wizard.
l Install Vertica on one or more hosts using the command line, then install the
Management Console and import the cluster to be managed.
Installing Using the Command Line
Although HPE supports installation on one node, two nodes, and multiple nodes, this
section describes how to install the Vertica software on a cluster of nodes. It assumes
that you have already performed the tasks in Before You Install Vertica, and that you
have a Vertica license key.
To install Vertica, complete the following tasks:
1. Download and install the Vertica server package
2. Installing Vertica with the install_vertica Script
Special Notes
l Downgrade installations are not supported.
l Be sure that you download the RPM for the correct operating system and
architecture.
l Vertica supports two-node clusters with zero fault tolerance (K=0 safety). This
means that you can add a node to a single-node cluster, as long as the installation
node (the node upon which you build) is not the loopback node
(localhost/127.0.0.1).
l The Version 7.0 installer introduced platform verification tests that prevent the
install from continuing if your system does not meet the platform requirements.
Manually verify that your system meets the requirements in Before You Install
Vertica. These tests ensure that your platform meets the hardware and software
requirements for Vertica. Previous versions documented these requirements, but
the installer did not verify all of the settings. If this is a fresh install, you can
simply run the installer and view the list of failures and warnings to determine
which configuration changes you must make.
Back Up Existing Databases
If you are doing an upgrade installation, back up the following for all existing databases:
l The Catalog and Data directories, using the Vertica backup utility. See Backing Up
and Restoring the Database in the Administrator's Guide.
l /opt/vertica/, using manual methods. For example:
a. Enter the command:
tar -czvf /tmp/vertica.tgz /opt/vertica
b. Copy the tar file to a backup location.
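If you later need to restore from this archive, a command along the following lines restores /opt/vertica (tar stores the path without the leading slash, so extract relative to /):
# tar -xzvf /tmp/vertica.tgz -C /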
Backing up MC
Before you upgrade MC, HPE recommends that you back up your MC metadata
(configuration and user settings). Use a storage location external to the server on which
you installed MC.
1. On the target server (where you want to store MC metadata), log in as root or a user
with sudo privileges.
2. Create a backup directory as in following example:
# mkdir /backups/mc/mc-backup-20130425
3. Copy the /opt/vconsole directory to the new backup folder:
# cp -r /opt/vconsole /backups/mc/mc-backup-20130425
After you have completed the backup tasks, proceed to Upgrading Vertica to a New
Version.
Download and Install the Vertica Server Package
To Download and Install the Vertica Server Package:
1. Use a Web browser to log in to the myVertica portal.
2. Click the Download tab and download the Vertica server package to the
Administration Host.
Be sure the package you download matches the operating system and the machine
architecture on which you intend to install it. In the event of a node failure, you can
use any other node to run the Administration Tools later.
3. If you installed a previous version of Vertica on any of the hosts in the cluster, use
the Administration Tools to shut down any running database.
The database must stop normally; you cannot upgrade a database that requires
recovery.
4. If you are using sudo, skip to the next step. If you are root, log in to the
Administration Host as root (or log in as another user and switch to root).
$ su - root
password: root-password
#
Caution: When installing Vertica using an existing user as the dba, you must
exit all UNIX terminal sessions for that user after setup completes and log in
again to ensure that group privileges are applied correctly.
After Vertica is installed, you no longer need root privileges. To verify sudo, see
General Hardware and OS Requirements and Recommendations.
5. Use one of the following commands to run the RPM package installer:
n If you are root and installing an RPM:
# rpm -Uvh pathname
n If you are using sudo and installing an RPM:
$ sudo rpm -Uvh pathname
n If you are using Debian, replace rpm -Uvh with dpkg -i
where pathname is the Vertica package file you downloaded.
Note: If the package installer reports multiple dependency problems, or you
receive the error "ERROR: You're attempting to install the wrong RPM for this
operating system", then you are trying to install the wrong Vertica server
package. Make sure that the machine architecture (32-bit or 64-bit) of the
package you downloaded matches the operating system.
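To confirm which Vertica server package is now installed, you can query the package manager. For example, on RPM-based systems:
# rpm -qa | grep vertica
On Debian and Ubuntu systems:
$ dpkg -l | grep vertica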
Installing Vertica with the install_vertica Script
About the Installation Script
You run the install script after you have installed the Vertica package. The install script
is run on a single node, using a Bash shell, and it copies the Vertica package to all other
hosts (identified by the --hosts argument) in your planned cluster.
The install script runs several tests on each of the target hosts to verify that the hosts
meet the system and performance requirements for a Vertica node. The install script
modifies some operating system configuration settings to meet these requirements.
Other settings cannot be modified by the install script and must be manually reconfigured.
The installation script takes the following basic parameters:
l A list of hosts on which to install.
l Optionally, the Vertica RPM/DEB path and package file name if you have not pre-
installed the server package on other potential hosts in the cluster.
l Optionally, a system user name. If you do not provide a user name, then the install
script creates a new system user named dbadmin. If you do provide a username and
the username does not exist on the system, then the install script creates that user.
For example:
# /opt/vertica/sbin/install_vertica --hosts node01,node02,node03 --rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm --dba-user mydba
Note: The install script sets up passwordless ssh for the administrator user across
all the hosts. If passwordless ssh is already set up, the install script verifies that it is
functioning correctly.
To Perform a Basic Install of Vertica:
1. As root (or sudo) run the install script. The script must be run in a BASH shell as root
or as a user with sudo privileges. There are many options you can configure when
running the install script. See install_vertica Options below for the complete list of
options.
If the installer fails due to any requirements not being met, you can correct the issue
and then re-run the installer with the same command line options.
To perform a basic install:
n As root:
# /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name --dba-user dba_username
n Using sudo:
$ sudo /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name --dba-user dba_username
Basic Installation Parameters
Option Description
--hosts host_list
A comma-separated list of IP addresses to include in the
cluster; do not include space characters in the list.
Examples:
--hosts 127.0.0.1
--hosts 192.168.233.101,192.168.233.102,192.168.233.103
Note: Vertica stores only IP addresses in its configuration files. You can provide a hostname to the --hosts parameter, but it is immediately converted to an IP address when the script is run.
--rpm package_name
--deb package_name
The path and name of the Vertica RPM package. Example:
--rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm
For Debian and Ubuntu installs, provide the name of the
Debian package, for example:
--deb /tmp/vertica_7.2.x86.deb
--dba-user dba_username
The name of the Database Administrator system account to
create. Only this account can run the Administration Tools. If
you omit the --dba-user parameter, then the default
database administrator account name is dbadmin.
This parameter is optional for new installations done as root
but must be specified when upgrading or when installing
using sudo. If upgrading, use the -u parameter to specify
the same DBA account name that you used previously. If
installing using sudo, the user must already exist.
Note: If you manually create the user, modify the user's
.bashrc file to include the line:
PATH=/opt/vertica/bin:$PATH so that the Vertica
tools such as vsql and admintools can be easily started
by the dbadmin user.
2. When prompted for a password to log into the other nodes, provide the requested
password. This allows the installation of the package and system configuration on
the other cluster nodes. If you are root, this is the root password. If you are using
sudo, this is the sudo user password. The password does not echo on the command
line. For example:
Vertica Database 7.0 Installation Tool
Please enter password for root@host01:password
3. If the dbadmin user, or the user specified in the argument --dba-user, does not
exist, then the install script prompts for the password for the user. Provide the
password. For example:
Enter password for new UNIX user dbadmin:password
Retype new UNIX password for user dbadmin:password
4. Carefully examine any warnings or failures returned by install_vertica and
correct the problems.
For example, insufficient RAM, insufficient network throughput, and overly high readahead settings on the file system could cause performance problems later on. Additionally, LANG warnings, if not resolved, can cause database startup to fail and can cause issues with vsql. The system LANG attributes must be UTF-8 compatible. Once
you fix the problems, re-run the install script.
5. Once installation is successful, disconnect from the Administration Host, as
instructed by the script; then complete the required post-installation steps.
At this point, root privileges are no longer needed and the database administrator
can perform any remaining steps.
To Complete Required Post-install Steps:
1. Log in to the Database Administrator account on the administration host.
2. Install the License Key
3. Accept the EULA.
4. If you have not already done so, proceed to Getting Started. Otherwise, proceed to
Configuring the Database in the Administrator's Guide.
install_vertica Options
The table below details all of the options available to the install_vertica script. Most
options have a long and short form. For example --hosts is interchangeable with -s.
The only required options are --hosts/-s and --rpm/--deb/-r.
Option
(long form, short form)
Description
--help
Display help for this script.
--hosts host_list,
-s host_list
A comma-separated list of host names or IP
addresses to include in the cluster. Do not
include spaces in the list. The IP addresses or
hostnames must be for unique hosts. You
cannot list the same host using multiple IP
addresses/hostnames.
Examples:
--hosts host01,host02,host03
-s 192.168.233.101,192.168.233.102,192.168.233.103
Note: If you are upgrading an existing
installation of Vertica, be sure to use the
same host names that you used previously.
--rpm package_name,
--deb package_name,
-r package_name
The name of the RPM or Debian package. The
install package must be provided if you are
installing or upgrading multiple nodes and the
nodes do not have the latest server package
installed, or if you are adding a new node. The
install_vertica and update_vertica
scripts serially copy the server package to the
other nodes and install the package. If you are
installing or upgrading a large number of
nodes, then consider manually installing the
package on all nodes before running the
upgrade script, as the script runs faster if it
does not need to serially upload and install the
package on each node.
Example:
--rpm vertica_7.2.x.x86_64.RHEL6.rpm
--data-dir data_directory,
-d data_directory
The default directory for database data and
catalog files. The default is /home/dbadmin.
Note: Do not use a shared directory over
more than one host for this setting. Data
and catalog directories must be distinct for
each node. Multiple nodes must not be
allowed to write to the same data or catalog
directory.
--temp-dir directory
The temporary directory used for administrative
purposes. If it is a directory within
/opt/vertica, then it will be created by the
installer. Otherwise, the directory should
already exist on all nodes in the cluster. The
location should allow dbadmin write privileges.
The default is /tmp.
Note: This is not a temporary data location
for the database.
--dba-user dba_username,
-u dba_username
The name of the Database Administrator
system account to create. Only this account
can run the Administration Tools. If you omit
the --dba-user parameter, then the default
database administrator account name is
dbadmin.
This parameter is optional for new installations
done as root but must be specified when
upgrading or when installing using sudo. If
upgrading, use the -u parameter to specify the
same DBA account name that you used
previously. If installing using sudo, the user
must already exist.
Note: If you manually create the user,
modify the user's .bashrc file to include the
line: PATH=/opt/vertica/bin:$PATH so
that the Vertica tools such as vsql and
admintools can be easily started by the
dbadmin user.
--dba-group GROUP,
-g GROUP
The UNIX group for DBA users. The default is
verticadba.
--dba-user-home dba_home_directory,
-l dba_home_directory
The home directory for the database
administrator. The default is /home/dbadmin.
--dba-user-password
dba_password,
-p dba_password
The password for the database administrator
account. If not supplied, the script prompts for a
password and does not echo the input.
--dba-user-password-disabled
Disable the password for the --dba-user.
This argument stops the installer from
prompting for a password for the --dba-user.
You can assign a password later using
standard user management tools such as
passwd.
--spread-logging,
-w
Configures spread to write its logging output to /opt/vertica/log/spread_<hostname>.log. Does not apply to upgrades.
Note: Do not enable this logging unless
directed to by Vertica Analytics Platform
Technical Support.
--ssh-password password,
-P password
The password to use by default for each
cluster host. If not supplied, and the -i option is
not used, then the script prompts for the
password if and when necessary and does not
echo the input. Do not use with the -i option.
Special note about password:
If you run the install_vertica script as root,
specify the root password with the -P
parameter:
# /opt/vertica/sbin/install_vertica -P <root_passwd>
If, however, you run the install_vertica
script with the sudo command, the password
for the -P parameter should be the password of
the user who runs install_vertica, not the
root password. For example if user dbadmin
runs install_vertica with sudo and has a
password with the value dbapasswd, then the
value for -P should be dbapasswd:
$ sudo /opt/vertica/sbin/install_vertica -P dbapasswd
--ssh-identity file,
-i file
The root private-key file to use if passwordless
ssh has already been configured between the
hosts. Verify that normal SSH works without a
password before using this option. The file can be a private key file (for example, id_rsa) or a PEM file. Do not use with the --ssh-password/-P option.
Vertica accepts the following:
l By providing an SSH private key which is
not password protected. You cannot run the
install_vertica script with the sudo
command when using this method.
l By providing a password-protected private
key and using an SSH-Agent. Note that
sudo typically resets environment variables
when it is invoked. Specifically, the SSH_AUTH_SOCK variable required by the SSH agent may be reset. Therefore, configure your system to maintain SSH_AUTH_SOCK, or invoke the install_vertica command using a method similar to the following:
sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK /opt/vertica/sbin/install_vertica ...
--config-file file,
-z file
Accepts an existing properties file created by --record-config file_name. This properties
file contains key/value parameters that map to
values in the install_vertica script, many
with Boolean arguments that default to false.
--add-hosts host_list,
-A host_list
A comma-separated list of hosts to add to an
existing Vertica cluster.
--add-hosts modifies an existing installation
of Vertica by adding a host to the database
cluster and then reconfiguring the spread. This
is useful for increasing system performance or
setting K-safety to one (1) or two (2).
Notes:
l If you have used the -T parameter to
configure spread to use direct point-to-point
communication within the existing cluster,
you must use the -T parameter when you
add a new host; otherwise, the new host
automatically uses UDP broadcast traffic,
resulting in cluster communication problems
that prevent Vertica from running properly.
Examples:
--add-hosts host01
--add-hosts 192.168.233.101
l The update_vertica script described in
Adding Nodes calls the install_vertica
script to update the installation. You can use
either the install_vertica or update_
vertica script with the --add-hosts
parameter.
--record-config file_name,
-B file_name
Accepts a file name, which when used in
conjunction with command line options,
creates a properties file that can be used with
the --config-file parameter. This
parameter creates the properties file and exits;
it has no impact on installation.
--clean
Forcibly cleans previously stored configuration
files. Use this parameter if you need to change
the hosts that are included in your cluster. Only
use this parameter when no database is
defined. Cannot be used with update_
vertica.
--license { license_file | CE },
-L { license_file | CE }
Silently and automatically deploys the license
key to /opt/vertica/config/share. On
multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list. To activate your license, use the --license option with the --accept-eula option. If you do not use the --accept-eula option, you are asked to accept the EULA when you connect to your database.
Once you accept the EULA, your license is
activated.
If specified with CE, automatically deploys the
Community Edition license key, which is
included in your download. You do not need to
specify a license file.
Examples:
--license CE
--license /tmp/vlicense.dat
--remove-hosts host_list,
-R host_list
A comma-separated list of hosts to remove
from an existing Vertica cluster.
--remove-hosts modifies an existing
installation of Vertica by removing a host from
the database cluster and then reconfiguring the
spread. This is useful for removing an obsolete
or over-provisioned system. For example:
--remove-hosts host01
-R 192.168.233.101
Notes:
l If you used the -T parameter to configure
spread to use direct point-to-point
communication within the existing cluster,
you must use -T when you remove a host;
otherwise, the hosts automatically use UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly.
l The update_vertica script described in
Removing Nodes in the Administrator's
Guide calls the install_vertica script to
perform the update to the installation. You
can use either the install_vertica or
update_vertica script with the -R
parameter.
--control-network { BCAST_ADDR | default },
-S { BCAST_ADDR | default }
Takes either the value 'default' or a broadcast
network IP address (BCAST_ADDR) to allow
spread communications to be configured on a
subnet that is different from other Vertica data
communications. --control-network is also
used to force a cluster-wide spread
reconfiguration when changing spread related
options.
Note: The --control-network must
match the subnet for at least some of the
nodes in the database. If the provided
address does not match the subnet of any node in the database, then the installer displays an error and stops. If the provided address matches some, but not all, of the nodes' subnets, then a warning is
displayed, but the install continues. Ideally,
the value for --control-network should
match all node subnets.
Examples:
--control-network default
--control-network 10.20.100.255
--point-to-point,
-T
Configures spread to use direct point-to-point
communication between all Vertica nodes.
You should use this option if your nodes aren't
located on the same subnet. You should also
use this option for all virtual environment
installations, regardless of whether the virtual
servers are on the same subnet or not. The
maximum number of spread daemons
supported in point-to-point communication in
Vertica 7.1 is 80. It is possible to have more
than 80 nodes by using large cluster mode,
which does not install a spread daemon on
each node.
Cannot be used with --broadcast, as the
setting must be either --broadcast or --
point-to-point.
Important: When changing the configuration
from --broadcast (the default) to --point-
to-point or from --point-to-point to --
broadcast, the --control-network
parameter must also be used.
Note: Spread always runs on UDP. -T
does not denote TCP.
--broadcast,
-U
Specifies that Vertica use UDP broadcast
traffic by spread between nodes on the subnet.
This parameter is automatically used by
default. No more than 80 spread daemons are
supported by broadcast traffic. It is possible to
have more than 80 nodes by using large
cluster mode, which does not install a spread
daemon on each node.
Cannot be used with --point-to-point, as
the setting must be either --broadcast or --
point-to-point.
Important: When changing the configuration
from --broadcast (the default) to --point-
to-point or from --point-to-point to --
broadcast, the --control-network
parameter must also be used.
Note: Spread always runs on UDP. -U
does not mean use UDP instead of TCP.
--accept-eula,
-Y
Silently accepts the EULA agreement. On
multi-node installations, the --accept-eula
value is propagated throughout the cluster at
the end of the installation, at the same time as
the Administration Tools metadata.
Use the --accept-eula option with the --
license option to activate your license.
--no-system-configuration
By default, the installer makes system
configuration changes to meet server
requirements. If you do not want the installer to change any system properties, then use the --no-system-configuration parameter. The installer
presents warnings or failures for configuration
settings that do not meet requirements that it
normally would have automatically configured.
Note: The system user account is still
created/updated when using this
parameter.
--failure-threshold
Stops the installation when the specified
failure threshold is encountered.
Options can be one of:
l HINT - Stop the install if a HINT or greater
issue is encountered during the installation
tests. HINT configurations are settings you
should make, but the database runs with no
significant negative consequences if you
omit the setting.
l WARN (default) - Stop the installation if a
WARN or greater issue is encountered.
WARN issues may affect the performance of
the database. However, for basic testing
purposes or Community Edition users,
WARN issues can be ignored if extreme
performance is not required.
l FAIL - Stop the installation if a FAIL or
greater issue is encountered. FAIL issues
can have severely negative performance
consequences and possible later
processing issues if not addressed.
However, Vertica can start even if FAIL
issues are ignored.
l HALT - Stop the installation if a HALT or
greater issue is encountered. The database might not start if you choose this option. Not supported in production environments.
l NONE - Do not stop the installation. The
database may not start. Not supported in
production environments.
--large-cluster,
-2
[ <integer> | DEFAULT ]
Enables a large cluster layout, in which control
message responsibilities are delegated to a
subset of Vertica Analytics Platform nodes
(called control nodes) to improve control
message performance in large clusters.
Consider using this parameter with more than
50 nodes.
Options can be one of:
l <integer>—The number of control nodes
you want in the cluster. Valid values are 1 to
120 for all new databases.
l DEFAULT—Vertica Analytics Platform
chooses the number of control nodes using
calculations based on the total number of
cluster nodes in the --hosts argument.
For more information, see Large Cluster in the
Administrator's Guide.
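As an illustration of how several of these options combine, the following hypothetical command installs a three-node cluster in a virtual environment using an existing passwordless SSH key; the host names, key path, and DBA account name are placeholders:
# /opt/vertica/sbin/install_vertica --hosts host01,host02,host03 \
--rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm --dba-user mydba \
--point-to-point --ssh-identity /root/.ssh/id_rsa --failure-threshold FAIL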
Installing Vertica Silently
This section describes how to create a properties file that lets you install and deploy
Vertica-based applications quickly and without much manual intervention.
Note: The procedure assumes that you have already performed the tasks in Before
You Install Vertica.
To create and use the properties file:
1. Download and install the Vertica install package, as described in Installing Vertica.
2. Create the properties file that enables non-interactive setup by supplying the
parameters you want Vertica to use. For example:
The following command assumes a multi-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
--dba-user-password password --ssh-password password --hosts host_list --rpm package_name
The following command assumes a single-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
--dba-user-password password
Option Description
--record-config file_name
[Required] Accepts a file name, which when used in
conjunction with command line options, creates a
properties file that can be used with the --config-
file option during setup. This flag creates the
properties file and exits; it has no impact on
installation.
--license { license_file | CE }
Silently and automatically deploys the license key to
/opt/vertica/config/share. On multi-node installations,
the --license option also applies the license to all
nodes declared in the --hosts host_list.
If specified with CE, automatically deploys the
Community Edition license key, which is included in
your download. You do not need to specify a license
file.
--accept-eula
Silently accepts the EULA agreement during setup.
--dba-user-password password
The password for the Database Administrator
account; if not supplied, the script prompts for the
password and does not echo the input.
--ssh-password password
The root password to use by default for each cluster
host; if not supplied, the script prompts for the
password if and when necessary and does not echo
the input.
--hosts host_list
A comma-separated list of hostnames or IP
addresses to include in the cluster; do not include
space characters in the list.
Examples:
--hosts host01,host02,host03
--hosts 192.168.233.101,192.168.233.102,192.168.233.103
--rpm package_name
--deb package_name
The name of the RPM or Debian package that
contains this script.
Example:
--rpm vertica_7.2.x.x86_64.RHEL6.rpm
This parameter is required on multi-node installations
if the RPM or DEB package is not already installed
on the other hosts.
See Installing Vertica with the install_vertica Script for the complete set of
installation parameters.
Tip: Supply the parameters to the properties file once only. You can then install
Vertica using just the --config-file parameter, as described below.
3. Use one of the following commands to run the installation script.
n If you are root:
# /opt/vertica/sbin/install_vertica --config-file file_name
n If you are using sudo:
$ sudo /opt/vertica/sbin/install_vertica --config-file file_name
--config-file file_name accepts an existing properties file created by --
record-config file_name. This properties file contains key/value parameters
that map to values in the install_vertica script, many with boolean arguments
that default to false.
The command for a single-node install might look like this:
# /opt/vertica/sbin/install_vertica --config-file /tmp/vertica-inst.prp
4. If you did not supply a --ssh-password password parameter to the properties file,
you are prompted to provide the requested password to allow installation of the
RPM/DEB and system configuration of the other cluster nodes. If you are root, this is
the root password. If you are using sudo, this is the sudo user password. The
password does not echo on the command line.
Note: If you are root on a single-node installation, you are not prompted for a
password.
5. If you did not supply a --dba-user-password password parameter to the
properties file, you are prompted to provide the database administrator account
password.
The installation script creates a new Linux user account (dbadmin by default) with
the password that you provide.
6. Carefully examine any warnings produced by install_vertica and correct the
problems if possible. For example, insufficient RAM, insufficient network throughput, and overly high readahead settings on the file system could cause performance problems later on.
Note: You can redirect any warning output to a separate file, instead of having it display on the system. Use your platform's standard redirection mechanisms. For example: install_vertica [options] > /tmp/file 2>&1.
7. Optionally perform the following steps:
n Install the ODBC and JDBC driver.
n Install the vsql client application on non-cluster hosts.
8. Disconnect from the Administration Host as instructed by the script. This is required
to:
n Set certain system parameters correctly.
n Function as the Vertica database administrator.
At this point, Linux root privileges are no longer needed. The database administrator
can perform the remaining steps.
Note: When creating a new database, the database administrator might want to
use different data or catalog locations than those created by the installation
script. In that case, a Linux administrator might need to create those directories
and change their ownership to the database administrator.
l If you supplied the --license and --accept-eula parameters to the properties file,
then proceed to Getting Started and then see Configuring the Database in the
Administrator's Guide. Otherwise:
1. Log in to the Database Administrator account on the administration host.
2. Accept the End User License Agreement and install the license key you
downloaded previously as described in Install the License Key.
3. Proceed to Getting Started and then see Configuring the Database in the
Administrator's Guide.
Notes
l Downgrade installations are not supported.
l The following is an example of the contents of the configuration properties file:
accept_eula = True
license_file = /tmp/license.txt
record_to = file_name
root_password = password
vertica_dba_group = verticadba
vertica_dba_user = dbadmin
vertica_dba_user_password = password
Installing Vertica on Amazon Web Services (AWS)
Beginning with Vertica 6.1.x, you can use Vertica on AWS by utilizing a pre-configured
Amazon Machine Image (AMI). For details on installing and configuring a cluster on
AWS, refer to About Using Vertica on Amazon Web Services (AWS).
Installing and Configuring Management Console
This section describes how to install, configure, and upgrade Management Console
(MC). If you need to back up your instance of MC, see Backing Up MC in the
Administrator's Guide.
You can install MC before or after you install Vertica; however, consider installing
Vertica and creating a database before you install MC. After you finish configuring MC, it
automatically discovers your running database cluster, saving you the task of importing
it manually.
Before You Install MC
Each version of Vertica Management Console (MC) is compatible only with the
matching version of the Vertica server. For example, Vertica 7.1.0 server is supported
with Vertica 7.1.0 MC only. Read the following documents for more information:
l Supported Platforms document, at http://my.vertica.com/docs. The Supported
Platforms document also lists supported browsers for MC.
l Installation Overview and Checklist. Make sure you have everything ready for your
Vertica configuration.
l Before You Install Vertica. Read for required prerequisites for all Vertica
configurations, including Management Console.
Driver Requirements for Linux SuSe Distributions
The MC (vertica-console) package contains the Oracle Implementation of Java 6
JRE and requires that you install the unixODBC driver manager on SuSe Linux
platforms. unixODBC provides the needed libraries libodbc and libodbcinst.
Port Requirements
When you use MC to create a Vertica cluster, the Create Cluster Wizard uses SSH on
its default port (22).
Port 5444 is the default agent port and must be available for MC-to-node and node-to-
node communications.
Port 5450 is the default MC port and must be available for node-to-MC communications.
See Ensure Ports Are Available for more information about port and firewall
considerations.
Firewall Considerations
Make sure that a firewall or iptables is not blocking communications between the
cluster's database, Management Console, and MC's agents on each cluster node.
IP Address Requirements
If you install MC on a server outside the Vertica cluster it will be monitoring, that server
must be accessible to at least the public network interfaces on the cluster.
Disk Space Requirements
You can install MC on any node in the cluster, so there are no special disk requirements
for MC—other than disk space you would normally allocate for your database cluster.
See Disk Space Requirements for Vertica.
Time Synchronization and MC's Self-Signed Certificate
When you connect to MC through a client browser, Vertica assigns each HTTPS
request a self-signed certificate, which includes a timestamp. To increase security and
protect against password replay attacks, the timestamp is valid for several seconds only,
after which it expires.
To avoid being blocked out of MC, synchronize time on the hosts in your Vertica cluster,
and on the MC host if it resides on a dedicated server. To recover from loss or lack of
synchronization, resync system time and the Network Time Protocol. See Set Up Time
Synchronization in Installing Vertica.
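For example, on Red Hat Enterprise Linux 6 you might resynchronize a host with commands such as the following; the service and utility names vary by platform and NTP configuration:
# service ntpd stop
# ntpdate pool.ntp.org
# service ntpd start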
SSL Requirements
The openssl package must be installed on your Linux environment so SSL can be set
up during the MC configuration process. See SSL Overview in the Administrator's
Guide.
File Permission Requirements
On your local workstation, you must have at least read/write privileges on any files you
plan to upload to MC through the Cluster Installation Wizard. These files include the
Vertica server package, the license key (if needed), the private key file, and an optional
CSV file of IP addresses.
Monitor Resolution
Management Console requires a minimum resolution of 1024 x 768, but HPE
recommends higher resolutions for optimal viewing.
Installing Management Console
You can install Management Console on any node you plan to include in the Vertica
database cluster, as well as on its own, dedicated server outside the cluster.
Install Management Console on the MC Server
1. Download the MC package (vertica-console-<current-version>.<Linux-distro>) from the myVertica portal and save it to a location on the target server, such
as /tmp.
2. On the target server, log in as root or a user with sudo privileges.
3. Change directory to the location where you saved the MC package.
4. Install MC using your local Linux distribution package management system (for
example, rpm, yum, zypper, apt, dpkg).
The following command is a generic example for Red Hat 6:
# rpm -Uvh vertica-console-<current-version>.x86_64.RHEL6.rpm
The following command is a generic example for Debian and Ubuntu:
# dpkg -i vertica-console-<current-version>.deb
5. If you have stopped the database before upgrading MC, start the database again.
As the root user, use the following command:
/etc/init.d/verticad start
6. Open a browser and enter the IP address or host name of the server on which you
installed MC, as well as the default MC port 5450.
For example, you'll enter one of:
https://xx.xx.xx.xx:5450/
https://hostname:5450/
7. When the Configuration Wizard dialog box appears, proceed to Configuring MC.
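If the browser cannot reach MC, one quick check is to confirm from the MC server that a process is listening on the MC port (5450 by default). For example:
# netstat -ltn | grep 5450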
See Also
l Upgrading Management Console
Configuring MC
After you install MC, you need to configure it through a client browser connection. An
MC configuration wizard walks you through creating the Linux MC super administrator
account, storage locations, and other settings that MC needs to run. Information you
provide during the configuration process is stored in the
/opt/vconsole/config/console.properties file.
If you need to change settings after the configuration wizard ends, such as port
assignments, you can do so later through the Home > MC Settings page.
How to Configure MC
1. Open a browser session.
2. Enter the IP address or host name of the server on which you installed MC (or any
cluster node's IP/host name if you already installed Vertica), and include the default
MC port 5450. For example, you'll enter one of:
https://xx.xx.xx.xx:5450/
https://hostname:5450/
3. Follow the configuration wizard.
About Authentication for the MC Super Administrator
In the final step of the configuration process, you choose an authentication method for
the MC super administrator. You can decide to have MC authenticate the MC super administrator (in which case the process is complete), or you can choose LDAP.
If you choose LDAP, provide the following information for the newly-created MC super
administrator:
l Corporate LDAP service host (IP address or host name)
l LDAP server running port (default 389)
l LDAP DN (distinguished name) for base search/lookup/authentication criteria
At a minimum, specify the dc (domain component) field. For example: dc=vertica,
dc=com generates a unique identifier of the organization, like the corporate Web URL
vertica.com
l Default search path for the organization unit (ou)
For example: ou=sales, ou=engineering
l Search attribute for the user name (uid), common name (cn), and so on
For example, uid=jdoe, cn=Jane Doe
l Binding DN and password for the MC super administrator.
In most cases, you provide the "Bind as administrator" fields, information used to
establish the LDAP service connection for all LDAP operations, like search. Instead
of using the administrator user name and password, the MC administrator could use
his or her own LDAP credentials, as long as that user has search privileges.
If You Choose Bind Anonymously
Unless you specifically configure the LDAP server to deny anonymous binds, the
underlying LDAP protocol will not cause MC's Configure Authentication process to fail if
you choose "Bind anonymously" for the MC administrator. Before you use anonymous
bindings for LDAP authentication on MC, be sure that your LDAP server is configured to
explicitly disable/enable this option. For more information, see the article on Infusion
Technology Solutions and the OpenLDAP documentation on access control.
What Happens Next
Shortly after you click Finish, you should see a status in the browser; however, for
several seconds you might see only an empty page. During this brief period, MC runs as
the local user 'root' long enough to bind to port number 5450. Then MC switches to the
MC super administrator account that you just created, restarts MC, and displays the MC
login page.
Where to Go Next
If you are a new MC user and this is your first MC installation, you might want to
familiarize yourself with MC design. See Management Console in Vertica Concepts.
If you'd rather use MC now, the following topics in the Administrator's Guide
should help get you started:
If you want to ... See ...
Use the MC interface to install Vertica on a
cluster of hosts
Creating a Cluster Using MC
Create a new, empty Vertica database or
import an existing Vertica database cluster into
the MC interface
Managing Database Clusters
Create new MC users and map them to one or
more Vertica databases that you manage
through the MC interface
Managing Users and Privileges
(About MC Users and About MC
Privileges and Roles)
Monitor MC and one or more MC-managed
Vertica databases
Monitoring Vertica Using
Management Console
Change default port assignments or upload a
new Vertica license or SSL certificate
Managing MC Settings
Compare MC functionality to functionality that
the Administration Tools provides
Administration Tools and
Management Console
Creating a Cluster Using MC
You can use Management Console to install a Vertica cluster on hosts where Vertica
software has not been installed. The Cluster Installation wizard lets you specify the
hosts you want to include in your Vertica cluster, loads the Vertica software onto the
hosts, validates the hosts, and assembles the nodes into a cluster.
Management Console must be installed and configured before you can create a cluster
on targeted hosts. See Installing and Configuring the MC for details.
Steps Required to Install a Vertica Cluster Using MC:
l Install and configure MC
l Prepare the Hosts
l Create the private key file and copy it to your local machine
l Run the Cluster Installation Wizard
l Validate the hosts and create the cluster
l Create a new database on the cluster
Prepare the Hosts
Before you can install a Vertica cluster using the MC, you must prepare each host that
will become a node in the cluster. The cluster creation process runs validation tests
against each host before it attempts to install the Vertica software. These tests ensure
that the host is correctly configured to run Vertica.
Install Perl
The MC cluster installer uses Perl to perform the installation. Install Perl 5 on the target
hosts before performing the cluster installation. Perl is available for download from
www.perl.org.
Validate the Hosts
The validation tests provide:
l Warnings and error messages when they detect a configuration setting that conflicts
with the Vertica requirements or any performance issue
l Suggestions for configuration changes when they detect an issue
Note: The validation tests do not automatically fix all problems they encounter.
All hosts must pass validation before the cluster can be created.
If you accepted the default configuration options when installing the OS on your host,
then the validation tests will likely return errors, since some of the default options used
on Linux systems conflict with Vertica requirements. See Installing Vertica for details on
OS settings. To speed up the validation process you can perform the following steps on
the prospective hosts before you attempt to validate the hosts. These steps are based
on Red Hat Enterprise Linux and CentOS systems, but other supported platforms have
similar settings.
On each host you want to include in the Vertica cluster, you must stage the host
according to Before You Install Vertica.
Create a Private Key File
Before you can install a cluster, Management Console must be able to access the hosts
on which you plan to install Vertica. MC uses password-less SSH to connect to the
hosts and install Vertica software using a private key file.
If you already have a private key file that allows access to all hosts in the potential
cluster, you can use it in the cluster creation wizard.
Note: The private key file is required to complete the MC cluster installation wizard.
Create a Private Key File
1. Log in on the server as root or as a user with sudo privileges.
2. Change to your home directory.
$ cd ~
3. If an .ssh directory does not exist, create one.
$ mkdir .ssh
4. Generate a passwordless private key/public key pair.
$ ssh-keygen -q -t rsa -f ~/.ssh/vid_rsa -N ''
This command creates two files: vid_rsa and vid_rsa.pub. The vid_rsa file is the
private key file that you upload to the MC so that it can access nodes on the cluster
and install Vertica. The vid_rsa.pub file is copied to all other hosts so that they can
be accessed by clients using the vid_rsa file.
5. Make your .ssh directory readable and writable only by yourself.
$ chmod 700 /root/.ssh
6. Change to the .ssh directory.
$ cd ~/.ssh
7. Concatenate the public key into the file vauthorized_keys2.
$ cat vid_rsa.pub >> vauthorized_keys2
8. If the host from which you are creating the public key will also be in the cluster, then
copy the public key into the local host's authorized key file:
cat vid_rsa.pub >> authorized_keys2
9. Make the files in your .ssh directory readable and writable only by yourself.
$ chmod 600 ~/.ssh/*
10. Create the .ssh directory on the other nodes.
$ ssh <host> "mkdir /root/.ssh"
11. Copy the vauthorized key file to the other nodes.
$ scp -r /root/.ssh/vauthorized_keys2 <host>:/root/.ssh/.
12. On each node, concatenate the vauthorized_keys2 public key to the authorized_
keys2 file and make the file readable and writable only by the owner.
$ ssh <host> "cd /root/.ssh/;cat vauthorized_keys2 >> authorized_keys2; chmod 600
Vertica Documentation
HPE Vertica Analytics Platform (7.2.x) Page 151 of 5309
/root/.ssh/authorized_keys2"
13. On each node, remove the vauthorized_keys2 file.
$ ssh -i /root/.ssh/vid_rsa <host> "rm /root/.ssh/vauthorized_keys2"
14. Copy the vid_rsa file to the workstation from which you will access the MC cluster
installation wizard. This file is required to install a cluster from the MC.
A complete example of the commands for creating the public key and allowing access to
three hosts from the key is below. The commands are being initiated from the docg01
host, and all hosts will be included in the cluster (docg01 - docg03):
ssh docg01
cd ~/.ssh
ssh-keygen -q -t rsa -f ~/.ssh/vid_rsa -N ''
cat vid_rsa.pub > vauthorized_keys2
cat vid_rsa.pub >> authorized_keys2
chmod 600 ~/.ssh/*
scp -r /root/.ssh/vauthorized_keys2 docg02:/root/.ssh/.
scp -r /root/.ssh/vauthorized_keys2 docg03:/root/.ssh/.
ssh docg02 "cd /root/.ssh/;cat vauthorized_keys2 >> authorized_keys2; chmod 600
/root/.ssh/authorized_keys2"
ssh docg03 "cd /root/.ssh/;cat vauthorized_keys2 >> authorized_keys2; chmod 600
/root/.ssh/authorized_keys2"
ssh -i /root/.ssh/vid_rsa docg02 "rm /root/.ssh/vauthorized_keys2"
ssh -i /root/.ssh/vid_rsa docg03 "rm /root/.ssh/vauthorized_keys2"
rm ~/.ssh/vauthorized_keys2
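Before uploading vid_rsa through the MC wizard, you can verify that key-based access works to each prospective host. For example:
$ ssh -i /root/.ssh/vid_rsa docg02 hostname
If the command prints the remote host name without prompting for a password, the key is set up correctly.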
Use the MC Cluster Installation Wizard
The Cluster Installation Wizard guides you through the steps required to install a Vertica
cluster on hosts that do not already have Vertica software installed.
Note: If you are using MC with the Vertica AMI on Amazon Web Services, note that
the Create Cluster and Import Cluster options are not supported.
Prerequisites
Before you proceed, make sure you:
l Installed and configured MC.
l Prepared the hosts that you will include in the Vertica database cluster.
l Created the private key (pem) file and copied it to your local machine.
l Obtained a copy of your Vertica license if you are installing the Premium Edition. If
you are using the Community Edition, a license key is not required.
l Downloaded the Vertica server RPM (or DEB file).
l Have read/copy permissions on files stored on the local browser host that you will
transfer to the host on which MC is installed.
Permissions on Files to Transfer to MC
On your local workstation, you must have at least read/write privileges on files you'll
upload to MC through the Cluster Installation Wizard. These files include the Vertica
server package, the license key (if needed), the private key file, and an optional CSV file
of IP addresses.
Create a New Vertica Cluster Using MC
1. Connect to Management Console and log in as an MC administrator.
2. On MC's Home page, click the Provision Databases task. The Provisioning dialog
appears.
3. Click Create a new Cluster.
4. The Create Cluster wizard opens. Provide the following information:
a. Cluster name—A label for the cluster
b. Vertica Admin User—The user that is created on each of the nodes when they
are installed, typically 'dbadmin'. This user has access to Vertica and is also an
OS user on the host.
c. Password for the Vertica Admin User—The password you enter (required) is set
for each node when MC installs Vertica.
Note: MC does not support an empty password for the administrative user.
d. Vertica Admin Path—Storage location for catalog files, which defaults to
/home/dbadmin unless you specified a different path during MC configuration (or
later on MC's Settings page).
Important: The Vertica Admin Path must be the same as the Linux database
administrator's home directory. If you specify a path that is not the Linux
dbadmin's home directory, MC returns an error.
5. Click Next and specify the private key file and host information:
a. Click Browse and navigate to the private key file (vid_rsa) that you created
earlier.
Note: You can change the private key file at the beginning of the validation
stage by clicking the name of the private key file in the bottom-left corner of
the page. However, you cannot change the private key file after validation
has begun unless the first host fails validation due to an SSH login error.
b. Include the host IP addresses. You have three options:
Specify later (but include number of nodes). This option allows you to specify the
number of nodes, but not the specific IPs. You can specify the specific IPs before
you validate hosts.
Import IP addresses from local file. You can specify the hosts in a CSV file using
either IP addresses or host names.
Enter a range of IP addresses. You can specify a range of IPs to use for new
nodes. For example 192.168.1.10 to 192.168.1.30. The range of IPs must be on
the same or contiguous subnets.
6. Click Next and select the software and license:
a. Vertica Software. If one or more Vertica packages have been uploaded, you can
select one from the list. Otherwise, select Upload a new local vertica binary file
and browse to a Vertica server file on your local system.
b. Vertica License. Click Browse and navigate to a local copy of your Vertica
license if you are installing the Premium Edition. Community Edition versions
require no license key.
7. Click Next. The Create cluster page opens. If you did not specify the IP addresses,
select each host icon and provide an IP address by entering the IP in the box and
clicking Apply for each host you add.
The hosts are now ready for Host Validation and Cluster Creation.
Validate Hosts and Create the Cluster
Host validation is the process where the MC runs tests against each host in a proposed
cluster.
You can validate hosts only after you have completed the cluster installation wizard.
You must validate hosts before the MC can install Vertica on each host.
At any time during the validation process, but before you create the cluster, you can add
and remove hosts by clicking the appropriate button in the upper left corner of the page
on MC. A Create Cluster button appears when all hosts that appear in the node list are
validated.
How to Validate Hosts
To validate one or more hosts:
1. Connect to Management Console and log in as an MC administrator.
2. On the MC Home page, click the Databases and Clusters task.
3. In the list of databases and clusters, select the cluster on which you have recently
run the cluster installation wizard (Creating... appears under the cluster) and click
View.
4. Validate one or several hosts:
n To validate a single host, click the host icon, then click Validate Host.
n To validate all hosts at the same time, click All in the Node List, then click
Validate Host.
n To validate more than one host, but not all of them, Ctrl+click the host numbers in
the node list, then click Validate Host.
5. Wait while validation proceeds.
The validation step takes several minutes to complete. The tests run in parallel for
each host, so the number of hosts does not necessarily increase the amount of time
it takes to validate all the hosts if you validate them at the same time. Host validation results in one of three possible states:
n Green check mark—The host is valid and can be included in the cluster.
n Orange triangle—The host can be added to the cluster, but warnings were
generated. Click the tests in the host validation window to see details about the
warnings.
n Red X—The host is not valid. Click the tests in the host validation window that
have red X's to see details about the errors. You must correct the errors and re-validate, or remove the host, before MC can create the cluster.
To remove an invalid host: Highlight the host icon or the IP address in the Node
List and click Remove Host.
All hosts must be valid before you can create the cluster. Once all hosts are valid, a
Create Cluster button appears near the top right corner of the page.
How to Create the Cluster
1. Click Create Cluster to install Vertica on each host and assemble the nodes into a
cluster.
The process, done in parallel, takes a few minutes as the software is copied to each
host and installed.
2. Wait for the process to complete. When the Success dialog opens, you can do one
of the following:
n Optionally create a database on the new cluster at this time by clicking Create
Database
n Click Done to create the database at a later time
See Creating a Database on a Cluster for details on creating a database on the new
cluster.
Create a Database on a Cluster
After you use the MC Cluster Installation Wizard to create a Vertica cluster, you can
create a database on that cluster through the MC interface. You can create the database
on all cluster nodes or on a subset of nodes.
If a database had been created using the Administration Tools on any of the nodes, MC
detects (autodiscovers) that database and displays it on the Manage (Cluster
Administration) page so you can import it into the MC interface and begin monitoring it.
MC allows only one database running on a cluster at a time, so you might need to stop a
running database before you can create a new one.
The following procedure describes how to create a database on a cluster that you
created using the MC Cluster Installation Wizard. To create a database on a cluster that
you created by running the install_vertica script, see Creating an Empty Database.
Create a Database on a Cluster
To create a new empty database on a new cluster:
1. If you are already on the Databases and Clusters page, skip to the next step.
Otherwise:
a. Connect to MC and sign in as an MC administrator.
b. On the Home page, click Existing Infrastructure.
2. If no databases exist on the cluster, continue to the next step. Otherwise:
a. If a database is running on the cluster on which you want to add a new database,
select the database and click Stop.
b. Wait for the running database to have a status of Stopped.
3. Click the cluster on which you want to create the new database and click Create
Database.
4. The Create Database wizard opens. Provide the following information:
n Database name and password. See Creating a Database Name and Password
for rules.
n Optionally click Advanced to open the advanced settings and change the port
and catalog, data, and temporary data paths. By default the MC application/web
server port is 5450 and paths are /home/dbadmin, or whatever you defined for
the paths when you ran the cluster creation wizard. Do not use the default agent
port 5444 as a new setting for the MC application/web server port. See MC
Settings > Configuration for port values.
5. Click Continue.
6. Select nodes to include in the database.
The Database Configuration window opens with the options you provided and a
graphical representation of the nodes appears on the page. By default, all nodes are
selected to be part of this database (denoted by a green check mark). You can
optionally click each node and clear Include host in new database to exclude that
node from the database. Excluded nodes are gray. If you change your mind, click
the node and select the Include check box.
7. Click Create in the Database Configuration window to create the database on the
nodes.
The creation process takes a few moments and then the database is started and a
Success message appears.
8. Click OK to close the success message.
The Database Manager page opens and displays the database nodes. Nodes not
included in the database are gray.
After You Install Vertica
The tasks described in this section are optional and are provided for your convenience.
When you have completed this section, proceed to one of the following:
l Using This Guide in Getting Started
l Configuring the Database in the Administrator's Guide
Install the License Key
If you did not supply the -L parameter during setup, or if you did not bypass the -L
parameter for a silent install, the first time you log in as the Database Administrator and
run the Vertica Administration Tools or Management Console, Vertica requires you to
install a license key.
Follow the instructions in Managing Licenses in the Administrator's Guide.
Optionally Install vsql Client Application on Non-Cluster
Hosts
You can use the Vertica vsql executable image on a non-cluster Linux host to connect to
a Vertica database.
l On Red Hat, CentOS, and SUSE systems, you can install the client driver RPM,
which includes the vsql executable. See Installing the Client RPM on Red Hat and
SUSE for details.
l If the non-cluster host is running the same version of Linux as the cluster, copy the
image file to the remote system. For example:
$ scp host01:/opt/vertica/bin/vsql .
$ ./vsql
l If the non-cluster host is running a different version of Linux than your cluster hosts,
and that operating system is not Red Hat version 5 64-bit or SUSE 10/11 64-bit, you
must install the Vertica server RPM in order to get vsql. Download the appropriate
rpm package from the Download tab of the myVertica portal then log into the non-
cluster host as root and install the rpm package using the command:
# rpm -Uvh filename
In the above command, filename is the package you downloaded. Note that you do
not have to run the install_vertica script on the non-cluster host in order to use
vsql.
Notes
l Use the same Command-Line Options that you would on a cluster host.
l You cannot run vsql on a Cygwin bash shell (Windows). Use ssh to connect to a
cluster host, then run vsql.
vsql is also available for additional platforms. See Installing the vsql Client.
Install Vertica Documentation
The latest documentation for your Vertica release is available at
http://my.vertica.com/docs. After you install Vertica, you can optionally install the
documentation on your database server and client systems.
Installing the Vertica Documentation
To install a local copy of the documentation:
1. Open a Web browser and go to http://my.vertica.com/docs.
2. Scroll down to Install documentation locally and save the Vertica documentation
package (.tar.gz or .zip) to your system; for example, to /tmp.
3. Extract the contents using your preferred unzipping application.
4. The home page for the HTML documentation is located at /HTML/index.htm in the
extracted folder.
Installing Client Drivers
After you install Vertica, install drivers on the client systems from which you plan to
access your databases. HPE supplies drivers for ADO.NET, JDBC, ODBC, OLE DB,
Perl, and Python. For instructions on installing these drivers, see Client Drivers in
Connecting to Vertica.
Creating a Database
To get started using Vertica immediately after installation, create a database. You can
use either the Administration Tools or the Management Console.
Creating a Database Using the Administration Tools
Follow these steps to create a database using the Administration Tools.
1. Log in as the database administrator, and type admintools to bring up the
Administration Tools.
2. When the EULA (end-user license agreement) window opens, type accept to
proceed. A window displays, requesting the location of the license key file you
downloaded from the HPE Web site. The default path is /tmp/vlicense.dat.
n If you are using the Vertica Community Edition, click OK without entering a license
key.
n If you are using the Vertica Premium Edition, type the absolute path to your license
key (for example, /tmp/vlicense.dat) and click OK.
3. From the Administration Tools Main Menu, click Configuration Menu, and then
click OK.
4. Click Create Database, and click OK to start the database creation wizard.
To create a database using MC, refer to Creating a Database Using MC.
See Also
l Using the Vertica Interfaces
Upgrading Vertica
Follow the steps in this section to:
l Upgrade Vertica to a new version.
l Upgrade Management Console.
l Upgrade the client authentication records to the new format.
Upgrading Vertica to a New Version
Requirement Testing
The Version 7.0 installer introduces platform verification tests that prevent the install
from continuing if the platform requirements are not met by your system. Manually verify
that your system meets the requirements in Before You Install Vertica before you update
the server package on your systems. These tests ensure that your platform meets the
hardware and software requirements for Vertica. Previous versions documented these
requirements, but the installer did not verify all of the settings.
Version 7.0 introduces the installation parameter --failure-threshold that allows
you to change the level at which the installer stops the installation process based on the
severity of the failed test. By default, the installer stops on all warnings. You can change
the failure threshold to FAIL to bypass all warnings and only stop on failures. However,
your platform is unsupported until you correct all warnings generated by the installer. By
changing the failure threshold you are able to immediately upgrade and bring up your
Vertica database, but performance cannot be guaranteed until you correct the warnings.
Transaction Catalog Storage
When upgrading from 5.x to a later version of Vertica, due to a change in how
transaction catalog storage works in Vertica 6.0 and later, the amount of space that the
transaction catalog takes up can increase significantly during and after the upgrade.
Verify that you have free space equal to at least four times the size of the Catalog
folder (in addition to normal free space requirements) on your nodes prior to
upgrading.
To determine the amount of space the Catalog folder is using, run du -h on the Catalog
folder. Do not run du -h on the entire catalog directory; run it specifically on the
Catalog folder within it.
For example:
$ du -h /home/dbadmin/db/v_db_node0001_catalog/Catalog/
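As a quick sanity check before upgrading, the following sketch (reusing the example path above) compares four times the Catalog folder's size against the free space on its file system:
$ CATALOG=/home/dbadmin/db/v_db_node0001_catalog/Catalog
$ NEEDED_KB=$(( $(du -sk "$CATALOG" | cut -f1) * 4 ))
$ FREE_KB=$(df -Pk "$CATALOG" | awk 'NR==2 {print $4}')
$ echo "need ${NEEDED_KB} KB free, have ${FREE_KB} KB"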
Configuration Parameter Storage
As of version 7.1.x, parameter configurations are now stored in the catalog, rather than
in individual vertica.conf files at the node level. If you want to view node-specific
settings prior to upgrading, you can query the CONFIGURATION_PARAMETERS system
table on each node to view parameter values.
When you upgrade to 7.1, Vertica performs the following steps:
1. Backs up current vertica.conf files to vertica-orig.conf files.
2. Chooses the most up-to-date node's configuration parameter settings as the
database-level settings.
3. Stores new database-level values in the catalog.
4. Checks whether the values in all the nodes' vertica.conf files match the
database-level values. If not, Vertica rewrites that node's vertica.conf file to
match database level settings. The previous settings can still be referenced in each
node's vertica-orig.conf files.
If you previously made manual changes to individual vertica.conf files, you can re-
set those node-specific settings using ALTER NODE after you upgrade. You will still be
able to reference the previous values in the vertica-orig.conf files.
Important: Once you upgrade to 7.1, do not hand edit any vertica.conf files.
Additionally, do not use any workarounds for syncing vertica.conf files.
Uninstall HDFS Connector
As of version 7.2, the HDFS Connector is installed automatically. If you have previously
downloaded and installed this connector, uninstall it before you upgrade to this release
of Vertica.
Managing Directed Queries
Directed queries preserve current query plans before a scheduled upgrade. In most
instances, queries perform more efficiently after a Vertica upgrade. In the few cases
where this is not so, you can use directed queries that you created before upgrading to
recreate query plans from the earlier version.
See Directed Queries for more information.
Upgrading Vertica
Follow these steps to upgrade your database. Note that upgrades are incremental and
must follow one of the following upgrade paths:
l Vertica 3.5 to 4.0
l Vertica 4.0 to 4.1
l Vertica 4.1 to 5.0
l Vertica 4.1 to 5.1
l Vertica 5.0 to 5.1
l Vertica 5.0 to 6.0
l Vertica 5.1 to 6.0
l Vertica 6.0 to 6.1
l Vertica 6.1 to 7.0. If you have enabled LDAP over SSL/TLS, read Configuring LDAP
Over SSL/TLS When Upgrading Vertica before upgrading.
l Vertica 7.0 to 7.1
l Vertica 7.1 to 7.2
Important: Hewlett Packard Enterprise strongly recommends that you follow the
upgrade paths. Be sure to read the Release Notes and New Features for each
version you skip. The Vertica documentation is available in the rpm, as well as at
http://my.vertica.com/docs (which also provides access to previous versions of the
documentation).
1. Back up your existing database. This is a precautionary measure so that you can
restore from the backup if the upgrade is unsuccessful.
2. Stop the database using admintools if it is not already stopped. See Stopping a
Database.
3. On each host on which you have an additional package installed, such as the R
Language Pack, uninstall the package. For example: rpm -e vertica-R-lang.
Important: If you fail to uninstall Vertica packages prior to upgrading the server
package, then the server package fails to install due to dependencies on the
earlier version of the package.
4. On any host in the cluster, as root or sudo, install the new Vertica Server RPM or
DEB. See Download and Install the Vertica Server Package.
For example:
rpm syntax:
# rpm -Uvh /home/dbadmin/vertica_7.2.x.x86_64.RHEL6.rpm
deb syntax:
# dpkg -i /home/dbadmin/vertica-amd64.deb
Important: If you fail to install the rpm or deb prior to running the next step, then
update_vertica fails with an error due to the conflict between the version of
the update_vertica script and the version of the rpm argument.
5. As root or sudo, run update_vertica. Use the same options that you used
when you last installed or upgraded the database. However, do not use the
--hosts/-s host_list parameter; the upgrade script automatically determines
the hosts in the cluster.
If you forgot the options that were last used, open
/opt/vertica/config/admintools.conf in a text editor and find the line that
starts with install_opts. This line details each option. It is important to use the
same options as before; omitting any previously used option causes it to revert to its
default setting when the upgrade script runs. Also, if
you use different options than originally used, then the update script reconfigures
the cluster to use the new options, which can cause issues with your existing
database.
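Rather than opening the file in an editor, you can also print that line directly:
$ grep '^install_opts' /opt/vertica/config/admintools.conf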
Installing Vertica with the install_vertica Script provides details on all options
available to the update_vertica script. update_vertica uses the same options
as install_vertica. For example:
RPM package:
# /opt/vertica/sbin/update_vertica --rpm /home/dbadmin/vertica_7.2.x.x86_64.RHEL6.rpm --dba-
user mydba
Debian package:
# /opt/vertica/sbin/update_vertica --deb /home/dbadmin/vertica-amd64.deb --dba-user mydba
Important: The rpm/deb file must be readable by the dbadmin user when
upgrading. Some upgrade scripts are run as the dbadmin user, and that user
must be able to read the rpm/deb file.
6. Start the database. The start-up scripts analyze the database and perform any
necessary data and catalog updates for the new version.
Note: If Vertica issues a warning stating that one or more packages cannot be
installed, run the admintools --force-reinstall option to force reinstallation
of the packages. Refer to Upgrading and Reinstalling Packages for more
information.
7. Perform another backup. When moving from Version 5.0 and earlier to Version 5.1
and later, the backup process changes from using backup.sh to using vbr. You
cannot use an incremental backup between these different versions of backup
scripts. Create a full backup the first time you move to using vbr, and optionally use
incremental backups as you continue to upgrade. However, Vertica recommends
doing full backups each time if disk space and time allow.
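For example, a full backup with vbr might look like the following. The configuration file is one you create beforehand; its name and path here are placeholders:
$ /opt/vertica/bin/vbr.py --task backup --config-file /home/dbadmin/full_backup.ini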
8. Continue along the upgrade path and perform these same steps for each version in
your upgrade path.
9. After you have upgraded to the latest version of the server, install any additional
packs you previously removed. See the pack install/upgrade instructions for details
on upgrading the packs. For R, see Installing/Upgrading the R Language Pack for
Vertica.
10. To upgrade the Management Console, see: Upgrading Management Console.
Notes
l Release 5.1 introduced a new backup utility, vbr. This utility replaced both the
backup.sh and restore.sh scripts, making both obsolete. Any backups created
with backup.sh are incompatible with backups created with vbr. Vertica
recommends that you use the current utility vbr as soon as possible after
successfully upgrading from a version prior to Release 5.1 to Release 5.1 or later.
Documentation for the 5.0 scripts remained in the 5.1 documentation. However, the
topics were marked Obsolete in that version and were removed from later versions of
the documentation.
l Downgrade installations are not supported.
Configuring LDAP Over SSL/TLS When Upgrading Vertica
In Vertica 7.0, certificate authentication for LDAP over SSL/TLS is more secure than
in previous releases. Before you upgrade to Vertica 7.0, you must perform several
tasks to connect to the LDAP server after the upgrade.
When using SSL/TLS and upgrading to 7.1, note that the SSLCertificate and
SSLPrivateKey parameters are automatically set by Admintools if you set EnableSSL=1
in the previous version.
This section describes the steps you should follow when setting up secure LDAP
authentication on a new installation of Vertica 7.0. The section also includes the
procedure to follow if you choose to revert to the more permissive behavior used in
Vertica 6.1.
l Using Vertica 7.0 Secure LDAP Authentication
l Using Vertica 6.1 Secure LDAP Authentication
Using Vertica 7.0 Secure LDAP Authentication
If you are a new customer installing Vertica 7.0 and you want to use LDAP over
SSL/TLS, take the following steps on all cluster nodes. You must perform these steps to
configure LDAP authentication:
1. If necessary, modify the LDAP authentication record in your vertica.conf file to
point to the correct server.
2. As the root user, if necessary, create an ldap.conf file and add the following
settings. The TLS_REQCERT option is required. You must include either the TLS_
CACERT or TLS_CADIR option.
TLS_REQCERT hard
TLS_CACERT = /<certificate_path>/CA-cert-bundle.crt
or
TLS_CADIR = <certificate_path>
The options for TLS_REQCERT are:
n hard: If the client does not provide a certificate or provides an invalid certificate,
they cannot connect. This is the default behavior.
n never: The client does not request or check a certificate.
n allow: If the client does not provide a certificate or provides an invalid
certificate, they can connect anyway.
n try: If the client does not provide a certificate, the client can connect. If the client
provides an invalid certificate, they cannot connect.
TLS_CACERT specifies the path to the file that contains the certificates.
TLS_CADIR specifies the path to the directory that contains the certificates.
3. Store the ldap.conf file in a location that is readable by DBADMIN. The
DBADMIN must be able to access the ldap.conf file and all path names specified
in the ldap.conf file on all cluster nodes.
4. Set the Linux LDAPCONF environment variable to point to this ldap.conf file.
Make sure this environment variable is set before you start the Vertica software or
you create a database. To ensure that this happens, add a command to the
DBADMIN's profile to set LDAPCONF to point to the ldap.conf file every time you
start the database.
If you start the database using a script like a startup or init file, add steps to the script
that set the LDAPCONF variable to point to the ldap.conf file.
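For example, you might append a line like the following to the DBADMIN's ~/.profile (the ldap.conf location is a placeholder):
export LDAPCONF=/home/dbadmin/ldap.conf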
5. Test that LDAP authentication works with and without SSL/TLS. You can use the
ldapsearch tool for this.
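For example, the following searches (server name, base DN, and filter are placeholders) exercise both a plain and a TLS-secured connection:
$ ldapsearch -x -H ldap://ldap.example.com -b "dc=example,dc=com" "(uid=dbadmin)"
$ ldapsearch -x -H ldaps://ldap.example.com -b "dc=example,dc=com" "(uid=dbadmin)"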
6. Repeat steps 1–5 for all cluster nodes.
Using Vertica 6.1 Secure LDAP Authentication
If you have LDAP enabled over SSL/TLS and you want to use the more permissive
LDAP settings used in Vertica 6.1, perform the following tasks on all cluster
nodes. These settings allow Vertica to connect to the LDAP server, even if
authentication fails. You must perform these tasks before you upgrade to Vertica 7.0 and
you must perform them on all cluster nodes:
1. If necessary, modify the LDAP authentication record in your vertica.conf file to
point to the correct server.
2. As the root user, create or modify the ldap.conf file and make the following
changes to ldap.conf:
TLS_REQCERT allow
n hard: If you do not provide a certificate or you provide an invalid certificate, you
cannot connect. This is the default.
n never: The client will not request or check a certificate.
n allow: If you do not provide a certificate, or you provide an invalid
certificate, you can connect anyway. This is consistent with the behavior in Vertica 6.1.
n try: If you do not provide a certificate, you can connect. If you provide an invalid
certificate, you cannot connect.
3. Store the ldap.conf file in a location that is readable by DBADMIN. The
DBADMIN must be able to access the ldap.conf file and all path names specified
in the ldap.conf file on all cluster nodes.
4. Set the Linux LDAPCONF environment variable to point to this ldap.conf file.
Make sure this environment variable is set before you start the Vertica software or
you create a database. To ensure that this happens, add a command to the
DBADMIN's Linux profile to set LDAPCONF to point to the ldap.conf file every time
you log in.
5. If you start the database using a script like a startup or init file, add steps to the script
that set the LDAPCONF variable to point to the ldap.conf file.
6. Test that LDAP authentication works with and without SSL/TLS. You can use the
ldapsearch tool for this.
7. Repeat steps 1–5 for all cluster nodes.
Upgrading and Reinstalling Packages
In most scenarios, your Vertica packages are automatically reinstalled when you
upgrade Vertica and restart your database. When Vertica cannot reinstall a package, it
issues a message letting you know that the package reinstallation has failed. You can
then force the package to reinstall by issuing the admintools install_package
command with the --force-reinstall option.
$ admintools -t install_package -d <database_name> -p <password> -P <package> --force-reinstall
This command prompts Vertica to repair all of the packages that did not reinstall
correctly when you restarted your database after the upgrade.
Scenario Requiring Package Reinstallation
You must perform reinstallation when a sequence of events, similar to the following,
occurs:
1. You successfully upgrade Vertica.
2. When you restart your database, you enter an incorrect password.
3. Your database starts, but Vertica issues a message letting you know that one or
more packages could not be reinstalled.
To respond to this scenario:
Use the admintools install_package command and include the --force-reinstall
option. This option forces reinstallation of the packages that failed to reinstall.
admintools Command-Line Options for install_package
When you invoke the admintools install_package command, you have several
options available.
$ admintools -t install_package -h
Command-Line Options:
Option Function
-h
--help
Show this help message and exit.
-d DBNAME
--dbname=DBNAME
Name of database.
-p PASSWORD
--password=PASSWORD
Database administrator password.
-P PACKAGE
--package=PACKAGE
Specify package name.
Valid values:
l all — Reinstall all available packages.
l default — Reinstall only those default packages that are
currently installed.
--force-reinstall
Force a package to be reinstalled, even if it is already installed.
Examples
The following example shows how you can use the admintools install_package
command with the --force-reinstall option to force the reinstallation of default
packages.
$ admintools -t install_package -d VMart -p 'vertica' --package default --force-reinstall
This example shows how you can force the reinstallation of one specific package,
flextable.
$ admintools -t install_package -d VMart -p 'vertica' --package flextable --force-reinstall
This example shows you how to list all the packages on your system.
$ admintools -t list_packages
This example shows you how to uninstall one or more packages. The example
uninstalls the package AWS.
$ admintools -t uninstall_package -d VMart -p 'vertica' --package AWS
Upgrading Management Console
If you are moving from Management Console (MC) 6.1.1 to MC 6.1.2, you can install MC
on any Vertica cluster node. This scenario requires a fresh install because HPE does
not provide scripts to migrate metadata (MC users and settings) established in earlier
releases from your existing server to the cluster node. See Installing and Configuring
Management Console.
After you install and configure MC, you must recreate any MC users you created for
your 6.1.1 MC instance and apply previous MC settings to the new MC version.
Tip: You can export MC-managed database messages and user activity to a
location on the existing server. While you can't import this data, using the exported
files as a reference could help make metadata recreation easier. See Exporting MC-
managed Database Messages and Logs and Exporting the User Audit Log.
If You Keep MC on the Existing Server
If you want to keep MC on the same server (such as on the dedicated server that had
been required in previous MC releases), your MC metadata is retained when you run
the vertica-console installation script.
If you upgrade from Vertica 7.2.0 to 7.2.1, the console.properties file retains the
previous default number of displayed messages. Vertica does not overwrite the
console.properties file during upgrades. To improve performance, update your
console.properties file to set messageCenter.maxEntries from 1000 to a lower value.
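For example, the following edits the property in place. The file location shown is an assumption; verify where console.properties resides on your MC server:
# sed -i 's/^messageCenter.maxEntries=.*/messageCenter.maxEntries=500/' /opt/vconsole/config/console.properties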
Before You Upgrade MC on the Same Server
1. Log in as root or a user with sudo privileges on the server where MC is already
installed.
2. Open a terminal window and shut down the MC process using the following
command:
# /etc/init.d/vertica-consoled stop
For versions of Red Hat 7/CentOS 7 and above, use:
# systemctl stop vertica-consoled
3. Back up MC to preserve configuration metadata. See Backing Up MC .
Upgrade MC on the Same Server
Important: Upgrading MC from version 7.1.2-5 or below on a Vertica host first
requires stopping the database if MC was installed on an Ubuntu or Debian platform.
MC upgrades from version 7.1.2-6 and up do not require stopping the database.
1. Download the MC package (vertica-console-<current-version>.<Linux-
distro>) from the myVertica portal and save it to a location on the target server, such
as /tmp.
2. On the target server, log in as root or a user with sudo privileges.
3. Change directory to the location where you saved the MC package.
4. Install MC using your local Linux distribution package management system (for
example, rpm, yum, zypper, apt, or dpkg).
The following command is a generic example for Red Hat 6:
# rpm -Uvh vertica-console-<current-version>.x86_64.RHEL6.rpm
The following command is a generic example for Debian and Ubuntu:
# dpkg -i vertica-console-<current-version>.deb
5. If you have stopped the database before upgrading MC, start the database again.
As the root user, use the following command:
/etc/init.d/verticad start
6. Open a browser and enter the IP address or host name of the server on which you
installed MC, as well as the default MC port 5450.
For example, you'll enter one of:
https://xx.xx.xx.xx:5450/
https://hostname:5450/
7. When the Configuration Wizard dialog box appears, proceed to Configuring MC.
Upgrading Client Authentication in Vertica
Vertica 7.1.0 changed the storage location for the client authentication records from the
vertica.conf file to the database catalog. When you upgrade to Vertica 7.1.1, the
client authentication records in the vertica.conf file are converted and inserted into
the database catalog. Vertica updates the catalog information on all nodes in the cluster.
Authentication is not enabled after upgrading. As a result, all users can connect to the
database. However, if they have a password, they must enter it.
Take the following steps to make sure that client authentication is configured correctly
and enabled for use with a running database:
1. Review the client authentication methods that Vertica created during the upgrade.
The following system tables contain information about those methods:
n CLIENT_AUTH—Contains information about the client authentication methods that
Vertica created for your database during the upgrade.
n CLIENT_AUTH_PARAMS—Contains information about the parameters that Vertica
defined for the GSS, Ident, and LDAP authentication methods.
n USER_CLIENT_AUTH—Contains information about which authentication methods
are associated with which database users. You associate an authentication
method with a user with the GRANT (Authentication) statement.
2. Review the vertica.log file to see which authentication records Vertica was not
able to create during the upgrade.
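For example, using the catalog path from the earlier examples:
$ grep -i authentication /home/dbadmin/db/v_db_node0001_catalog/vertica.log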
3. Create any required new records using CREATE AUTHENTICATION.
4. After the upgrade, enable all the defined authentication methods. You need to enter
an ALTER AUTHENTICATION statement for each method as follows:
=> ALTER AUTHENTICATION auth_method_name ENABLE;
5. If you are using LDAP over SSL/TLS, you must define the new parameters:
n tls_reqcert
n tls_cacert
To do so, use ALTER AUTHENTICATION as follows:
=> ALTER AUTHENTICATION Ldap1 SET host='ldaps://abc.dc.com', binddn_prefix='CN=',
binddn_suffix=',OU=Unit2,DC=dc,DC=com', basedn='dc=DC,dc=com',
tls_cacert='/home/dc.com.ca.cer', starttls='hard', tls_reqcert='never';
6. Create an authentication method (LOCAL TRUST or LOCAL PASSWORD) with a
very high priority, such as 10,000. Grant this method to the DBADMIN user, and set
the priority using ALTER AUTHENTICATION. For example:
=> CREATE AUTHENTICATION dbadmin_default TRUST LOCAL;
=> ALTER AUTHENTICATION dbadmin_default PRIORITY 10000;
With the high priority, this new authentication method supersedes any
authentication methods you create for PUBLIC. Even if you make changes to
PUBLIC authentication methods, the DBADMIN user can connect to the database at
any time.
Uninstalling Vertica
You can uninstall Vertica and Management Console by running commands at the
command line.
Uninstalling Vertica
To uninstall Vertica:
1. For each host in the cluster, do the following:
a. Choose a host machine and log in as root (or log in as another user and switch
to root).
$ su - root
password: root-password
b. Find the name of the package that is installed:
# rpm -qa | grep vertica
For deb packages:
# dpkg -l | grep vertica
c. Remove the package:
# rpm -e package
For deb packages:
# dpkg -r package
Note: If you want to delete the configuration file used with your installation, you
can choose to delete the /opt/vertica/ directory and all subdirectories using
this command: # rm -rf /opt/vertica/
2. For each client system, do the following:
a. Delete the JDBC driver jar file.
b. Delete ODBC driver data source names.
c. Delete the ODBC driver software by doing the following:
i. In Windows, go to Start > Control Panel > Add or Remove Programs.
ii. Locate Vertica.
iii. Click Remove.
Uninstalling MC
The uninstall command shuts down Management Console and removes most of the
files that the MC installation script installed.
Uninstall MC
1. Log in to the target server as root.
2. Stop Management Console:
# /etc/init.d/vertica-consoled stop
For versions of Red Hat 7/CentOS 7 and above, use:
# systemctl stop vertica-consoled
3. Look for previously-installed versions of MC and note the version:
# rpm -qa | grep vertica
4. Remove the package:
# rpm -e <vertica-console>
Note: If you want to delete the MC directory and all subdirectories, use the following
command: # rm -rf /opt/vconsole
If You Want to Reinstall MC
To re-install MC, see Installing and Configuring Management Console.
Troubleshooting the Vertica Install
The tasks described in this section are performed automatically by the install_
vertica script and are described in Installing Vertica. If you did not encounter any
installation problems, proceed to the Administrator's Guide for instructions on how to
configure and operate a database.
Validation Scripts
Vertica provides several validation utilities that can be used prior to deploying Vertica to
help determine if your hosts and network can properly handle the processing and
network traffic required by Vertica. These utilities can also be used if you are
encountering performance issues and need to troubleshoot them.
After you install the Vertica RPM, you have access to the following scripts in
/opt/vertica/bin:
l Vcpuperf - a CPU performance test.
l Vioperf - an input/output test used to verify the speed and consistency of your hard
drives.
l Vnetperf - a network test used to measure the latency and throughput of your network
between hosts.
These utilities can be run at any time, but are well suited to use before running the
install_vertica script.
Vcpuperf
The vcpuperf utility measures your server's CPU processing speed and compares it
against benchmarks for common server CPUs. The utility performs a CPU test and
measures the time it takes to complete the test. The lower the number scored on the test,
the better the performance of the CPU.
The vcpuperf utility also checks the high and low load times to determine if CPU
throttling is enabled. If a server's low-load computation time is significantly longer than
the high-load computation time, CPU throttling may be enabled. CPU throttling is a
power-saving feature. However, CPU throttling can reduce the performance of your
server. Vertica recommends disabling CPU throttling to enhance server performance.
Syntax
vcpuperf [-q]
Option
Option Description
-q
Run in quiet mode. Quiet mode displays only the CPU Time, Real Time, and
high and low load times.
Returns
l CPU Time: the amount of time it took the CPU to run the test.
l Real Time: the total time for the test to execute.
l High load time: The amount of time to run the load test while simulating a high CPU
load.
l Low load time: The amount of time to run the load test while simulating a low CPU
load.
Example
The following example shows a CPU that is running slightly slower than the expected
time on a Xeon 5670 CPU that has CPU throttling enabled.
[root@node1 bin]# /opt/vertica/bin/vcpuperf
Compiled with: 4.1.2 20080704 (Red Hat 4.1.2-52)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s
This machine's time:
CPU Time: 8.540000s
Real Time:8.710000s
Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the corresponding high load time
indicate low-load throttling, which can adversely affect small query / concurrent performance.
This machine's high load time: 67 microseconds.
This machine's low load time: 208 microseconds.
Vioperf
The vioperf utility quickly tests the performance of your host's input and output
subsystem. The utility performs the following tests:
l sequential write
l sequential rewrite
l sequential read
l skip read (read non-contiguous data blocks)
The utility verifies that the host reads the same bytes that it wrote and prints its output to
STDOUT. The utility also logs the output to a JSON formatted file.
Syntax
vioperf [--help] [--duration=<INTERVAL>] [--log-interval=<INTERVAL>] [--log-file=<FILE>] [--condense-
log] [<DIR>*]
Minimum and Recommended IO Performance
l The minimum required I/O is 20 MB/s read/write per physical processor core on each
node, in full duplex (reading and writing) simultaneously, concurrently on all nodes of
the cluster.
l The recommended I/O is 40 MB/s per physical core on each node.
For example, the I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical
cores) is 240 MB/s required minimum, 480 MB/s recommended.
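To apply this rule of thumb to your own hardware, the following sketch counts physical cores (unique core/socket pairs) and prints both targets:
$ CORES=$(lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l)
$ echo "minimum: $((CORES * 20)) MB/s, recommended: $((CORES * 40)) MB/s"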
Options
Option Description
--help
Prints a help message and exits.
--duration
The length of time vioperf runs performance tests. The default is 5
minutes. Specify the interval in seconds, minutes, or hours with any of
these suffixes:
l Seconds: s, sec, secs, second, seconds. Example: --
duration=60sec
l Minutes: m, min, mins, minute, minutes. Example: --
duration=10min
l Hours: h, hr, hrs, hour, hours. Example: --duration=1hrs
--log-interval
The interval at which the log file reports summary information. The
default interval is 10 seconds. This option uses the same interval
notation as --duration.
--log-file
The path and name where log file contents are written, in JSON. If not
specified, then vioperf creates a file named results<date-time>.JSON
in the current directory.
--condense-log
Directs vioperf to write the log file contents in condensed format, one
JSON entry per line, rather than as indented JSON syntax.
Option Description
<DIR>
Zero or more directories to test. If you do not specify a directory,
vioperf tests the current directory. To test the performance of each
disk, specify different directories mounted on different disks.
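For example, the following run tests two data disks for ten minutes and writes the JSON log to /tmp (the mount points are placeholders):
$ /opt/vertica/bin/vioperf --duration=10min --log-file=/tmp/vioperf.json /data1 /data2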
Returns
The utility returns the following information:
Heading Description
test
The test being run (Write, ReWrite, Read, or Skip Read)
directory The directory in which the test is being run.
counter name
The counter type of the test being run. Can be either
MB/s or Seeks per second.
counter value
The value of the counter in MB/s or Seeks per second
across all threads. This measurement represents the
bandwidth at the exact time of measurement. Contrast
with counter value (avg).
counter value (10 sec avg)
The average amount of data in MB/s, or the average
number of Seeks per second, for the test being run in the
duration specified with --log-interval. The default
interval is 10 seconds. The counter value (avg) is
the average bandwidth since the last log message,
across all threads.
counter value/core
The counter value divided by the number of cores.
counter value/core (10 sec avg)
The counter value (10 sec avg) divided by the
number of cores.
thread count
The number of threads used to run the test.
%CPU
The available CPU percentage used during this test.
%IO Wait
The CPU percentage in I/O Wait state during this test. I/O
wait state is the time working processes are blocked
while waiting for I/O operations to complete.
Heading Description
elapsed time
The amount of time taken for a particular test. If you run
the test multiple times, elapsed time increases the next
time the test is run.
remaining time
The time remaining until the next test. Based on the --
duration option, each of the tests is run at least once. If
the test set is run multiple times, then remaining time
is how much longer the test will run. The remaining
time value is cumulative. Its total is added to elapsed
time each time the same test is run again.
Example
Invoking vioperf from a terminal outputs the following message and sample results:
[dbadmin@node01 ~]$ /opt/vertica/bin/vioperf --duration=60s
The minimum required I/O is 20 MB/s read and write per physical processor core on each node, in
full duplex
i.e. reading and writing at this rate simultaneously, concurrently on all nodes of the cluster.
The recommended I/O is 40 MB/s per physical core on each node.
For example, the I/O rate for a server node with 2 hyper-threaded six-core CPUs is 240 MB/s
required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/home/dbadmin"
test | directory | counter name | counter value | counter value (10 sec avg) | counter value/core | counter value/core (10 sec avg) | thread count | %CPU | %IO Wait | elapsed time (s) | remaining time (s)
Write | /home/dbadmin | MB/s | 420 | 420 | 210 | 210 | 2 | 89 | 10 | 10 | 5
Write | /home/dbadmin | MB/s | 412 | 396 | 206 | 198 | 2 | 89 | 9 | 15 | 0
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 150+150 | 150+150 | 75+75 | 75+75 | 2 | 58 | 40 | 10 | 5
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 158+158 | 172+172 | 79+79 | 86+86 | 2 | 64 | 33 | 15 | 0
Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5
Read | /home/dbadmin | MB/s | 192 | 190 | 96 | 95 | 2 | 71 | 27 | 15 | 0
SkipRead | /home/dbadmin | seeks/s | 659 | 659 | 329.5 | 329.5 | 2 | 2 | 85 | 10 | 5
SkipRead | /home/dbadmin | seeks/s | 677 | 714 | 338.5 | 357 | 2 | 2 | 59 | 15 | 0
Note: When evaluating performance for minimum and recommended I/O, include
the Write and Read values in your evaluation. ReWrite and SkipRead values are not
relevant to determining minimum and recommended I/O.
Vnetperf
The vnetperf utility allows you to measure the network performance of your hosts. It can
measure network latency and the throughput for both the TCP and UDP protocols.
Important: This utility introduces a high network load and must not be used on a
running Vertica cluster; otherwise, database performance is degraded.
Using this utility you can detect:
l low throughput for all hosts or a particular host
l high latency for all hosts or a particular host
l bottlenecks between one or more hosts or subnets
l a limit that is too low on the number of TCP connections that can be established
simultaneously
l a high rate of packet loss on the network
The latency test measures the latency from the host running the script to the other hosts.
Any host that has a particularly high latency should be investigated further.
The throughput tests measure both UDP and TCP throughput. You can specify a rate
limit in MB/s to use for these tests, or allow the utility to loop through a range of
rates.
Syntax
vnetperf [options] [tests]
Recommended Network Performance
l The maximum recommended RTT (round-trip time) latency is 1000 microseconds.
The ideal RTT latency is 200 microseconds or less. Vertica recommends that clock
skew be kept to under 1 second.
l The minimum recommended throughput is 100MB/s. Ideal throughput is 800 MB/s or
more.
Note: UDP numbers may be lower; multiple network switches may reduce
performance results.
Options
Option Description
--condense
Condense the log into one JSON entry per line, instead of
indented JSON syntax.
--collect-logs
Collect the test log files from each host.
--datarate rate
Limit the throughput to this rate in MB/s. A rate of 0 loops the
tests through several different rates. The default is 0.
--duration seconds
The time limit for each test to run in seconds. The default is
1.
--hosts host1,host2,...
A comma-separated list of hosts on which to run the tests.
Do not use spaces between the commas and the host
names.
--hosts file
A hosts file that specifies the hosts on which to run the tests.
If the --hosts argument is not used, then the utility attempts to
access admintools and determine the hosts in the cluster.
--identity-file file
If using passwordless SSH/SCP access between the hosts,
then specify the key file used to gain access to the hosts.
--ignore-bad-hosts
If set, run the tests on the reachable hosts even if some hosts
are not reachable. If not set, and a host is unreachable, then
no tests are run on any hosts.
Option Description
--log-dir directory
If --collect-logs is set, the directory in which to place the
collected logs. The default directory is named
logs.netperf.<timestamp>
--log-level LEVEL
The log level to use. Possible values are: INFO, ERROR,
DEBUG, and WARN. The default is WARN.
--list-tests
Lists the tests that can be run by this utility.
--output-file file
The file that JSON results are written to. The default is
results.<timestamp>.json.
--ports port1,port2,port3
The port numbers to use. If only one is specified then the
next two numbers in sequence are also used. The default
ports are 14159, 14160, and 14161.
--scp-options 'options'
Using this argument, you can specify one or more standard
SCP command line arguments enclosed in single quotes.
SCP is used to copy test binaries over to the target hosts.
--ssh-options 'options'
Using this argument, you can specify one or more standard
SSH command line arguments enclosed in single quotes.
SSH is used to issue test commands on the target hosts.
--vertica-install directory
If specified, then the utility assumes Vertica is installed on
each of the hosts and uses the test binaries on the target
system rather than copying them over using SCP.
Tests
Note: If the tests argument is omitted then all tests are run.
Test Description
latency
Test the latency to each of the hosts.
tcp-throughput
Test the TCP throughput amongst the hosts.
udp-throughput
Test the UDP throughput amongst the hosts.
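For example, to run only the latency test against two hosts (the addresses are placeholders):
$ /opt/vertica/bin/vnetperf --hosts 10.20.100.66,10.20.100.67 latency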
Returns
For each host it returns the following:
Latency test returns:
l The Round Trip Time (rtt) latency for each host in milliseconds.
l Clock Skew: the difference in time shown by the clock on the target host relative to
the host running the utility.
UDP and TCP throughput tests return:
l The date/time and test name.
l The rate limit in MB/s.
l The node being tested.
l Sent and Received data in MB/s and bytes.
l The duration of the test in seconds.
Example
/opt/vertica/bin/vnetperf --condense --hosts 10.20.100.66,10.20.100.67 --identity-file
'/root/.ssh/vid_rsa'
Enable Secure Shell (SSH) Logins
The administrative account must be able to use Secure Shell (SSH) to log in (ssh) to all
hosts without specifying a password. The shell script install_vertica does this
automatically. This section describes how to do it manually if necessary.
1. If you do not already have SSH installed on all hosts, log in as root on each host
and install it now. You can download a free version of the SSH connectivity tools
from OpenSSH.
2. Log in to the Vertica administrator account (dbadmin in this example).
3. Make your home directory (~) writable only by yourself. Choose one of:
$ chmod 700 ~
or
$ chmod 755 ~
where:
700 includes:
400 read by owner
200 write by owner
100 execute by owner
755 includes:
400 read by owner
200 write by owner
100 execute by owner
040 read by group
010 execute by group
004 read by anybody (other)
001 execute by anybody
4. Change to your home directory:
$ cd ~
5. Generate a private key/ public key pair:
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/dbadmin/.ssh/id_rsa):
Created directory '/home/dbadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/dbadmin/.ssh/id_rsa.
Your public key has been saved in /home/dbadmin/.ssh/id_rsa.pub.
6. Make your .ssh directory readable and writable only by yourself:
$ chmod 700 ~/.ssh
7. Change to the .ssh directory:
$ cd ~/.ssh
8. Copy the file id_rsa.pub to the file authorized_keys2.
$ cp id_rsa.pub authorized_keys2
9. Make the files in your .ssh directory readable and writable only by yourself:
$ chmod 600 ~/.ssh/*
10. For each cluster host:
$ scp -r ~/.ssh <host>:.
11. Connect to each cluster host. The first time you ssh to a new remote machine, you
could get a message similar to the following:
$ ssh dev0
Warning: Permanently added 'dev0,192.168.1.92' (RSA) to the list of known hosts.
This message appears only the first time you ssh to a particular remote host.
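To confirm that passwordless login now works everywhere, you can loop over the cluster hosts (host names are placeholders); BatchMode makes ssh fail instead of prompting if key-based login is not working:
$ for h in host01 host02 host03; do ssh -o BatchMode=yes $h hostname; done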
See Also
l OpenSSH
Migration Guide for Red Hat 7/CentOS 7
If you installed Vertica 7.0 on a Red Hat 6/CentOS 6 system, you might want to
transfer your existing data to a Red Hat 7/CentOS 7 environment. Be aware that a direct
upgrade of the current operating system with Vertica 7.0 installed is not supported.
Transferring your existing data to a Red Hat 7/CentOS 7 environment requires
migration. The only supported method of migration is from a Red Hat 6/CentOS 6
system running Vertica 7.0 to a Red Hat 7/CentOS 7 system running Vertica 7.0.
To perform this migration, you must complete two steps:
1. Create a new, empty database on a cluster of hosts or a single host running Red
Hat 7/CentOS 7. Vertica 7.0 must be installed on any new host.
2. Transfer your data from the current cluster, running Red Hat 6/CentOS 6, to the new
single host or cluster of hosts, running Red Hat 7/CentOS 7.
The following diagram provides a high-level overview of the entire migration process:
When you install Vertica 7.0 on a Red Hat 7/CentOS 7 target host, follow the
instructions in Installing Vertica, which provides all necessary configuration details
for installing Vertica on Red Hat 7/CentOS 7.
You can use any of three methods to transfer your current database data to a Red Hat
7/CentOS 7 environment. Before you begin the migration process, review each of the
three available options. Each method has its own benefits and constraints. Choose
the one that best suits your needs.
The three methods are:
l Copying the Database to Another Cluster (also known as Copycluster)
l Copying and Exporting Data (also known as Import/Export)
l Backing Up and Restoring the Database
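As one hedged illustration, the Copycluster method is driven by the vbr utility with a configuration file that maps source nodes to target nodes (the file name and path below are placeholders):
$ /opt/vertica/bin/vbr.py --task copycluster --config-file /home/dbadmin/copycluster.ini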
For a comprehensive guide to major changes and migration considerations in Red Hat
7/CentOS 7, see the Red Hat 7 Migration Planning Guide on the Red Hat website.
Appendix: Time Zones
Using Time Zones With Vertica
Vertica uses the TZ environment variable on each node, if it has been set, for the default
current time zone. Otherwise, Vertica uses the operating system time zone.
The TZ variable can be set by the operating system during login (see /etc/profile,
/etc/profile.d, or /etc/bashrc) or by the user in .profile, .bashrc, or .bash_profile.
TZ must be set to the same value on each node when you start Vertica.
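A quick way to compare the value across nodes (host names are placeholders; note that non-interactive ssh may not source login profiles, so also verify in an interactive session):
$ for h in host01 host02 host03; do ssh $h 'echo "$(hostname): $TZ"'; done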
The following command returns the current time zone for your database:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
You can also use the SET TIMEZONE TO { value | 'value' } command to set the time
zone for a single session.
There is no database default time zone; instead, TIMESTAMP WITH TIMEZONE
(TIMESTAMPTZ) data is stored in GMT (UTC) by converting data from the current local
time zone to GMT.
When TIMESTAMPTZ data is used, data is converted back to use the current local time
zone, which might be different from the local time zone where the data was stored. This
conversion takes Daylight Saving Time (Summer Time) into account, if applicable,
depending on the year and date of the change.
TIMESTAMP WITHOUT TIMEZONE data stores the timestamp, as given, and retrieves
it exactly as given. The current time zone is ignored. The same is true for TIME
WITHOUT TIMEZONE. For TIME WITH TIMEZONE (TIMETZ), however, the current
time zone setting is stored along with the given time, and that time zone is used on
retrieval.
Note: HPE recommends that you use TIMESTAMPTZ, not TIMETZ.
TIMESTAMPTZ uses the current time zone on both input and output, such as in the
following example:
=> CREATE TEMP TABLE s (tstz TIMESTAMPTZ);
=> SET TIMEZONE TO 'America/New_York';
=> INSERT INTO s VALUES ('2009-02-01 00:00:00');
=> INSERT INTO s VALUES ('2009-05-12 12:00:00');
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS 'America/New_York',
tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
Local timezone | America/New_York | GMT
------------------------+---------------------+---------------------
2009-02-01 00:00:00-05 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
2009-05-12 12:00:00-04 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)
The -05 in the Local time zone column above shows that the data is displayed in EST,
while -04 indicates EDT. The other two columns show the TIMESTAMP WITHOUT
TIMEZONE at the specified time zone.
The next example illustrates what occurs if the current time zone is changed to, for
example, Greenwich Mean Time:
=> SET TIMEZONE TO 'GMT';
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS
'America/New_York', tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
Local timezone | America/New_York | GMT
------------------------+---------------------+---------------------
2009-02-01 05:00:00+00 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
2009-05-12 16:00:00+00 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)
The +00 in the Local time zone column above indicates that TIMESTAMPTZ is
displayed in 'GMT'.
The approach of using TIMESTAMPTZ fields to record events captures the GMT of the
event, as expressed in terms of the local time zone. Later, it allows for easy conversion
to any other time zone, either by setting the local time zone or by specifying an explicit
AT TIMEZONE clause.
The following example shows how TIMESTAMP WITHOUT TIMEZONE fields work in
Vertica.
=> CREATE TEMP TABLE tnoz (ts TIMESTAMP);
=> INSERT INTO tnoz VALUES('2009-02-01 00:00:00');
=> INSERT INTO tnoz VALUES('2009-05-12 12:00:00');
=> SET TIMEZONE TO 'GMT';
=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS
'America/New_York', ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
No timezone | America/New_York | GMT
---------------------+------------------------+------------------------
2009-02-01 00:00:00 | 2009-02-01 05:00:00+00 | 2009-02-01 00:00:00+00
2009-05-12 12:00:00 | 2009-05-12 16:00:00+00 | 2009-05-12 12:00:00+00
(2 rows)
The +00 at the end of a timestamp indicates that the setting is TIMESTAMP WITH
TIMEZONE in GMT (the current time zone). The 'America/New_York' column shows
what the 'GMT' setting was when you recorded the time, assuming you read a normal
clock in the time zone 'America/New_York'. What this shows is that if it is midnight in the
'America/New_York' time zone, then it is 5 am GMT.
Note: 00:00:00 Sunday February 1, 2009 in America/New_York converts to
05:00:00 Sunday February 1, 2009 in GMT.
The 'GMT' column displays the GMT time, assuming the input data was captured in
GMT.
If you don't set the time zone to GMT, and you use another time zone, for example
'America/New_York', then the results display in 'America/New_York' with a -05 and -04,
showing the difference between that time zone and GMT.
=> SET TIMEZONE TO 'America/New_York';
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS
'America/New_York', ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
No timezone | America/New_York | GMT
---------------------+------------------------+------------------------
2009-02-01 00:00:00 | 2009-02-01 00:00:00-05 | 2009-01-31 19:00:00-05
2009-05-12 12:00:00 | 2009-05-12 12:00:00-04 | 2009-05-12 08:00:00-04
(2 rows)
In this case, the last column is interesting in that it returns the time in New York, given
that the data was captured in 'GMT'.
See Also
l TZ Environment Variable
l SET TIME ZONE
l Date/Time Data Types
Africa
Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmera
Africa/Bamako
Africa/Bangui
Africa/Banjul
Africa/Bissau
Africa/Blantyre
Africa/Brazzaville
Africa/Bujumbura
Africa/Cairo Egypt
Africa/Casablanca
Africa/Ceuta
Africa/Conakry
Africa/Dakar
Africa/Dar_es_Salaam
Africa/Djibouti
Africa/Douala
Africa/El_Aaiun
Africa/Freetown
Africa/Gaborone
Africa/Harare
Africa/Johannesburg
Africa/Kampala
Africa/Khartoum
Africa/Kigali
Africa/Kinshasa
Africa/Lagos
Africa/Libreville
Africa/Lome
Africa/Luanda
Africa/Lubumbashi
Africa/Lusaka
Africa/Malabo
Africa/Maputo
Africa/Maseru
Africa/Mbabane
Africa/Mogadishu
Africa/Monrovia
Africa/Nairobi
Africa/Ndjamena
Africa/Niamey
Africa/Nouakchott
Africa/Ouagadougou
Africa/Porto-Novo
Africa/Sao_Tome
Africa/Timbuktu
Africa/Tripoli Libya
Africa/Tunis
Africa/Windhoek
America
America/Adak America/Atka US/Aleutian
America/Anchorage SystemV/YST9YDT US/Alaska
America/Anguilla
America/Antigua
America/Araguaina
America/Aruba
America/Asuncion
America/Bahia
America/Barbados
America/Belem
America/Belize
America/Boa_Vista
America/Bogota
America/Boise
America/Buenos_Aires
America/Cambridge_Bay
America/Campo_Grande
America/Cancun
America/Caracas
America/Catamarca
America/Cayenne
America/Cayman
America/Chicago CST6CDT SystemV/CST6CDT US/Central
America/Chihuahua
America/Cordoba America/Rosario
America/Costa_Rica
America/Cuiaba
America/Curacao
America/Danmarkshavn
America/Dawson
America/Dawson_Creek
America/Denver MST7MDT SystemV/MST7MDT US/Mountain America/Shiprock
Navajo
America/Detroit US/Michigan
America/Dominica
America/Edmonton Canada/Mountain
America/Eirunepe
America/El_Salvador
America/Ensenada America/Tijuana Mexico/BajaNorte
America/Fortaleza
America/Glace_Bay
America/Godthab
America/Goose_Bay
America/Grand_Turk
America/Grenada
America/Guadeloupe
America/Guatemala
America/Guayaquil
America/Guyana
America/Halifax Canada/Atlantic SystemV/AST4ADT
America/Havana Cuba
America/Hermosillo
America/Indiana/Indianapolis
America/Indianapolis
America/Fort_Wayne EST SystemV/EST5 US/East-Indiana
America/Indiana/Knox America/Knox_IN US/Indiana-Starke
America/Indiana/Marengo
America/Indiana/Vevay
America/Inuvik
America/Iqaluit
America/Jamaica Jamaica
America/Jujuy
America/Juneau
America/Kentucky/Louisville America/Louisville
America/Kentucky/Monticello
America/La_Paz
America/Lima
America/Los_Angeles PST8PDT SystemV/PST8PDT US/Pacific US/Pacific-New
America/Maceio
America/Managua
America/Manaus Brazil/West
America/Martinique
America/Mazatlan Mexico/BajaSur
America/Mendoza
America/Menominee
America/Merida
America/Mexico_City Mexico/General
America/Miquelon
America/Monterrey
America/Montevideo
America/Montreal
America/Montserrat
America/Nassau
America/New_York EST5EDT SystemV/EST5EDT US/Eastern
America/Nipigon
America/Nome
America/Noronha Brazil/DeNoronha
America/North_Dakota/Center
America/Panama
America/Pangnirtung
America/Paramaribo
America/Phoenix MST SystemV/MST7 US/Arizona
America/Port-au-Prince
America/Port_of_Spain
America/Porto_Acre America/Rio_Branco Brazil/Acre
America/Porto_Velho
America/Puerto_Rico SystemV/AST4
America/Rainy_River
America/Rankin_Inlet
America/Recife
America/Regina Canada/East-Saskatchewan Canada/Saskatchewan SystemV/CST6
America/Santiago Chile/Continental
America/Santo_Domingo
America/Sao_Paulo Brazil/East
America/Scoresbysund
America/St_Johns Canada/Newfoundland
America/St_Kitts
America/St_Lucia
America/St_Thomas America/Virgin
America/St_Vincent
America/Swift_Current
America/Tegucigalpa
America/Thule
America/Thunder_Bay
America/Toronto Canada/Eastern
America/Tortola
America/Vancouver Canada/Pacific
America/Whitehorse Canada/Yukon
America/Winnipeg Canada/Central
America/Yakutat
America/Yellowknife
Antarctica
Antarctica/Casey
Antarctica/Davis
Antarctica/DumontDUrville
Antarctica/Mawson
Antarctica/McMurdo
Antarctica/South_Pole
Antarctica/Palmer
Antarctica/Rothera
Antarctica/Syowa
Antarctica/Vostok
Asia
Asia/Aden
Asia/Almaty
Asia/Amman
Asia/Anadyr
Asia/Aqtau
Asia/Aqtobe
Asia/Ashgabat Asia/Ashkhabad
Asia/Baghdad
Asia/Bahrain
Asia/Baku
Asia/Bangkok
Asia/Beirut
Asia/Bishkek
Asia/Brunei
Asia/Calcutta
Asia/Choibalsan
Asia/Chongqing Asia/Chungking
Asia/Colombo
Asia/Dacca Asia/Dhaka
Asia/Damascus
Asia/Dili
Asia/Dubai
Asia/Dushanbe
Asia/Gaza
Asia/Harbin
Asia/Hong_Kong Hongkong
Asia/Hovd
Asia/Irkutsk
Asia/Jakarta
Asia/Jayapura
Asia/Jerusalem Asia/Tel_Aviv Israel
Asia/Kabul
Asia/Kamchatka
Asia/Karachi
Asia/Kashgar
Asia/Katmandu
Asia/Krasnoyarsk
Asia/Kuala_Lumpur
Asia/Kuching
Asia/Kuwait
Asia/Macao Asia/Macau
Asia/Magadan
Asia/Makassar Asia/Ujung_Pandang
Asia/Manila
Asia/Muscat
Asia/Nicosia Europe/Nicosia
Asia/Novosibirsk
Asia/Omsk
Asia/Oral
Asia/Phnom_Penh
Asia/Pontianak
Asia/Pyongyang
Asia/Qatar
Asia/Qyzylorda
Asia/Rangoon
Asia/Riyadh
Asia/Riyadh87 Mideast/Riyadh87
Asia/Riyadh88 Mideast/Riyadh88
Asia/Riyadh89 Mideast/Riyadh89
Asia/Saigon
Asia/Sakhalin
Asia/Samarkand
Asia/Seoul ROK
Asia/Shanghai PRC
Asia/Singapore Singapore
Asia/Taipei ROC
Asia/Tashkent
Asia/Tbilisi
Asia/Tehran Iran
Asia/Thimbu Asia/Thimphu
Asia/Tokyo Japan
Asia/Ulaanbaatar Asia/Ulan_Bator
Asia/Urumqi
Asia/Vientiane
Asia/Vladivostok
Asia/Yakutsk
Asia/Yekaterinburg
Asia/Yerevan
Atlantic
Atlantic/Azores
Atlantic/Bermuda
Atlantic/Canary
Atlantic/Cape_Verde
Atlantic/Faeroe
Atlantic/Madeira
Atlantic/Reykjavik Iceland
Atlantic/South_Georgia
Atlantic/St_Helena
Atlantic/Stanley
Australia
Australia/ACT
Australia/Canberra
Australia/NSW
Australia/Sydney
Australia/Adelaide
Australia/South
Australia/Brisbane
Australia/Queensland
Australia/Broken_Hill
Australia/Yancowinna
Australia/Darwin
Australia/North
Australia/Hobart
Australia/Tasmania
Australia/LHI
Australia/Lord_Howe
Australia/Lindeman
Australia/Melbourne
Australia/Victoria
Australia/Perth Australia/West
Etc
Etc/GMT
Etc/GMT+1
Etc/GMT+2
Etc/GMT+3
Etc/GMT+4
Etc/GMT+5
Etc/GMT+6
Etc/GMT+7
Etc/GMT+8
Etc/GMT+9
Etc/GMT+10
Etc/GMT+11
Etc/GMT+12
Etc/GMT-1
Etc/GMT-2
Etc/GMT-3
Etc/GMT-4
Etc/GMT-5
Etc/GMT-6
Etc/GMT-7
Etc/GMT-8
Etc/GMT-9
Etc/GMT-10
Etc/GMT-11
Etc/GMT-12
Etc/GMT-13
Etc/GMT-14
Europe
Europe/Amsterdam
Europe/Andorra
Europe/Athens
Europe/Belfast
Europe/Belgrade
Europe/Ljubljana
Europe/Sarajevo
Europe/Skopje
Europe/Zagreb
Europe/Berlin
Europe/Brussels
Europe/Bucharest
Europe/Budapest
Europe/Chisinau Europe/Tiraspol
Europe/Copenhagen
Europe/Dublin Eire
Europe/Gibraltar
Europe/Helsinki
Europe/Istanbul Asia/Istanbul Turkey
Europe/Kaliningrad
Europe/Kiev
Europe/Lisbon Portugal
Europe/London GB GB-Eire
Europe/Luxembourg
Europe/Madrid
Europe/Malta
Europe/Minsk
Europe/Monaco
Europe/Moscow W-SU
Europe/Oslo Arctic/Longyearbyen Atlantic/Jan_Mayen
Europe/Paris
Europe/Prague Europe/Bratislava
Europe/Riga
Europe/Rome Europe/San_Marino Europe/Vatican
Europe/Samara
Europe/Simferopol
Europe/Sofia
Europe/Stockholm
Europe/Tallinn
Europe/Tirane
Europe/Uzhgorod
Europe/Vaduz
Europe/Vienna
Europe/Vilnius
Europe/Warsaw Poland
Europe/Zaporozhye
Europe/Zurich
Indian
Indian/Antananarivo
Indian/Chagos
Indian/Christmas
Indian/Cocos
Indian/Comoro
Indian/Kerguelen
Indian/Mahe
Indian/Maldives
Indian/Mauritius
Indian/Mayotte
Indian/Reunion
Pacific
Pacific/Apia
Pacific/Auckland NZ
Pacific/Chatham NZ-CHAT
Pacific/Easter Chile/EasterIsland
Pacific/Efate
Pacific/Enderbury
Pacific/Fakaofo
Pacific/Fiji
Pacific/Funafuti
Pacific/Galapagos
Pacific/Gambier SystemV/YST9
Pacific/Guadalcanal
Pacific/Guam
Pacific/Honolulu HST SystemV/HST10 US/Hawaii
Pacific/Johnston
Pacific/Kiritimati
Pacific/Kosrae
Pacific/Kwajalein Kwajalein
Pacific/Majuro
Pacific/Marquesas
Pacific/Midway
Pacific/Nauru
Pacific/Niue
Pacific/Norfolk
Pacific/Noumea
Pacific/Pago_Pago Pacific/Samoa US/Samoa
Pacific/Palau
Pacific/Pitcairn SystemV/PST8
Pacific/Ponape
Pacific/Port_Moresby
Pacific/Rarotonga
Pacific/Saipan
Pacific/Tahiti
Pacific/Tarawa
Pacific/Tongatapu
Pacific/Truk
Pacific/Wake
Pacific/Wallis
Pacific/Yap
Getting Started
Using This Guide
Getting Started shows how to set up a Vertica database and run simple queries that
perform common database tasks.
Who Should Use This Guide?
Getting Started targets anyone who wants to learn how to create and run a Vertica
database. No special knowledge is required, although a rudimentary knowledge of basic
SQL commands is useful when you begin to run queries.
What You Need
The examples in this guide require one of the following:
l Vertica installed on one host or a cluster of hosts. Vertica recommends a minimum of
three hosts in the cluster.
l Vertica installed on a virtual machine (VM).
For further instructions about installation, see Installing Vertica.
Accessing Your Database
You access your database through an SSH client or a terminal utility in your Linux
console, for example the vsql client. Throughout this guide, you use the following user
interfaces:
l Linux command line (shell) interface
l Vertica Administration Tools
l vsql client interface
l Vertica Management Console
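For example, to open a vsql session from the Linux command line, you might run the
following. This is a minimal sketch that assumes the default port 5433 and the dbadmin
user; adjust the host, port, and user for your environment.
$ /opt/vertica/bin/vsql -h localhost -p 5433 -U dbadmin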
Downloading and Starting the Virtual Machine
Vertica is available as a Virtual Machine (VM) that is pre-installed on a 64-bit Red Hat
Enterprise Linux image and comes with a license for 500 GB of data storage.
The VM image is preconfigured with the following hardware settings:
l 1 CPU
l 1024 MB RAM
l 50 GB Hard Disk (SCSI, not preallocated, single file storage)
l Bridged Networking
Downloading a VM
The Vertica VM is available both as an OVF template (for VMWare vSphere 4.0) and as
a VMDK file (for VMWare Server 2.0 and VMWare Workstation 7.0). Download and
install the appropriate file for your VMWare deployment from the myVertica portal at
https://my.vertica.com/downloads (registration required).
Starting the VM
1. Open the appropriate Vertica VM image file in VMWare. For example, open the
VMX file if you are using VMWare Workstation, or the OVF template if you are using
VMWare vSphere.
2. Navigate to the settings for the VM image and adjust the network settings so that
they are compatible with your VM.
3. Start the VM. For example, in VMWare Workstation, select VM > Power > Power
On.
Checking for Vertica Updates
The VM image might not include the latest available Vertica release. After you install
and start your VM, verify the version of Vertica with the following command.
$ rpm -qa | grep vertica
The RPM package name that the command returns contains the version and build
numbers. If there is a later version of Vertica, download it from the myVertica portal at
https://my.vertica.com/downloads (registration required). Upgrade instructions are
provided in Installing Vertica.
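If the database is already running, you can also confirm the server version from within
vsql; a minimal sketch using the built-in VERSION function (your output string will
differ):
$ vsql -U dbadmin -c "SELECT VERSION();"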
Types of Database Users
Every Vertica database has one or more users. When users connect to a database, they
must log on with valid credentials (username and password) that a database
administrator defines.
Database users own the objects they create in a database, such as tables, procedures,
and storage locations. By default, all users have the right to create temporary tables in a
database.
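For example, any user can create a temporary table without additional grants; a minimal
sketch (the table name session_scratch is hypothetical):
=> CREATE TEMPORARY TABLE session_scratch (id INTEGER, note VARCHAR(64))
   ON COMMIT PRESERVE ROWS;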
In a Vertica database, there are three types of users:
l Database administrator (dbadmin)
l Object owner
l Everyone else (PUBLIC)
dbadmin User
When you create a new database, a single database administrator account, dbadmin, is
automatically created along with a PUBLIC role. The database administrator bypasses
all permission checks, including all GRANT/REVOKE authorizations, and has the
authority to perform all database operations. The same is true of any user granted the
PSEUDOSUPERUSER role.
Note: Although the dbadmin user has the same name as the Linux database
administrator account, do not confuse the concept of a database administrator with a
Linux superuser (root) privilege; they are not the same. A database administrator
cannot have Linux superuser privileges.
Object Owner
An object owner is the user who creates a particular database object; the owner can
perform any operation on that object. By default, only an owner or a database
administrator can act on a database object. In order to allow other users to use an
object, the owner or database administrator must grant privileges to those users using
one of the GRANT statements. Object owners are PUBLIC users for objects that other
users own.
PUBLIC User
All users who are neither database administrators nor object owners are PUBLIC users.
Newly created users do not have access to schema PUBLIC by default. Make sure to
GRANT USAGE ON SCHEMA PUBLIC to all users you create, as shown below.
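For example, a minimal sketch of creating a user and granting the usage privilege (the
user name jsmith and its password are hypothetical):
=> CREATE USER jsmith IDENTIFIED BY 'engage';
=> GRANT USAGE ON SCHEMA PUBLIC TO jsmith;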
Logging in as dbadmin
The first time you boot the VM, a login screen appears. Select the Vertica DBA user, and
then enter the default password. Vertica DBA is the full name of the dbadmin user. After
you log in, a web page displays further instructions.
The default username and password are as follows:
l Username: dbadmin
l Password: password
l Root Password: password
Important: The dbadmin user has sudo privileges. Be sure to change the dbadmin
and root passwords with the Linux passwd command.
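For example, from a shell on the VM while logged in as dbadmin (the sudo call is needed
to change the root password):
$ passwd
$ sudo passwd root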
Using the Vertica Interfaces
Vertica provides a set of tools that allows you to perform administrative tasks quickly
and easily. The administration tasks in Vertica can be done using the Management
Console (MC) or the Administration Tools. The MC provides a unified view of your
Vertica cluster through a browser connection, while the Administration Tools are
implemented using Dialog, a graphical user interface that works in terminal
(character-cell) windows.
Using Management Console
The Management Console provides some, but not all, of the functionality that the
Administration Tools provides. In addition, the MC provides extended functionality not
available in the Administration Tools, such as a graphical view of your Vertica database
and detailed monitoring charts and graphs.
Most of the information you need to use MC is available on the MC interface, as seen in
the following two screenshots. For installation instructions, see Installing and
Configuring Management Console in the Installation Guide. For an introduction to MC
functionality, architecture, and security, see Management Console in Vertica Concepts.
Running the Administration Tools
A man page is available for convenient access to Administration Tools details. If you are
running as the dbadmin user, type man admintools. If you are running as a different
user, type man -M /opt/vertica/man admintools. If possible, always run the
Administration Tools using the database administrator account (dbadmin) on the
administration host.
The Administration Tools interface responds to mouse clicks in some terminal windows,
particularly local Linux windows, but you might find that it responds only to keystrokes.
For a quick reference to keystrokes, see Using Keystrokes in the Administration Tools
Interface in this guide.
When you run Administration Tools, the Main Menu dialog box appears with a dark blue
background and a title on top. The screen captures used in this documentation set are
cropped down to the dialog box itself, as shown in the following screenshot.
First Time Only
The first time you log in as the database administrator and run the Administration Tools,
complete the following steps.
1. In the EULA (end-user license agreement) window, type accept to proceed. A
window displays, requesting the location of the license key file you downloaded
from the HPE Web site. The default path is /tmp/vlicense.dat.
2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and
click OK.
3. To return to the command line, select Exit and click OK.
Using Keystrokes in the Administration Tools Interface
The following table is a quick reference to keystroke usage in the Administration Tools
interface. See Using the Administration Tools in the Administrator’s Guide for full
details.
Return Run selected command.
Tab Cycle between OK, Cancel, Help, and menu.
Up/Down Arrow Move cursor up and down in menu, window, or help file.
Space Select item in list.
Character Select corresponding command from menu.
Introducing the VMart Example Database
Vertica ships with a sample multi-schema database called the VMart Example
Database, which represents a database that might be used by a large supermarket
(VMart) to access information about its products, customers, employees, and online and
physical stores. Using this example, you can create, run, optimize, and test a multi-
schema database.
The VMart database contains the following schema:
l public (automatically created in any newly created Vertica database)
l store
l online_sales
VMart Database Location and Scripts
If you installed Vertica from the RPM package, the VMart schema is installed in the
/opt/vertica/examples/VMart_Schema directory. This folder contains the following
script files that you can use to get started quickly. Use the scripts as templates for your
own applications.
Script/file name         Description
vmart_count_data.sql     SQL script that counts the rows of all example database
                         tables, which you can use to verify the load.
vmart_define_schema.sql  SQL script that defines the logical schema for each table
                         and the referential integrity constraints.
vmart_gen.cpp            Data generator source code (C++).
vmart_gen                Data generator executable file.
vmart_load_data.sql      SQL script that loads the generated sample data into the
                         corresponding tables using COPY DIRECT.
vmart_queries.sql        SQL script that contains concatenated sample queries for
                         use as a training set for the Database Designer.
vmart_query_##.sql       SQL scripts that contain individual queries, for example
                         vmart_query_01.sql through vmart_query_09.sql.
vmart_schema_drop.sql    SQL script that drops all example database tables.
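For example, after you load data (see Step 5: Loading Data), you can verify the row
counts from within vsql by running the count script from the VMart_Schema directory:
VMart=> \i vmart_count_data.sql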
For more information about the schema, tables, and queries included with the VMart
example database, see the Appendix.
Installing and Connecting to the VMart Example
Database
Follow the steps in this section to create the fully functioning, multi-schema VMart
example database that you’ll use to run sample queries. The number of example
databases you create within a single Vertica installation is limited only by the disk space
available on your system; however, Hewlett Packard Enterprise strongly recommends
that you start only one example database at a time to avoid unpredictable results.
Vertica provides two options to install the example database:
l A quick installation that lets you create the example database and start using it
immediately. See Quick Installation Using a Script in this guide for details. Use this
method to bypass the schema and table creation processes and start querying
immediately.
l An advanced-but-simple example database installation using the Administration
Tools interface. See Advanced Installation in this guide for details. Use this method
to better understand the database creation process and practice creating schema and
tables, and loading data.
Note: Both installation methods create a database named VMart. If you try both
installation methods, you will either need to drop the VMart database you created
(see Restoring the Status of Your Host in this guide) or create the subsequent
database with a new name. However, Hewlett Packard Enterprise strongly
recommends that you start only one example database at a time to avoid
unpredictable results.
This tutorial uses Vertica-provided queries, but you can follow the same set of
procedures later, when you create your own design and use your own queries file.
After you install the VMart database, the database is already running. Connect to it using the
steps in Step 3: Connecting to the Database.
Quick Installation Using a Script
The script you need to perform a quick installation is located in /opt/vertica/sbin
and is called install_example. This script creates a database on the default port
(5433), generates data, creates the schema and a default superprojection, and loads the
data. The folder also contains a delete_example script, which stops and drops the
database.
1. In a terminal window, log in as the database administrator.
$ su dbadmin
Password: (your password)
2. Change to the /examples directory.
$ cd /opt/vertica/examples
3. Run the install script:
$ /opt/vertica/sbin/install_example VMart
After installation, you should see the following:
[dbadmin@localhost examples]$ /opt/vertica/sbin/install_example VMart
Installing VMart example example database
Mon Jul 22 06:57:40 PDT 2013
Creating Database
Completed
Generating Data. This may take a few minutes.
Completed
Creating schema
Completed
Loading 5 million rows of data. Please stand by.
Completed
Removing generated data files
Example data
The example database log files, ExampleInstall.txt and ExampleDelete.txt, are
written to /opt/vertica/examples/log.
To start using your database, continue to Connecting to the Database in this guide. To
drop the example database, see Restoring the Status of Your Host in this guide.
Advanced Installation
To perform an advanced-but-simple installation, set up the VMart example database
environment and then create the database using the Administration Tools or
Management Console.
Note: If you installed the VMart database using the quick installation method, you
cannot complete the following steps because the database has already been
created.
To try the advanced installation, drop the example database (see Restoring the
Status of Your Host on this guide) and perform the advanced Installation, or create a
new example database with a different name. However, Hewlett Packard Enterprise
strongly recommends that you install only one example database at a time to avoid
unpredictable results.
The advanced installation requires the following steps:
l Step 1: Setting Up the Example Environment
l Creating the Example Database Using the Administration Tools
l Step 3: Connecting to the Database
l Step 4: Defining the Database Schema
l Step 5: Loading Data
Step 1: Setting Up the Example Environment
1. Stop all databases running on the same host on which you plan to install your
example database.
If you are unsure if other databases are running, run the Administration Tools and
select View Cluster State. The State column should show DOWN values on
pre-existing databases.
If databases are running, click Stop Database in the Main Menu of the
Administration Tools interface and click OK.
2. In a terminal window, log in as the database administrator:
$ su dbadmin
Password:
3. Change to the /VMart_Schema directory.
$ cd /opt/vertica/examples/VMart_Schema
Do not change directories while following this tutorial. Some steps depend on being
in a specific directory.
4. Run the sample data generator.
$ ./vmart_gen
5. Let the program run with the default parameters, which you can review in the
README file.
Using default parameters
datadirectory = ./
numfiles = 1
seed = 2
null = ' '
timefile = Time.txt
numfactsalesrows = 5000000
numfactorderrows = 300000
numprodkeys = 60000
numstorekeys = 250
numpromokeys = 1000
numvendkeys = 50
numcustkeys = 50000
numempkeys = 10000
numwarehousekeys = 100
numshippingkeys = 100
numonlinepagekeys = 1000
numcallcenterkeys = 200
numfactonlinesalesrows = 5000000
numinventoryfactrows = 300000
gen_load_script = false
Data Generated successfully !
6. If the vmart_gen executable does not work correctly, recompile it as follows, and
run the sample data generator script again.
$ g++ vmart_gen.cpp -o vmart_gen
$ chmod +x vmart_gen
$ ./vmart_gen
Step 2: Creating the Example Database
To create the example database, use the Administration Tools or Management Console,
as described in this section.
Creating the Example Database Using the Administration Tools
In this procedure, you create the example database using the Administration Tools. To
use the Management Console, go to the next section.
Note: If you have not used Administration Tools before, see Running the
Administration Tools in this guide.
1. Run the Administration Tools.
$ /opt/vertica/bin/admintools
or simply type admintools
2. From the Administration Tools Main Menu, click Configuration Menu and click OK.
3. Click Create Database and click OK.
4. Name the database VMart and click OK.
5. Click OK to bypass the password and click Yes to confirm.
There is no need for a database administrator password in this tutorial. When you
create a production database, however, always specify an administrator password.
Otherwise, the database is permanently set to trust authentication (no passwords).
6. Select the hosts you want to include from your Vertica cluster and click OK.
This example creates the database on a one-host cluster. Hewlett Packard
Enterprise recommends a minimum of three hosts in the cluster. If you are using the
Vertica Community Edition, you are limited to three nodes.
7. Click OK to select the default paths for the data and catalog directories.
n Catalog and data paths must contain only alphanumeric characters and cannot
have leading space characters. Failure to comply with these restrictions could
result in database creation failure.
n When you create a production database, you’ll likely specify other locations than
the default. See Prepare Disk Storage Locations in the Administrator’s Guide for
more information.
8. Since this tutorial uses a one-host cluster, a K-safety warning appears. Click OK.
9. Click Yes to create the database.
During database creation, Vertica automatically creates a set of node definitions
based on the database name and the names of the hosts you selected and returns a
success message.
10. Click OK to close the Database VMart created successfully message.
Creating the Example Database Using Management Console
In this procedure, you create the example database using Management Console. To
use the Administration Tools, follow the steps in the preceding section.
Note: To use Management Console, the console should already be installed and
you should be familiar with its concepts and layout. See Using Management
Console in this guide for a brief overview, or for detailed information, see
Management Console in Vertica Concepts and Installing and Configuring
Management Console in Installing Vertica.
1. Connect to Management Console and log in.
2. On the Home page, under Manage Information, click Existing Infrastructure to go
to the Databases and Clusters page.
3. Click to select the appropriate existing cluster and click Create Database.
4. Follow the on-screen wizard, which prompts you to provide the following
information:
n Database name, which must be between 3 and 25 characters, starting with a letter,
and followed by any combination of letters, numbers, or underscores.
n (Optional) database administrator password for the database you want to create
and connect to.
n IP address of a node in your database cluster, typically the IP address of the
administration host.
5. Click Next.
Step 3: Connecting to the Database
Regardless of the installation method you used, follow these steps to connect to the
database.
1. As dbadmin, run the Administration Tools.
$ /opt/vertica/bin/admintools
or simply type admintools.
2. If you are already in the Administration Tools, navigate to the Main Menu page.
3. Select Connect to Database, click OK.
To configure and load data into the VMart database, complete the following steps:
n Step 4: Defining the Database Schema
n Step 5: Loading Data
If you installed the VMart database using the Quick Installation method, the schema,
tables, and data are already defined. You can choose to drop the example database
(see Restoring the Status of Your Host in this guide) and perform the Advanced
Installation, or continue straight to Querying Your Data in this guide.
Step 4: Defining the Database Schema
The VMart database installs with sample scripts containing SQL commands that
represent queries that might be used in a real business. The vmart_define_schema.sql
script defines the VMart schema and creates the tables. You must run this script before
you load data into the VMart database.
This script performs the following tasks:
l Defines two schemas in the VMart database: online_sales and store.
l Defines tables in both schemas.
l Defines constraints on those tables.
VMart=> \i vmart_define_schema.sql
CREATE SCHEMA
CREATE SCHEMA
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
CREATE TABLE
CREATE TABLE
ALTER TABLE
Step 5: Loading Data
Now that you have created the schemas and tables, you can load data into a table by
running the vmart_load_data.sql script. This script loads data from the 15 .tbl text
files in /opt/vertica/examples/VMart_Schema into the tables that
vmart_define_schema.sql created.
It might take several minutes to load the data on a typical hardware cluster. Check the
load status by monitoring the vertica.log file, as described in Monitoring Log Files in
the Administrator’s Guide.
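For example, in a second terminal you can follow the log while the load runs; a minimal
sketch that assumes the default catalog location chosen earlier in this tutorial (adjust
the path to your own catalog directory):
$ tail -f ~/VMart/v_vmart_node0001_catalog/vertica.log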
VMart=> \i vmart_load_data.sql
Rows Loaded
-------------
1826
(1 row)
Rows Loaded
-------------
60000
(1 row)
Rows Loaded
-------------
250
(1 row)
Rows Loaded
-------------
1000
(1 row)
Rows Loaded
-------------
50
(1 row)
Rows Loaded
-------------
50000
(1 row)
Rows Loaded
-------------
10000
(1 row)
Rows Loaded
-------------
100
(1 row)
Rows Loaded
-------------
100
(1 row)
Rows Loaded
-------------
1000
(1 row)
Rows Loaded
-------------
200
(1 row)
Rows Loaded
-------------
5000000
(1 row)
Rows Loaded
-------------
300000
(1 row)
VMart=>
Querying Data
The VMart database installs with sample scripts that contain SQL commands that
represent queries that might be used in a real business. Use basic SQL commands to
query the database, or try out the following command. Once you’re comfortable running
the example queries, you might want to write your own.
Note: The data that your queries return might differ from the example output shown
in this guide because the sample data generator is random.
Type the following SQL command to return the values for five products with the lowest
fat content in the Dairy department. The command selects the fat content from Dairy
department products in the product_dimension table in the public schema, orders
them from low to high, and limits the output to the first five (the five lowest fat contents).
VMart => SELECT fat_content
FROM ( SELECT DISTINCT fat_content
FROM product_dimension
WHERE department_description
IN ('Dairy') ) AS food
ORDER BY fat_content
LIMIT 5;
Your results will be similar to the following.
fat_content
-------------
80
81
82
83
84
(5 rows)
The preceding example is from the vmart_query_01.sql file. You can execute more
sample queries using the scripts that installed with the VMart database or write your
own. For a list of the sample queries supplied with Vertica, see the Appendix.
Backing Up and Restoring the Database
Vertica supplies a comprehensive utility, the vbr Python script, which lets you back up
and restore a full database, as well as create backups of specific schemas or tables. The
vbr utility creates backup directories during its initial execution; subsequent runs of the
utility create subdirectories.
The following information is intended to introduce the backup and restore functions. For
more detailed information, see Backing Up and Restoring the Database in the
Administrator’s Guide.
Backing Up the Database
Use vbr to save your data to a variety of locations:
l A local directory on the nodes in a cluster
l One or more hosts outside of the cluster
l A different Vertica cluster (effectively cloning your database)
Note: Creating a database backup on a different cluster does not provide disaster
recovery. The cloned database you create with vbr is entirely separate from the
original, and is not kept synchronized with the database from which it is cloned.
When to Back up the Database
In addition to any guidelines established by your organization, Hewlett Packard
Enterprise recommends that you back up your database:
l Before you upgrade Vertica to another release.
l Before you drop a partition.
l After you load a large volume of data.
l If the epoch in the latest backup is earlier than the current ancient history mark
(AHM).
l Before and after you add, remove, or replace nodes in your database cluster.
l After recovering a cluster from a crash.
Note: When you restore a database backup, you must restore to a cluster that is
identical to the one where you created the backup. For this reason, always create a
new backup after adding, removing, or replacing nodes.
Ideally, create regular backups of your full database. You can run the Vertica vbr utility
from a cron job or other task scheduler.
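For example, a minimal crontab sketch that runs a weekly full-database backup at 2 a.m.
every Sunday (the configuration file name fullbak1.ini matches the example in the next
section; adjust the paths for your environment):
0 2 * * 0 /opt/vertica/bin/vbr --task backup --config-file /home/dbadmin/fullbak1.ini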
Creating the Backup Configuration File
The vbr utility uses a configuration file for the information required to back up and
restore a full- or object-level backup. The configuration file defines where the database
backup is saved, the temporary directories it uses, and which nodes, schemas, and/or
tables in the database are to be backed up. You cannot run vbr without a configuration
file, and no default file exists.
To invoke the script to set up a configuration file, enter this command:
$ vbr --setupconfig
The script prompts you to answer the following questions regarding the configuration
file. Press Enter to accept the default value shown in parentheses. See VBR Configuration
File Reference in the Administrator’s Guide for information about specific questions.
[dbadmin@localhost ~]$ /opt/vertica/bin/vbr --setupconfig
Snapshot name (backup_snapshot): fullbak1
Number of restore points (1): 3
Specify objects (no default):
Object restore mode (coexist, createOrReplace or create)
(createOrReplace):
Vertica user name (dbadmin):
Save password to avoid runtime prompt? (n) [y/n]: y
Password to save in vbr config file (no default):
Node v_vmart_node0001
Backup host name (no default): 194.66.82.11
Backup directory (no default): /home/dbadmin/backups
Config file name (fullbak1.ini):
Password file name (no default value) (no default):pwdfile
Change advanced settings? (n) [y/n]: n
Saved vbr configuration to fullbak1.ini.
Saved vbr database password to pwdfile.ini.
After you answer the required questions, vbr generates a configuration file with the
information you supplied. Use the Config file name you specified when you run the
--task backup or other commands. The vbr utility uses the configuration file contents
for both backup and restore tasks.
Creating Full and Incremental Backups
Before you create a database backup, ensure the following:
l Your database is running.
l All of the backup hosts are up and available.
l The backup location host has sufficient disk space to store the backups.
l The user who starts the utility has write access to the target directories on the host
backup location.
Run the vbr script from a terminal using the database administrator account from an
initiator node in your database cluster. You cannot run the utility as root.
Use the --task backup and --config-file filename directives as shown in this
example.
$ vbr --task backup --config-file fullbak1.ini
Starting backup of database VTDB.
Participating nodes: node01, node02, node03, node04.
Snapshotting database.
Snapshot complete.
Approximate bytes to copy: 2315056043 of 2356089422 total.
[==================================================] 100%
Copying backup metadata.
Finalizing backup.
Backup complete!
By default, there is no screen output other than the progress bar.
If you do not specify a configuration file, or the utility cannot locate the file you specify,
vbr searches for one at this location: /opt/vertica/config/vbr.ini. If no file exists
there, it fails with an error.
The first time you run the vbr utility, it performs a full backup; subsequent runs with the
same configuration file create an incremental backup. When creating incremental
backups, the utility copies new storage containers, which can include data that existed
the last time you backed up the database, along with new and changed data since then.
By default, vbr saves one archive backup, unless you set the restorePointLimit
parameter value in the configuration file to a value greater than 1.
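For illustration, the settings produced by the setup questions above land in the
generated configuration file; a minimal sketch of the relevant fragment (see VBR
Configuration File Reference for the full set of parameters):
[Misc]
snapshotName = fullbak1
restorePointLimit = 3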
Restoring the Database
To restore a full database backup, ensure that:
l The database is down.
l All of the backup hosts are up and available.
l The backup directory exists and contains the backups from which to restore.
To begin a full database backup restore, log in using the database administrator’s
account. You cannot run the utility as root. For detailed instructions on restoring a
database, refer to Recovering the Database.
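For example, with the database down, a full restore reuses the same configuration file
that created the backup; a minimal sketch:
$ vbr --task restore --config-file fullbak1.ini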
Using Database Designer to Create a
Comprehensive Design
The Vertica Database Designer:
l Analyzes your logical schema, sample data, and, optionally, your sample queries.
l Creates a physical schema design (a set of projections) that can be deployed
automatically or manually.
l Can be used by anyone without specialized database knowledge.
l Can be run and rerun any time for additional optimization without stopping the
database.
l Uses strategies to provide optimal query performance and data compression.
Use Database Designer to create a comprehensive design, which allows you to create
new projections for all tables in your database.
You can also use Database Designer to create an incremental design, which creates
projections for all tables referenced in the queries you supply. For more information, see
Incremental Design in the Administrator’s Guide.
You can create a comprehensive design with Database Designer using Management
Console or through Administration Tools. You can also choose to run Database
Designer programmatically (See About Running Database Designer Programmatically).
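For instance, the programmatic interface begins by creating a design context; a minimal
sketch (see the referenced section for the full sequence of functions):
=> SELECT DESIGNER_CREATE_DESIGN('vmart_design');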
This section shows you how to:
l Run Database Designer with Management Console
l Run Database Designer with Administration Tools
Running Database Designer with Management Console
In this tutorial, you'll create a comprehensive design with Database Designer through
the Management Console interface. If, in the future, you have a query that you want to
optimize, you can create an enhanced (incremental) design with additional projections.
You can tune these projections specifically for the query you provide. See
Comprehensive Design in the Administrator's Guide for more information.
Note: To run Database Designer outside Administration Tools, you must be a
dbadmin user. If you are not a dbadmin user, you must have the DBDUSER role
assigned to you and own the tables for which you are designing projections.
You can choose to create the design manually or use the wizard. To create a design
manually, see Creating a Design Manually in the Administrator's Guide.
Set your browser so that it does not cache pages. If a browser caches pages, you might
not see the new design after it is added.
Follow these steps to use the wizard to create the comprehensive design in
Management Console:
1. Log in to Management Console.
2. Verify that your database is up and running.
3. Choose the database for which you want to create the design. You can find the
database under the Recent Databases section or by clicking Existing
Infrastructure to reach the Databases and Clusters page.
The database overview page opens:
4. At the bottom of the screen, click the Design button.
5. In the New Design dialog box, enter the design name.
6. Click Wizard to continue.
7. Create an initial design. For Design Type, select Comprehensive and click Next.
8. In the Optimization Objective window, select Balance Load and Performance to
create a design that is balanced between database size and query performance.
Click Next.
9. Select the schemas. Because the VMart design is a multi-schema database, select
all three schemas (public, store, and online_sales) for your design in the Select
Sample Data window. Click Next.
If you include a schema that contains tables without data, the design could be
suboptimal. You can choose to continue, but Vertica recommends that you deselect
the schemas that contain empty tables before you proceed.
10. Choose the K-safety value for your design. The K-safety value determines the
number of buddy projections you want Database Designer to create.
11. Choose Analyze Correlations Mode. Analyze Correlations Mode determines if
Database Designer analyzes and considers column correlations when creating the
design. For more information, see DESIGNER_SET_ANALYZE_CORRELATIONS_MODE.
Click Next.
12. Submit query files to Database Designer in one of two ways:
a. Supply your own query files by selecting the Browse button.
b. Click Use Query Repository, which submits recently executed queries from the
QUERY_REQUESTS system table.
Click Next.
13. In the Execution Options window, select all the options you want. You can select
all three options or fewer.
The three options are:
n Analyze statistics: Select this option to run statistics automatically after the design
is deployed, to help Database Designer make more optimal decisions for its
proposed design.
n Auto-build: Select this option to run Database Designer as soon as you complete
the wizard. This option only builds the proposed design.
n Auto-deploy: Select this option for auto-build designs that you want to deploy
automatically.
14. Click Submit Design.
The Database Designer page opens:
n If you chose to automatically deploy your design, Database Designer executes in
the background.
n If you did not select the Auto-build or Auto-deploy options, you can click Build
Design or Deploy Design on the Database Designer page.
15. In the My Designs pane, view the status of your design:
n When the deployment completes, the My Designs pane shows Design Deployed.
n The event history window shows the details of the design build and deployment.
To run Database Designer with Administration Tools, see Run Database Designer with
Administration Tools in this guide.
Run Database Designer with Administration Tools
In this procedure, you create a comprehensive design with Database Designer using
the Administration Tools interface. If, in the future, you have a query that you want to
optimize, you can create an enhanced (incremental) design with additional projections.
You can tune these projections specifically for the query you provide. See Incremental
Design in the Administrator’s Guide for more information.
Follow these steps to create the comprehensive design using Database Designer in
Administration Tools:
1. If you are not in Administration Tools, exit the vsql session and access
Administration Tools:
n Type \q to exit vsql.
n Type admintools to access the Administration Tools Main Menu.
2. Start the database for which you want to create a design.
3. From the Main Menu, click Configuration Menu and then click OK.
4. From the Configuration Menu, click Run Database Designer and then click OK.
5. When the Select a database for design dialog box opens, select VMart and then
click OK.
If you are prompted to enter the password for the database, click OK to bypass the
message. Because no password was assigned when you installed the VMart
database, you do not need to enter one now.
6. Click OK to accept the default directory for storing Database Designer output and
log files.
7. In the Database Designer window, enter a name for the design, for example,
vmart_design, and click OK. Design names can contain only alphanumeric
characters or underscores. No other special characters are allowed.
8. Create a complete initial design. In the Design Type window, click
Comprehensive and click OK.
9. Select the schemas. Because the VMart design is a multi-schema database, you
can select all three schemas (online_sales, public, and store) for your design. Click
OK.
If you include a schema that contains tables without data, the Administration Tools
notifies you that designing for tables without data could be suboptimal. You can
choose to continue, but Hewlett Packard Enterprise recommends that you deselect
the schemas that contain empty tables before you proceed.
10. In the Design Options window, accept all three options and click OK.
The three options are:
n Optimize with queries: Supplying the Database Designer with queries is
especially important if you want to optimize the database design for query
performance. Hewlett Packard Enterprise recommends that you limit the design
input to 100 queries.
n Update statistics: Accurate statistics help the Database Designer choose the
best strategy for data compression. If you select this option, the database statistics
are updated to maximize design quality.
n Deploy design: The new design deploys automatically. During deployment, new
projections are added, some existing projections are retained, and any
unnecessary existing projections are removed. Any new projections are refreshed
to populate them with data.
11. Because you selected the Optimize with queries option, you must enter the full
path to the file containing the queries that will be run on your database. In this
example, it is:
/opt/vertica/examples/VMart_Schema/vmart_queries.sql
The queries in the query file must be delimited with a semicolon (;).
12. Choose the K-safety value you want and click OK. The design K-safety value
determines the number of buddy projections you want Database Designer to create.
If you create a comprehensive design on a single node, you are not prompted to
enter a K-safety value.
13. In the Optimization Objective window, select Balanced query/load performance
to create a design that is balanced between database size and query performance.
Click OK.
14. When the informational message displays, click Proceed.
Database Designer automatically performs these actions:
n Sets up the design session.
n Examines table data.
n Loads queries from the query file you provided (in this example,
/opt/vertica/examples/VMart_Schema/vmart_queries.sql).
n Creates the design.
n Deploys the design or saves a SQL file containing the commands to create the
design, based on your selections in the Design Options window.
Depending on system resources, the design process could take several minutes.
You should allow this process to complete uninterrupted. If you must cancel the
session, use Ctrl+C.
15. When Database Designer finishes, press Enter to return to the Administration Tools
menu. Examine the steps taken to create the design. The files are in the directory
you specified to store the output and log files. In this example, that directory is
/opt/vertica/examples/VMart_Schema. For more information about the script
files, see About Database Designer, in the Administrator's Guide.
For additional information about managing your designs, see Creating a Database
Design in the Administrator’s Guide.
Restoring the Status of Your Host
When you finish the tutorial, you can restore your host machines to their original state.
Use the following instructions to clean up your host and start over from scratch.
Stopping and Dropping the Database
Follow these steps to stop and/or drop your database. A database must be stopped
before it can be dropped.
1. If connected to the database, disconnect by typing \q.
2. In the Administration Tools Main Menu dialog box, click Stop Database and click
OK.
3. In the Select database to stop window, select the database you want to stop and
click OK.
4. After stopping the database, click Configuration Menu and click OK.
5. Click Drop Database and click OK.
6. In the Select database to drop window, select the database you want to drop and
click OK.
7. Click Yes to confirm.
8. In the next window type yes (lowercase) to confirm and click OK.
Alternatively, use the delete_example script, which stops and drops the database:
1. If connected to the database, disconnect by typing \q.
2. In the Administration Tools Main Menu dialog box, select Exit.
3. Log in as the database administrator.
4. Change to the /examples directory.
$ cd /opt/vertica/examples
5. Run the delete_example script.
$ /opt/vertica/sbin/delete_example VMart
Uninstalling Vertica
Perform the steps in Uninstalling Vertica in Installing Vertica.
Optional Steps
You can also choose to:
l Remove the dbadmin account on all cluster hosts.
l Remove any example database directories you created.
Changing the GUI Appearance
The appearance of the Graphical User Interface (GUI) depends on the color and font
settings used by your terminal window. The screen captures in this document were
made using the default color and font settings in a PuTTY terminal application running
on a Windows platform.
Note: If you are using a remote terminal application, such as PuTTY or a Cygwin
bash shell, make sure your window is at least 81 characters wide and 23 characters
high.
If you are using PuTTY, take these steps to make the Administration Tools look like the
screen captures in this document.
1. In a PuTTY window, right-click the title area and select Change Settings.
2. Create or load a saved session.
3. In the Category dialog, click Window > Appearance.
4. In the Font settings, click the Change… button.
5. Select Font: Courier New, Regular Size: 10.
6. Click Apply.
Repeat these steps for each existing session that you use to run the Administration
Tools.
You can also change the translation to support UTF-8.
1. In a PuTTY window, right-click the title area and select Change Settings.
2. Create or load a saved session.
3. In the Category dialog, click Window > Translation.
4. In the Received data assumed to be in which character set drop-down menu,
select UTF-8.
5. Click Apply.
Appendix: VMart Example Database Schema,
Tables, and Scripts
The Appendix provides detailed information about the VMart example database’s
schema, tables, and scripts.
The VMart example database contains three different schemas:
l public
l store
l online_sales
The term “schema” has several related meanings in Vertica:
l In SQL statements, a schema refers to a named namespace for a logical schema.
l Logical schema refers to a set of tables and constraints.
l Physical schema refers to a set of projections.
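For example, you can list the projections that make up the physical schema by querying
the v_catalog.projections system table; a minimal sketch:
=> SELECT projection_name, anchor_table_name FROM projections LIMIT 5;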
Each schema contains tables that are created and loaded during database installation.
See the schema maps for a list of tables and their contents:
l public Schema Map
l store Schema Map
l online_sales Schema Map
The VMart database installs with sample scripts that contain SQL commands that
represent queries that might be used in a real business. The sample scripts are
available in the Sample Scripts section of this Appendix. Once you’re comfortable
running the example queries, you might want to write your own.
Tables
The three schemas in the VMart database include the following tables:
public Schema: inventory_fact, customer_dimension, date_dimension,
employee_dimension, product_dimension, promotion_dimension, shipping_dimension,
vendor_dimension, warehouse_dimension
store Schema: store_orders_fact, store_sales_fact, store_dimension
online_sales Schema: online_sales_fact, call_center_dimension,
online_page_dimension
public Schema Map
The public schema is a snowflake schema. The following graphic illustrates the
public schema and its relationships with tables in the online_sales and store
schemas.
inventory_fact
This table contains information about each product in inventory.
Column Name      Data Type  NULLs
date_key         INTEGER    No
product_key      INTEGER    No
product_version  INTEGER    No
warehouse_key    INTEGER    No
qty_in_stock     INTEGER    No
customer_dimension
This table contains information about all the retail chain’s customers.
Column Name            Data Type     NULLs
customer_key           INTEGER       No
customer_type          VARCHAR(16)   Yes
customer_name          VARCHAR(256)  Yes
customer_gender        VARCHAR(8)    Yes
title                  VARCHAR(8)    Yes
household_id           INTEGER       Yes
customer_address       VARCHAR(256)  Yes
customer_city          VARCHAR(64)   Yes
customer_state         CHAR(2)       Yes
customer_region        VARCHAR(64)   Yes
marital_status         VARCHAR(32)   Yes
customer_age           INTEGER       Yes
number_of_children     INTEGER       Yes
annual_income          INTEGER       Yes
occupation             VARCHAR(64)   Yes
largest_bill_amount    INTEGER       Yes
store_membership_card  INTEGER       Yes
customer_since         DATE          Yes
deal_stage             VARCHAR(32)   Yes
deal_size              INTEGER       Yes
last_deal_update       DATE          Yes
date_dimension
This table contains information about dates. It is generated from a file containing correct
date/time data.
Column Name                    Data Type    NULLs
date_key                       INTEGER      No
date                           DATE         Yes
full_date_description          VARCHAR(18)  Yes
day_of_week                    VARCHAR(9)   Yes
day_number_in_calendar_month   INTEGER      Yes
day_number_in_calendar_year    INTEGER      Yes
day_number_in_fiscal_month     INTEGER      Yes
day_number_in_fiscal_year      INTEGER      Yes
last_day_in_week_indicator     INTEGER      Yes
last_day_in_month_indicator    INTEGER      Yes
calendar_week_number_in_year   INTEGER      Yes
calendar_month_name            VARCHAR(9)   Yes
calendar_month_number_in_year  INTEGER      Yes
calendar_year_month            CHAR(7)      Yes
calendar_quarter               INTEGER      Yes
calendar_year_quarter          CHAR(7)      Yes
calendar_half_year             INTEGER      Yes
calendar_year                  INTEGER      Yes
holiday_indicator              VARCHAR(10)  Yes
weekday_indicator              CHAR(7)      Yes
selling_season                 VARCHAR(32)  Yes
employee_dimension
This table contains information about all the people who work for the retail chain.
Column Name              Data Type     NULLs
employee_key             INTEGER       No
employee_gender          VARCHAR(8)    Yes
courtesy_title           VARCHAR(8)    Yes
employee_first_name      VARCHAR(64)   Yes
employee_middle_initial  VARCHAR(8)    Yes
employee_last_name       VARCHAR(64)   Yes
employee_age             INTEGER       Yes
hire_date                DATE          Yes
employee_street_address  VARCHAR(256)  Yes
employee_city            VARCHAR(64)   Yes
employee_state           CHAR(2)       Yes
employee_region          CHAR(32)      Yes
job_title                VARCHAR(64)   Yes
reports_to               INTEGER       Yes
salaried_flag            INTEGER       Yes
annual_salary            INTEGER       Yes
hourly_rate              FLOAT         Yes
vacation_days            INTEGER       Yes
product_dimension
This table describes all products sold by the department store chain.
Column Name               Data Type     NULLs
product_key               INTEGER       No
product_version           INTEGER       No
product_description       VARCHAR(128)  Yes
sku_number                CHAR(32)      Yes
category_description      CHAR(32)      Yes
department_description    CHAR(32)      Yes
package_type_description  CHAR(32)      Yes
package_size              CHAR(32)      Yes
fat_content               INTEGER       Yes
diet_type                 CHAR(32)      Yes
weight                    INTEGER       Yes
weight_units_of_measure   CHAR(32)      Yes
shelf_width               INTEGER       Yes
shelf_height              INTEGER       Yes
shelf_depth               INTEGER       Yes
product_price             INTEGER       Yes
product_cost              INTEGER       Yes
lowest_competitor_price   INTEGER       Yes
highest_competitor_price  INTEGER       Yes
average_competitor_price  INTEGER       Yes
discontinued_flag         INTEGER       Yes
promotion_dimension
This table describes every promotion ever done by the retail chain.
Column Name           Data Type     NULLs
promotion_key         INTEGER       No
promotion_name        VARCHAR(128)  Yes
price_reduction_type  VARCHAR(32)   Yes
promotion_media_type  VARCHAR(32)   Yes
ad_type               VARCHAR(32)   Yes
display_type          VARCHAR(32)   Yes
coupon_type           VARCHAR(32)   Yes
ad_media_name         VARCHAR(32)   Yes
display_provider      VARCHAR(128)  Yes
promotion_cost        INTEGER       Yes
promotion_begin_date  DATE          Yes
promotion_end_date    DATE          Yes
shipping_dimension
This table contains information about shipping companies that the retail chain uses.
Column Name   Data Type  NULLs
shipping_key  INTEGER    No
ship_type     CHAR(30)   Yes
ship_mode     CHAR(10)   Yes
ship_carrier  CHAR(20)   Yes
vendor_dimension
This table contains information about each vendor that provides products sold through
the retail chain.
Column Name       Data Type    NULLs
vendor_key        INTEGER      No
vendor_name       VARCHAR(64)  Yes
vendor_address    VARCHAR(64)  Yes
vendor_city       VARCHAR(64)  Yes
vendor_state      CHAR(2)      Yes
vendor_region     VARCHAR(32)  Yes
deal_size         INTEGER      Yes
last_deal_update  DATE         Yes
warehouse_dimension
This table provides information about each of the chain’s warehouses.
Column Name        Data Type     NULLs
warehouse_key      INTEGER       No
warehouse_name     VARCHAR(20)   Yes
warehouse_address  VARCHAR(256)  Yes
warehouse_city     VARCHAR(60)   Yes
warehouse_state    CHAR(2)       Yes
warehouse_region   VARCHAR(32)   Yes
store Schema Map
The store schema is a snowflake schema that contains information about the retail
chain's brick-and-mortar stores. The following graphic illustrates the store schema
and its relationship with tables in the public schema.
store_orders_fact
This table contains information about all orders made at the company’s brick-and-mortar
stores.
Column Name             Data Type    NULLs
product_key             INTEGER      No
product_version         INTEGER      No
store_key               INTEGER      No
vendor_key              INTEGER      No
employee_key            INTEGER      No
order_number            INTEGER      No
date_ordered            DATE         Yes
date_shipped            DATE         Yes
expected_delivery_date  DATE         Yes
date_delivered          DATE         Yes
quantity_ordered        INTEGER      Yes
quantity_delivered      INTEGER      Yes
shipper_name            VARCHAR(32)  Yes
unit_price              INTEGER      Yes
shipping_cost           INTEGER      Yes
total_order_cost        INTEGER      Yes
quantity_in_stock       INTEGER      Yes
reorder_level           INTEGER      Yes
overstock_ceiling       INTEGER      Yes
store_sales_fact
This table contains information about all sales made at the company’s brick-and-mortar
stores.
Column Name                 Data Type    NULLs
date_key                    INTEGER      No
product_key                 INTEGER      No
product_version             INTEGER      No
store_key                   INTEGER      No
promotion_key               INTEGER      No
customer_key                INTEGER      No
employee_key                INTEGER      No
pos_transaction_number      INTEGER      No
sales_quantity              INTEGER      Yes
sales_dollar_amount         INTEGER      Yes
cost_dollar_amount          INTEGER      Yes
gross_profit_dollar_amount  INTEGER      Yes
transaction_type            VARCHAR(16)  Yes
transaction_time            TIME         Yes
tender_type                 VARCHAR(8)   Yes
store_dimension
This table contains information about each brick-and-mortar store within the retail chain.
Column Name             Data Type     NULLs
store_key               INTEGER       No
store_name              VARCHAR(64)   Yes
store_number            INTEGER       Yes
store_address           VARCHAR(256)  Yes
store_city              VARCHAR(64)   Yes
store_state             CHAR(2)       Yes
store_region            VARCHAR(64)   Yes
floor_plan_type         VARCHAR(32)   Yes
photo_processing_type   VARCHAR(32)   Yes
financial_service_type  VARCHAR(32)   Yes
selling_square_footage  INTEGER       Yes
total_square_footage    INTEGER       Yes
first_open_date         DATE          Yes
last_remodel_date       DATE          Yes
number_of_employees     INTEGER       Yes
annual_shrinkage        INTEGER       Yes
foot_traffic            INTEGER       Yes
monthly_rent_cost       INTEGER       Yes
online_sales Schema Map
The online_sales schema is a snowflake schema that contains information about the
retail chain's online sales. The following graphic illustrates the online_sales schema
and its relationship with tables in the public schema.
online_sales_fact
This table describes all the items purchased through the online store front.
Column Name                 Data Type    NULLs
sale_date_key               INTEGER      No
ship_date_key               INTEGER      No
product_key                 INTEGER      No
product_version             INTEGER      No
customer_key                INTEGER      No
call_center_key             INTEGER      No
online_page_key             INTEGER      No
shipping_key                INTEGER      No
warehouse_key               INTEGER      No
promotion_key               INTEGER      No
pos_transaction_number      INTEGER      No
sales_quantity              INTEGER      Yes
sales_dollar_amount         FLOAT        Yes
ship_dollar_amount          FLOAT        Yes
net_dollar_amount           FLOAT        Yes
cost_dollar_amount          FLOAT        Yes
gross_profit_dollar_amount  FLOAT        Yes
transaction_type            VARCHAR(16)  Yes
call_center_dimension
This table describes all the chain’s call centers.
Column Name      Data Type     NULLs
call_center_key  INTEGER       No
cc_closed_date   DATE          Yes
cc_open_date     DATE          Yes
cc_name          VARCHAR(50)   Yes
cc_class         VARCHAR(50)   Yes
cc_employees     INTEGER       Yes
cc_hours         CHAR(20)      Yes
cc_manager       VARCHAR(40)   Yes
cc_address       VARCHAR(256)  Yes
cc_city          VARCHAR(64)   Yes
cc_state         CHAR(2)       Yes
cc_region        VARCHAR(64)   Yes
online_page_dimension
This table describes all the pages in the online store front.
Column Name       Data Type     NULLs
online_page_key   INTEGER       No
start_date        DATE          Yes
end_date          DATE          Yes
page_number       INTEGER       Yes
page_description  VARCHAR(100)  Yes
page_type         VARCHAR(100)  Yes
Sample Scripts
You can create your own queries, but the VMart example directory includes sample
query script files to help you get started quickly.
You can find the following sample scripts in /opt/vertica/examples/VMart_Schema.
To run any of the scripts, enter:
=> \i <script_name>
Alternatively, type the commands from the script file manually.
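For example, to run the first sample script (assuming you started vsql from the
VMart_Schema directory):
=> \i vmart_query_01.sql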
Note: The data that your queries return might differ from the example output shown
in this guide because the sample data generator is random.
vmart_query_01.sql
-- vmart_query_01.sql
-- FROM clause subquery
-- Return the values for five products with the
-- lowest fat content in the Dairy department
SELECT fat_content
FROM (
SELECT DISTINCT fat_content
FROM product_dimension
WHERE department_description
IN ('Dairy') ) AS food
ORDER BY fat_content
LIMIT 5;
Output
fat_content
-------------
80
81
82
83
84
(5 rows)
vmart_query_02.sql
-- vmart_query_02.sql
-- WHERE clause subquery
-- Asks for all orders placed by stores located in Massachusetts
-- and by vendors located elsewhere before March 1, 2003:
SELECT order_number, date_ordered
FROM store.store_orders_fact orders
WHERE orders.store_key IN (
SELECT store_key
FROM store.store_dimension
WHERE store_state = 'MA')
AND orders.vendor_key NOT IN (
SELECT vendor_key
FROM public.vendor_dimension
WHERE vendor_state = 'MA')
AND date_ordered < '2003-03-01';
Output
order_number | date_ordered
-------------+--------------
53019 | 2003-02-10
222168 | 2003-02-05
160801 | 2003-01-08
106922 | 2003-02-07
246465 | 2003-02-10
234218 | 2003-02-03
263119 | 2003-01-04
73015 | 2003-01-01
233618 | 2003-02-10
85784 | 2003-02-07
146607 | 2003-02-07
296193 | 2003-02-05
55052 | 2003-01-05
144574 | 2003-01-05
117412 | 2003-02-08
276288 | 2003-02-08
185103 | 2003-01-03
282274 | 2003-01-01
245300 | 2003-02-06
143526 | 2003-01-04
59564 | 2003-02-06
...
vmart_query_03.sql
-- vmart_query_03.sql
-- noncorrelated subquery
-- Requests female and male customers with the maximum
-- annual income from customers
SELECT customer_name, annual_income
FROM public.customer_dimension
WHERE (customer_gender, annual_income) IN (
SELECT customer_gender, MAX(annual_income)
FROM public.customer_dimension
GROUP BY customer_gender);
Output
customer_name | annual_income
------------------+---------------
James M. McNulty | 999979
Emily G. Vogel | 999998
(2 rows)
vmart_query_04.sql
-- vmart_query_04.sql
-- IN predicate
-- Find all products supplied by stores in MA
SELECT DISTINCT s.product_key, p.product_description
FROM store.store_sales_fact s, public.product_dimension p
WHERE s.product_key = p.product_key
AND s.product_version = p.product_version AND s.store_key IN (
SELECT store_key
FROM store.store_dimension
WHERE store_state = 'MA')
ORDER BY s.product_key;
Output
product_key | product_description
-------------+----------------------------------------
1 | Brand #1 butter
1 | Brand #2 bagels
2 | Brand #3 lamb
2 | Brand #4 brandy
2 | Brand #5 golf clubs
2 | Brand #6 chicken noodle soup
3 | Brand #10 ground beef
3 | Brand #11 vanilla ice cream
3 | Brand #7 canned chicken broth
3 | Brand #8 halibut
3 | Brand #9 camera case
4 | Brand #12 rash ointment
4 | Brand #13 low fat milk
4 | Brand #14 chocolate chip cookies
4 | Brand #15 silver polishing cream
5 | Brand #16 cod
5 | Brand #17 band aids
6 | Brand #18 bananas
6 | Brand #19 starch
6 | Brand #20 vegetable soup
6 | Brand #21 bourbon
...
vmart_query_05.sql
-- vmart_query_05.sql
-- EXISTS predicate
-- Get a list of all the orders placed by all stores on
-- January 2, 2003 for the vendors with records in the
-- vendor_dimension table
SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact
WHERE EXISTS (
SELECT 1
FROM public.vendor_dimension
WHERE public.vendor_dimension.vendor_key = store.store_orders_fact.vendor_key)
AND date_ordered = '2003-01-02';
Output
store_key | order_number | date_ordered
-----------+--------------+--------------
98 | 151837 | 2003-01-02
123 | 238372 | 2003-01-02
242 | 263973 | 2003-01-02
150 | 226047 | 2003-01-02
247 | 232273 | 2003-01-02
203 | 171649 | 2003-01-02
129 | 98723 | 2003-01-02
80 | 265660 | 2003-01-02
231 | 271085 | 2003-01-02
149 | 12169 | 2003-01-02
141 | 201153 | 2003-01-02
1 | 23715 | 2003-01-02
156 | 98182 | 2003-01-02
44 | 229465 | 2003-01-02
178 | 141869 | 2003-01-02
134 | 44410 | 2003-01-02
141 | 129839 | 2003-01-02
205 | 54138 | 2003-01-02
113 | 63358 | 2003-01-02
99 | 50142 | 2003-01-02
44 | 131255 | 2003-01-02
...
vmart_query_06.sql
-- vmart_query_06.sql
-- IN predicate with an aggregate subquery
-- Orders placed by the vendor who got the best deal
-- on January 4, 2004
SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact ord, public.vendor_dimension vd
WHERE ord.vendor_key = vd.vendor_key
AND vd.deal_size IN (
SELECT MAX(deal_size)
FROM public.vendor_dimension)
AND date_ordered = '2004-01-04';
Output
store_key | order_number | date_ordered
-----------+--------------+--------------
45 | 202416 | 2004-01-04
24 | 250295 | 2004-01-04
121 | 251417 | 2004-01-04
198 | 75716 | 2004-01-04
166 | 36008 | 2004-01-04
27 | 150241 | 2004-01-04
148 | 182207 | 2004-01-04
9 | 188567 | 2004-01-04
113 | 66017 | 2004-01-04
...
vmart_query_07.sql
-- vmart_query_07.sql
-- Multicolumn subquery
-- Which products have the highest cost,
-- grouped by category and department
SELECT product_description, sku_number, department_description
FROM public.product_dimension
WHERE (category_description, department_description, product_cost) IN (
SELECT category_description, department_description,
MAX(product_cost) FROM product_dimension
GROUP BY category_description, department_description);
Output
product_description | sku_number | department_description
---------------------------+-----------------------+---------------------------------
Brand #601 steak | SKU-#601 | Meat
Brand #649 brooms | SKU-#649 | Cleaning supplies
Brand #677 veal | SKU-#677 | Meat
Brand #1371 memory card | SKU-#1371 | Photography
Brand #1761 catfish | SKU-#1761 | Seafood
Brand #1810 frozen pizza | SKU-#1810 | Frozen Goods
Brand #1979 canned peaches | SKU-#1979 | Canned Goods
Brand #2097 apples | SKU-#2097 | Produce
Brand #2287 lens cap | SKU-#2287 | Photography
...
vmart_query_08.sql
-- vmart_query_08.sql
-- Using pre-join projections to answer subqueries
-- between online_sales_fact and online_page_dimension
SELECT page_description, page_type, start_date, end_date
FROM online_sales.online_sales_fact f, online_sales.online_page_dimension d
WHERE f.online_page_key = d.online_page_key
AND page_number IN
(SELECT MAX(page_number)
FROM online_sales.online_page_dimension)
AND page_type = 'monthly' AND start_date = '2003-06-02';
Output
page_description | page_type | start_date | end_date
---------------------------+-----------+------------+-----------
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11
(12 rows)
vmart_query_09.sql
-- vmart_query_09.sql
-- Equi join
-- Joins online_sales_fact table and the call_center_dimension
-- table with the ON clause
SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
FROM online_sales.online_sales_fact
INNER JOIN online_sales.call_center_dimension
ON (online_sales.online_sales_fact.call_center_key
= online_sales.call_center_dimension.call_center_key
AND sale_date_key = 156)
ORDER BY sales_dollar_amount DESC;
Output
sales_quantity | sales_dollar_amount | transaction_type | cc_name
----------------+---------------------+------------------+-------------------
7 | 589 | purchase | Central Midwest
8 | 589 | purchase | South Midwest
8 | 589 | purchase | California
1 | 587 | purchase | New England
1 | 586 | purchase | Other
1 | 584 | purchase | New England
4 | 584 | purchase | New England
7 | 581 | purchase | Mid Atlantic
5 | 579 | purchase | North Midwest
8 | 577 | purchase | North Midwest
4 | 577 | purchase | Central Midwest
2 | 575 | purchase | Hawaii/Alaska
4 | 573 | purchase | NY Metro
4 | 572 | purchase | Central Midwest
1 | 570 | purchase | Mid Atlantic
9 | 569 | purchase | Southeastern
1 | 569 | purchase | NY Metro
5 | 567 | purchase | Other
7 | 567 | purchase | Hawaii/Alaska
9 | 567 | purchase | South Midwest
1 | 566 | purchase | New England
...
Vertica Concepts
Vertica Platform
The Vertica Platform provides an architecture designed for efficient processing of
SQL queries and fast return of data. Vertica accomplishes this using the following
components:
l Columnar data storage
l Advanced compression
l High Availability
l Automatic Database Design
l Massively Parallel Processing
l Application Integration
Vertica Cluster Architecture
In Vertica, the physical architecture is designed to distribute physical storage and to
allow parallel query execution over a potentially large collection of computing resources.
Hybrid Data Store
Vertica stores data in two containers:
l Write Optimized Store (WOS) - stores data in memory without compression or
indexing. You can use INSERT, UPDATE, and COPY to load data into the WOS.
l Read Optimized Store (ROS) - stores data on disk; the data is segmented, sorted, and
compressed for high optimization. You can load data directly into the ROS using the
COPY statement.
The Tuple Mover moves data from the WOS (memory) to the ROS (disk) using the
following operations:
l moveout - moves data from the WOS to the ROS; data is sorted, encoded, and
compressed into column files.
l mergeout - combines smaller ROS containers into larger ones to reduce
fragmentation.
You can use COPY to load data into the WOS and let the Tuple Mover move it to the
ROS, or you can load data directly into the ROS, as shown in the sketch below.
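The following sketch shows both load paths; the data file path is hypothetical:
-- Load into the WOS; the Tuple Mover later moves the data to the ROS.
=> COPY store.store_orders_fact FROM '/tmp/store_orders.dat' DELIMITER '|';
-- Load directly into the ROS, bypassing the WOS.
=> COPY store.store_orders_fact FROM '/tmp/store_orders.dat' DELIMITER '|' DIRECT;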
Column Storage
Vertica stores data in a column format so that queries read only the columns they need.
Compared to row-based storage, column storage reduces disk I/O, making it ideal for
read-intensive workloads. For example:
=> SELECT avg(price) FROM tickstore WHERE symbol = 'AAPL' AND date = '5/31/13';
For this example query, a column store reads only three columns, while a row store
reads all columns.
Data Encoding and Compression
Vertica uses encoding and compression to optimize query performance and save
storage space.
Encoding converts data into a standard format. Vertica uses a number of different
encoding strategies, depending on column data type, table cardinality, and sort order.
Encoding increases performance because there is less disk I/O during query execution.
In addition, you can store more data in less space.
Compression transforms data into a compact format. Vertica uses several different
compression methods and automatically chooses the best one for the data being
compressed. Using compression, Vertica stores more data, provides more views, and
uses less hardware than other databases. Using compression lets you keep much more
historical data in physical storage.
For more information, see Data Encoding and Compression.
Clustering
Clustering supports scaling and redundancy. You can scale your database cluster by
adding more hardware, and you can improve reliability by distributing and replicating
data across your cluster.
Column data is distributed across nodes in a cluster, so if one node becomes
unavailable, the database continues to operate. When a node is added to the cluster, or
comes back online after being unavailable, it automatically queries other nodes to
update its local data.
Projections
A projection consists of a set of columns with the same sort order, defined by one or
more columns by which to sort. Like an index or materialized view in a traditional
database, a projection accelerates query processing. When you write queries in terms
of the original tables, the query uses the projections to return query results.
Projections are distributed and replicated across nodes in your cluster, ensuring that if
one node becomes unavailable, another copy of the data remains available. For more
information, see K-Safety.
Automatic data replication, failover, and recovery provide for active redundancy, which
increases performance. Nodes recover automatically by querying the system.
Continuous Performance
Vertica queries and loads data continuously 24x7.
Concurrent loading and querying provides real-time views and eliminates the need for
nightly load windows. On-the-fly schema changes allow you to add columns and
projections without shutting down your database; Vertica manages updates while
keeping the database available.
Terminology
It is helpful to understand the following terms when using Vertica:
Host
A computer system with a 32-bit (non-production use only) or 64-bit Intel or AMD
processor, RAM, hard disk, and TCP/IP network interface (IP address and hostname).
Hosts share neither disk space nor main memory with each other.
Instance
An instance of Vertica consists of the running Vertica process and disk storage (catalog
and data) on a host. Only one instance of Vertica can be running on a host at any time.
Node
A host configured to run an instance of Vertica. It is a member of the database cluster.
For a database to recover from the failure of a node, it must have a K-safety value of at
least 1, which requires a cluster of three or more nodes.
Cluster
Refers to the collection of hosts (nodes) bound to a database. A cluster is not part of a
database definition and does not have a name.
Database
A cluster of nodes that, when active, can perform distributed data storage and SQL
statement execution through administrative, interactive, and programmatic user
interfaces.
Note: Although you can define more than one database on a cluster, Vertica
supports running only one database per cluster at a time.
Data Encoding and Compression
Vertica uses encoding and compression to optimize query performance and save
storage space.
Encoding
Encoding converts data into a standard format and increases performance because
there is less disk I/O during query execution. It also passes encoded values to other
operations, saving memory bandwidth. Vertica uses several encoding strategies,
depending on data type, table cardinality, and sort order. Vertica can directly process
encoded data.
Run the Database Designer for optimal encoding in your physical schema. The
Database Designer analyzes the data in each column and recommends encoding types
for each column in the proposed projections, depending on your design optimization
objective. For flex tables, Database Designer recommends the best encoding types for
any materialized flex table columns, but not for __raw__ column projections.
Compression
Compression transforms data into a compact format. Vertica uses integer packing for
unencoded integers and LZO for compressed data. Before Vertica can process
compressed data, it must first decompress it.
Compression allows a column store to occupy substantially less storage than a row
store. In a column store, every value stored in a column of a projection has the same
data type. This greatly facilitates compression, particularly in sorted columns. In a row
store, each value of a row can have a different data type, resulting in a much less
effective use of compression. Vertica compresses flex table __raw__ column data by
about one half (1/2).
The efficient storage methods that Vertica uses for your database allow you to
maintain more historical data in physical storage.
High Availability
Vertica provides high availability of the database using RAID-like functionality. The
following mechanisms ensure little to no downtime:
l Multiple copies of the same data are stored on different nodes.
l Vertica continues to load and query data if a node is down.
l Vertica automatically recovers missing data by querying other nodes.
K-Safety
K-safety sets the fault tolerance of the database cluster. The value K represents the
number of replicas of the data maintained in the database cluster. These replicas allow
other nodes to take over query processing for any failed nodes.
In Vertica, the value of K can be zero (0), one (1), or two (2). If a database with a K-safety
of one (K=1) loses a node, the database continues to run normally. Potentially, the
database could continue running if additional nodes fail, as long as at least one other
node in the cluster has a copy of the failed node's data. Increasing K-safety to 2 ensures
that Vertica can run normally if any two nodes fail. When the failed node or nodes return
and successfully recover, they can participate in database operations again.
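For example, after designing a K-safe physical schema, you can set the K-safety level
with the MARK_DESIGN_KSAFE function; a sketch follows (the function returns an error
if the current design does not meet the requirements for the requested level):
=> SELECT MARK_DESIGN_KSAFE(1);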
Note: If the number of failed nodes exceeds the K value, some of the data may become
unavailable. In this case, the database is considered unsafe and automatically shuts
down. However, if every data segment is available on at least one functioning cluster
node, Vertica continues to run safely.
Potentially, up to half the nodes in a database with a K-safety of 1 could fail without
causing the database to shut down. As long as the data on each failed node is available
from another active node, the database continues to run.
Note: If half or more of the nodes in the database cluster fail, the database
automatically shuts down, even if all of the data in the database is available from
replicas. This behavior prevents issues due to network partitioning.
Note: The physical schema design must meet certain requirements. To create
designs that are K-safe, Vertica recommends using the Database Designer.
Buddy Projections
To ensure K-safety, Vertica creates buddy projections, which are copies of
segmented projections distributed across database nodes. (See Projection
Segmentation.) Vertica distributes segments that contain the same data to different
nodes. This ensures that if a node goes down, all the data is available on the remaining
nodes.
K-Safety Example
The diagram above shows a 5-node cluster with a K-safety level of 1. Each node
contains buddy projections for the data stored in the next higher node (Node 1 has buddy
projections for Node 2, Node 2 has buddy projections for Node 3, and so on). If any node
fails, the database continues to run, though with lower performance, because one of the
remaining nodes must handle its own workload plus the workload of the failed node.
The diagram below shows a failure of Node 2. In this case, Node 1 handles processing
for Node 2, because it contains a replica of Node 2's data, while also continuing to
perform its own processing. The fault tolerance of the database falls from 1 to 0,
because a single additional node failure could cause the database to become unsafe. In
this example, if either Node 1 or Node 3 fails, the database becomes unsafe because not
all of its data would be available. If Node 1 fails, Node 2's data is no longer available. If
Node 3 fails, its data is no longer available either, because Node 2 is down and cannot
supply the buddy projection. In this case, Nodes 1 and 3 are considered critical nodes.
In a database with a K-safety level of 1, the node that contains the buddy projection of a
failed node, and the node whose buddy projections are on the failed node, always
become critical nodes.
With Node 2 down, either Node 4 or Node 5 could fail and the database would still have
all of its data available. The diagram below shows that if Node 4 fails, Node 3 can use
its buddy projections to fill in for it. In this case, any further loss of nodes results in a
database shutdown, because all the nodes in the cluster are now critical nodes. In
addition, if one more node were to fail, half or more of the nodes would be down,
requiring Vertica to shut down automatically, regardless of whether all of the data
remained available.
In a database with a K-safety level of 2, Node 2 and any other node in the cluster could
fail and the database continues running. The diagram below shows that each node in
the cluster contains buddy projections for both of its neighbors (for example, Node 1
contains buddy projections for Node 5 and Node 2). In this case, Nodes 2 and 3 could
fail and the database would continue running: Node 1 could fill in for Node 2, and Node 4
could fill in for Node 3. However, because half or more of the nodes in the cluster must
be available for the database to continue running, the cluster could not continue
running if Node 5 also failed, even though Nodes 1 and 4 both have buddy projections
for its data.
Monitoring K-Safety
You can access System Tables to monitor and log various aspects of Vertica operation.
Use the SYSTEM table to monitor information related to K-safety, such as:
l NODE_COUNT - the number of nodes in the cluster
l NODE_DOWN_COUNT - the number of nodes in the cluster that are currently down
l CURRENT_FAULT_TOLERANCE - the K-safety level
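For example, the following query reads these columns from the SYSTEM table; the
values returned depend on your cluster:
=> SELECT node_count, node_down_count, current_fault_tolerance FROM system;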
High Availability With Projections
To ensure high availability and recovery for database clusters of three or more nodes,
Vertica:
l Replicates small, unsegmented projections
l Creates buddy projections for large, segmented projections.
Replication (Unsegmented Projections)
When it creates projections, Database Designer replicates them, creating and storing
duplicates of these projections on all nodes in the database.
Replication ensures:
l Distributed query execution across multiple nodes.
l High availability and recovery. In a K-safe database, replicated projections serve as
buddy projections. This means that you can use a replicated projection on any node
for recovery.
Note: We recommend you use Database Designer to create your physical schema.
If you choose not to, be sure to segment all large tables across all database nodes,
and replicate small, unsegmented table projections on all database nodes.
The following illustration shows two projections, B and C, replicated across a three node
cluster.
Buddy Projections (Segmented Projections)
Vertica creates buddy projections which are copies of segmented projections that are
distributed across database nodes (see Projection Segmentation.) Vertica distributes
segments that contain the same data to different nodes. This ensures that if a node goes
down, all the data is available on the remaining nodes. Vertica distributes segments to
different nodes by using offsets. For example, segments that comprise the first buddy
projection (A_BP1) are offset from projection A by one node, and segments from the
second buddy projection (A_BP2) are offset from projection A by two nodes.
The following diagram shows the segmentation for a projection called A and its buddy
projections, A_BP1 and A_BP2, for a three node cluster.
The following diagram shows how Vertica uses offsets to ensure that every node has a
full set of data for the projection.
How Result Sets Are Stored
Vertica duplicates table columns on all nodes in the cluster to ensure high availability
and recovery. Thus, if one node goes down in a K-Safe environment, the database
continues to operate using duplicate data on the remaining nodes. Once the failed node
resumes its normal operation, it automatically recovers its lost objects and data by
querying other nodes.
Vertica compresses and encodes data to greatly reduce the storage space. It also
operates on the encoded data whenever possible to avoid the cost of decoding. This
combination of compression and encoding optimizes disk space while maximizing
query performance.
Vertica stores table columns as projections. This enables you to optimize the stored
data for specific queries and query sets. Vertica provides two methods for storing data:
l Projection segmentation is recommended for large tables (fact and large dimension
tables).
l Replication is recommended for the remaining, smaller tables.
High Availability With Fault Groups
Use fault groups to reduce the risk of correlated failures inherent in your physical
environment. Correlated failures occur when two or more nodes fail as a result of a
single failure event, for example, the loss of a shared resource such as power,
networking, or storage.
Vertica minimizes the risk of correlated failures by letting you define fault groups on your
cluster. Vertica then uses the fault groups to distribute data segments across the cluster,
so the database keeps running if a single failure event occurs.
Note: If your cluster layout is managed by a single network switch, a switch failure
would cause a single point of failure. Fault groups cannot help with single-point
failures.
Vertica supports complex, hierarchical fault groups of different shapes and sizes. Fault
groups are integrated with elastic cluster and large cluster arrangements to provide
added cluster flexibility and reliability.
Automatic fault groups
When you configure a cluster of 120 nodes or more, Vertica automatically creates fault
groups around control nodes. Control nodes are a subset of cluster nodes that manage
spread (control messaging). Vertica places nodes that share a control node in the same
fault group. See Large Cluster in the Administrator's Guide for details.
User-defined fault groups
Define your own fault groups if your cluster layout has the potential for correlated
failures, or if you want to influence which cluster hosts manage control messaging.
Example cluster topology
The following diagram provides an example of hierarchical fault groups configured on a
single cluster:
l Fault group FG-A contains nodes only.
l Fault group FG-B (parent) contains child fault groups FG-C and FG-D. Each child fault
group also contains nodes.
l Fault group FG-E (parent) contains child fault groups FG-F and FG-G. The parent fault
group FG-E also contains nodes.
How to create fault groups
Before you define fault groups, you must have a thorough knowledge of your physical
cluster layout. Fault groups require careful planning.
To define fault groups, create an input file that describes your cluster arrangement.
Then pass the file to a Vertica-supplied script, which returns the SQL statements you
need to run. See Fault Groups in the Administrator's Guide for details.
Vertica Components
This section provides an overview of the components that make up Vertica:
l Logical Schema
l Physical Schema
l Database
l Management Console
l Administration Tools
These components allow you to tune and control your Vertica Analytics Platform with
minimal effort, eliminating much of the time a database administrator of a typical
database spends identifying and resolving issues.
Logical Schema
Design a logical schema for a Vertica database as you would for any SQL database. A
logical schema consists of objects such as:
l Schema
l Table
l View
l Referential Integrity
Vertica supports any relational schema design that you choose.
For more information, see Designing a Logical Schema in the Administrator's Guide.
Physical Schema
Unlike traditional databases where data is stored in tables, physical storage in Vertica
consists of collections of table columns called projections.
Projections store data in a format that optimizes query execution. They are similar to
materialized views in that they store result sets on disk rather than compute them each
time they are used in a query. Vertica automatically refreshes the result sets whenever
you update or load new data.
Projections provide the following benefits:
l Projections compress and encode data to reduce storage space. Additionally, Vertica
operates on the encoded data representation whenever possible to avoid the cost of
decoding. This combination of compression and encoding optimizes disk space
while maximizing query performance.
l Projections can be segmented or replicated across database nodes depending on
their size. For instance, projections for large tables can be segmented and distributed
across all nodes. Unsegmented projections for small tables can be replicated across
all nodes in the database.
l Projections are transparent to end-users of SQL. The Vertica query optimizer
automatically picks the best projections to use for a query.
l Projections provide high availability and recovery. To ensure high availability and
recovery, Vertica duplicates table columns on at least K+1 nodes in the cluster.
Therefore, if one machine fails in a K-Safe environment, the database continues to
operate using replicated data on the remaining nodes. When the node resumes its
normal operation, it automatically recovers its data and lost objects by querying other
nodes. See High Availability and Recovery for an overview of this feature and High
Availability With Projections for an explanation of how Vertica uses projections to
ensure high availability and recovery.
How Projections Are Created
For each table in the database, Vertica requires a minimum of one projection, called a
superprojection. A superprojection is a projection for a single table that contains all the
columns in the table.
To get your database operating quickly, Vertica automatically creates a superprojection
for each table in the database when you load or update data into that table. This
ensures that all SQL queries provide results.
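For example, this sketch (the table name t1 is hypothetical) creates and loads a table,
then lists its automatically created superprojection from the PROJECTIONS system table:
=> CREATE TABLE t1 (a INT, b VARCHAR(10));
=> INSERT INTO t1 VALUES (1, 'x');
=> SELECT projection_name FROM projections WHERE anchor_table_name = 't1';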
Default superprojections do not exploit the full power of Vertica. Therefore, Vertica
recommends loading a sample of your data and then running the Database Designer to
create optimized projections. Database Designer creates new projections that optimize
your database based on its data statistics and the queries you use. The Database
Designer:
1. Analyzes your logical schema, sample data, and sample queries (optional)
2. Creates a physical schema design (projections) in the form of a SQL script that can
be deployed automatically or manually.
In most cases, the designs created by the Database Designer provide excellent query
performance within physical constraints. The Database Designer uses sophisticated
strategies to provide excellent ad-hoc query performance while using disk space
efficiently. If you prefer, you can design custom projections.
For more information about creating projections, see Creating a Database Design in the
Administrator's Guide.
Projection Anatomy
The CREATE PROJECTION statement defines the individual elements of a projection,
as shown in the following SQL example:
=> CREATE PROJECTION retail_sales_fact_P1 (
store_sales_fact_store_key ENCODING RLE ,
store_sales_fact_pos_transaction_number ENCODING RLE ,
store_sales_fact_sales_dollar_amount ,
store_sales_fact_cost_dollar_amount
)
AS SELECT
T_store_sales_fact.store_key,
T_store_sales_fact.pos_transaction_number,
T_store_sales_fact.sales_dollar_amount,
T_store_sales_fact.cost_dollar_amount
FROM store_sales_fact T_store_sales_fact
ORDER BY T_store_sales_fact.store_key
SEGMENTED BY HASH (T_store_sales_fact.pos_transaction_number) ALL NODES OFFSET 1;
This SQL statement breaks down as follows:
Column List and Encoding
This portion of the SQL statement lists every column in the projection and defines the
encoding for each column. Vertica can operate on encoded data, so HPE recommends
using data encoding; it results in less disk I/O.
=> CREATE PROJECTION retail_sales_fact_P1 (
store_sales_fact_store_key ENCODING RLE ,
store_sales_fact_pos_transaction_number ENCODING RLE ,
store_sales_fact_sales_dollar_amount ,
store_sales_fact_cost_dollar_amount
)
Base Query
This portion of the SQL statement identifies the columns to incorporate in the projection
using column name and table name references. The base query for large table
projections can contain PK/FK joins to smaller tables. See Pre-Join Projections and
Join Predicates.
AS SELECT
T_store_sales_fact.store_key,
T_store_sales_fact.pos_transaction_number,
T_store_sales_fact.sales_dollar_amount,
T_store_sales_fact.cost_dollar_amount
FROM store_sales_fact T_store_sales_fact
Sort Order
This portion of the SQL statement determines the sort order. The sort order localizes
logically grouped values so that a disk read can identify many results at once. For
maximum performance, do not sort projections on LONG VARBINARY and LONG
VARCHAR columns. For more information, see ORDER BY Clause.
ORDER BY T_store_sales_fact.store_key
Segmentation
This portion of the SQL statement segments the projection across all nodes in the
database. This maximizes the database performance by distributing the load. For large
tables use SEGMENTED BY HASH to perform segmentation as in this example.
For small tables, use the UNSEGMENTED keyword to replicate these tables, rather
than segment them. Replication creates and stores identical copies of projections for
small tables across all nodes in the cluster. Replication ensures high availability and
recovery.
For maximum performance, do not segment projections on LONG VARBINARY and
LONG VARCHAR columns.
For more information see Projection Segmentation.
SEGMENTED BY HASH (T_store_sales_fact.pos_transaction_number) ALL NODES OFFSET 1;
Projection Segmentation
Projection segmentation splits individual projections into chunks of data of similar size,
called segments. One segment is created for and stored on each node. Projection
segmentation:
l Ensures high availability and recovery through K-Safety.
l Spreads the query execution workload across multiple nodes.
l Allows each node to be optimized for different query workloads.
Vertica segments large tables to spread the query execution workload across multiple
nodes. Vertica replicates small tables, creating a duplicate of each unsegmented
projection on each node.
Hash Segmentation
Vertica uses hash segmentation to segment large projections. Hash segmentation
allows you to segment a projection based on a built-in hash function that provides even
distribution of data across multiple nodes, resulting in optimal query execution. In a
projection, the data to be hashed consists of one or more column values, each having a
large number of unique values and an acceptable amount of skew in the value
distribution. Primary key columns that meet the criteria could be an excellent choice for
hash segmentation.
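For example, a sketch of hash segmentation on a primary key column (the table and
column names are hypothetical):
=> CREATE PROJECTION sales_p (sale_id, amount)
   AS SELECT sale_id, amount FROM sales
   SEGMENTED BY HASH(sale_id) ALL NODES;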
Database
This section covers the following database elements:
l Database Setup
l Database Connections
l Database Security
l Database Designer
l Data Loading
l Workload Management
l Database Locks
Database Setup
This page provides an overview of setting up a Vertica database. For complete details,
see Configuring the Database.
Prepare SQL Scripts and Data Files
Prepare the following files before installing Vertica:
l Logical schema script
l Loadable data files
l Load scripts
l Sample query script (training set)
Create the Database
Create the database after installing Vertica on at least one host:
l Use the Administration Tools to:
n Create a database
n Connect to the database
l Use the Database Designer to design the physical schema.
l Use the vsql interactive interface to run SQL scripts that:
n Create tables and constraints
n Create projections
Test the Empty Database
l Test for sufficient projections using the sample query script
l Test the projections for K-safety
Test the Partially-Loaded Database
l Load the dimension tables
l Partially load the fact table
l Check system resource usage
l Check query execution times
l Check projection usage
Complete the Fact Table Load
l Monitor system usage
l Complete the fact table load
Set up Security
For security-related tasks, see Implementing Security.
l [Optional] Set up SSL
l [Optional] Set up client authentication
l Set up database users and privileges
Set up Incremental Loads
Set up periodic (trickle) loads. See Trickle Loading.
Database Connections
You can connect to a Vertica database in the following ways:
l Interactively using the vsql client, as described in Using vsql in the Administrator's
Guide.
vsql is a character-based, interactive, front-end utility that lets you type SQL
statements and see the results. It also provides a number of meta-commands and
various shell-like features that facilitate writing scripts and automating a variety of
tasks.
You can run vsql on any node within a database. To start vsql, use the Administration
Tools or the shell command described in Using vsql.
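For example, a typical shell invocation of vsql (the host, database, and user names
are hypothetical):
$ /opt/vertica/bin/vsql -h vertica01 -d VMart -U dbadmin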
l Programmatically using the JDBC driver provided by Vertica, as described in
Programming JDBC Client Applications in Connecting to Vertica.
An abbreviation for Java Database Connectivity, JDBC is a call-level application
programming interface (API) that provides connectivity between Java programs and
data sources (SQL databases and other non-relational data sources, such as
spreadsheets or flat files). JDBC is included in the Java 2 Standard and Enterprise
editions.
l Programmatically using the ODBC driver provided by Vertica, as described in
Programming ODBC Client Applications in Connecting to Vertica.
An abbreviation for Open DataBase Connectivity, ODBC is a standard application
programming interface (API) for access to database management systems.
l Programmatically using the ADO.NET driver provided by Vertica, as described in
Programming ADO.NET Applications in Connecting to Vertica.
The Vertica driver for ADO.NET allows applications written in C# and Visual Studio
to read data from, update, and load data into Vertica databases. It provides a data
adapter that facilitates reading data from a database into a data set, and then writing
changed data from the data set back to the database. It also provides a data reader
(VerticaDataReader) for reading data and autocommit functionality for committing
transactions automatically.
l Programmatically using Perl and the DBI driver, as described in Programming Perl
Client Applications in Connecting to Vertica.
Perl is a free, stable, open source, cross-platform programming language licensed
under its Artistic License, or the GNU General Public License (GPL).
l Programmatically using Python and the Vertica Python Client or the pyodbc driver,
as described in Programming Python Client Applications in Connecting to Vertica.
Python is a free, agile, object-oriented, cross-platform programming language
designed to emphasize rapid development and code readability.
HPE recommends that you deploy Vertica as the only active process on each machine
in the cluster and connect to it from applications on different machines. Vertica expects
to use all available resources on the machine, and to the extent that other applications
are also using these resources, suboptimal performance could result.
Database Security
Vertica secures access to the database and its resources by enabling you to control
user access to the database and which tasks users are authorized to perform. See
Implementing Security.
Database Designer
Vertica's Database Designer is a tool that:
l Analyzes your logical schema, sample data, and, optionally, your sample queries.
l Creates a Physical Schema design that can be deployed automatically or manually.
l Can be used by anyone without specialized database knowledge. Even business
users can run Database Designer.
l Can be run and re-run any time for additional optimization without stopping the
database.
Run the DBD
Run the Database Designer in one of the following ways:
l With the Management Console, as described in Using Management Console to
Create a Design
l Programmatically, using the steps described in About Running Vertica
Programmatically.
l With the Administration Tools by selecting Configuration Menu > Run Database
Designer. For details, see Using the Administration Tools to Create a Design
Use the Database Designer to create one of the following types of designs:
l A comprehensive design that allows you to create new projections for all tables in
your database.
l An incremental design that creates projections for all tables referenced in the queries
you supply.
Database Designer benefits include:
l Accepting up to 100 queries in the query input file for an incremental design.
l Accepting unlimited queries for a comprehensive design.
l Producing higher quality designs by considering UPDATE and DELETE statements.
In most cases, the designs created by Database Designer provide optimal query
performance within physical constraints. Database Designer uses sophisticated
strategies to provide optimal query performance and data compression.
See Also
l Physical Schema
l Creating a Database Design
Data Loading
The SQL Data Manipulation Language (DML) commands INSERT, UPDATE, and
DELETE perform the same functions in Vertica as they do in row-oriented databases.
These commands follow the SQL-92 transaction model and can be intermixed.
Use the COPY statement for bulk loading data. COPY reads data from text files or data
pipes and inserts it into WOS (memory) or directly into the ROS (disk). COPY can load
compressed formats such as GZIP and LZO. COPY automatically commits itself and
any current transaction but is not atomic; some rows could be rejected. Note that COPY
does not automatically commit when copying data into temporary tables.
You can use the COPY statement's NO COMMIT option to prevent COPY from
committing a transaction when it finishes copying data. This allows you to ensure the
data in the bulk load is either committed or rolled back at the same time. Also,
combining multiple smaller data loads into a single transaction allows Vertica to load
the data more efficiently. See the COPY statement in the SQL Reference Manual for
more information.
You can use multiple, simultaneous database connections to load and/or modify data.
For more information about bulk loading, see Bulk Loading Data.
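For example, a sketch of a bulk load that defers the commit (the data file path is
hypothetical):
=> COPY store.store_orders_fact FROM '/tmp/store_orders.dat' DELIMITER '|' NO COMMIT;
=> COMMIT;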
Workload Management
Vertica's resource management scheme allows diverse, concurrent workloads to run
efficiently on the database. For basic operations, Vertica pre-configures the built-in
GENERAL pool based on RAM and machine cores. You can customize the General
pool to handle specific concurrency requirements.
You can also define new resource pools that you configure to limit memory usage,
concurrency, and query priority. You can then optionally assign each database user to
use a specific resource pool, which controls memory resources used by their requests.
User-defined pools are useful if you have competing resource requirements across
different classes of workloads. Example scenarios include:
l A large batch job takes up all server resources, leaving small jobs that update a web
page without enough resources. This can degrade user experience.
In this scenario, create a resource pool to handle web page requests and ensure
users get resources they need. Another option is to create a limited resource pool for
the batch job, so the job cannot use up all system resources.
l An application has lower priority than other applications and you want to limit the
amount of memory and number of concurrent users for the low-priority application.
In this scenario, create a resource pool with an upper limit on the query's memory and
associate the pool with users of the low-priority application.
For more information, see Managing Workload Resources in the Administrator's Guide.
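For example, a sketch of the low-priority application scenario (the pool and user names
are hypothetical):
=> CREATE RESOURCE POOL low_priority_pool MEMORYSIZE '500M' MAXMEMORYSIZE '1G'
   MAXCONCURRENCY 5;
=> ALTER USER app_user RESOURCE POOL low_priority_pool;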
Vertica Database Locks
In an environment where multiple users concurrently access the same database
information, data manipulation can cause conflicts and threaten data integrity.
Conflicts occur because some transactions block other operations until the transaction
completes. Because transactions committed at the same time should produce consistent
results, Vertica uses locks to maintain data concurrency and consistency. Vertica
automatically controls locking by limiting the actions a user can take on an object,
depending on the state of that object.
Vertica uses object locks and system locks. Object locks are used on objects, such as
tables and projections. System locks include global catalog locks, local catalog locks,
and elastic cluster locks.
Lock Modes
Vertica has different lock modes that determine how a lock acts on an object. Each lock
mode has a lock compatibility and lock strength that reflect how it interacts with other
locks in the same environment.
S - Shared: Use a Shared lock for SELECT queries that run at the serialized
transaction isolation level. This allows queries to run concurrently, but the S lock
creates the effect that transactions are running in serial order. The S lock ensures
that one transaction does not affect another until the first transaction completes and
its S lock is released. SELECT operations in READ COMMITTED transaction mode do not
require S table locks. See Transactions in Vertica Concepts for more information.

I - Insert: Vertica requires an Insert lock to insert data into a table. Multiple
transactions can lock an object in Insert mode simultaneously, enabling multiple
inserts and bulk loads to occur at the same time. This behavior is critical for
parallel loads and high ingestion rates.

SI - Shared Insert: Vertica requires a Shared Insert lock when both a read and an
insert occur in a transaction. Shared Insert mode prohibits delete/update operations.
An SI lock can also result from lock promotion.

X - Exclusive: Vertica uses Exclusive locks when performing deletes and updates. Only
mergeout and moveout operations (U locks) can run concurrently on objects with X
locks.

T - Tuple Mover: The Tuple Mover uses T locks for operations on delete vectors. Tuple
Mover operations upgrade the table lock mode from U to T when work on delete vectors
starts, so that no other updates or deletes can happen concurrently. T locks are also
used for COPY into pre-join projections.

U - Usage: Vertica uses Usage locks for moveout and mergeout Tuple Mover operations.
These Tuple Mover operations run automatically in the background; therefore, most
other operations (except those requiring an O lock) can run when the object is locked
in U mode.

O - Owner: An O lock is the strongest Vertica lock mode. An object acquires an Owner
lock when it undergoes changes in both data and structure. Such changes can occur in
some DDL operations, such as DROP_PARTITION, TRUNCATE TABLE, and ADD COLUMN. When an
object is locked in O mode, it cannot be locked simultaneously by another transaction
in any mode.

IV - Insert-Validate: An Insert Validate lock is needed for insert operations where
the system performs constraint validation for enabled PRIMARY or UNIQUE key
constraints.
Lock Compatibility
Lock compatibility refers to having two locks in effect on the same object at the same
time.
Lock Compatibility Matrix
This matrix shows which locks can be used on the same object simultaneously.
When two lock modes intersect in a Yes cell, those modes are compatible. If two
requested modes intersect in a No cell, the second request is not granted until the first
request releases its lock.
Requested    Granted Mode
Mode         S    I    IV   SI   X    T    U    O
S            Yes  No   No   No   No   Yes  Yes  No
I            No   Yes  Yes  No   No   Yes  Yes  No
IV           No   Yes  No   No   No   Yes  Yes  No
SI           No   No   No   No   No   Yes  Yes  No
X            No   No   No   No   No   No   Yes  No
T            Yes  Yes  Yes  Yes  No   Yes  Yes  No
U            Yes  Yes  Yes  Yes  Yes  Yes  Yes  No
O            No   No   No   No   No   No   No   No
Lock Upgrade Matrix
This matrix shows how your object lock responds to an INSERT request.
If an object has an S lock and you want to do an INSERT, your transaction requests an
SI lock. However, if an object has an S lock and you want to perform an operation that
requires an S lock, no lock request is issued.
Requested    Granted Mode
Mode         S    I    IV   SI   X    T    U    O
S            S    SI   SI   SI   X    S    S    O
I            SI   I    IV   SI   X    I    I    O
IV           SI   IV   IV   SI   X    IV   IV   O
SI           SI   SI   SI   SI   X    SI   SI   O
X            X    X    X    X    X    X    X    O
T            S    I    IV   SI   X    T    T    O
U            S    I    IV   SI   X    T    U    O
O            O    O    O    O    O    O    O    O
Lock Strength
Lock strength refers to the ability of a lock mode to interact with other lock modes. The
strongest lock mode, the O lock, is not compatible with any other lock mode.
Conversely, the U lock is the weakest lock mode; the only operation that cannot run
concurrently with it is one that requires an O lock.
This figure provides a visual representation of lock mode strength:
See Also:
Vertica Database Locks
Lock Example
LOCKS
Lock Example
In this example, two sessions (A and B) are active and attempting to act on Table T1. At
the beginning of the example, table T1 has one column (C1) and no rows.
The steps here represent a possible series of transactions from sessions A and B:
1. Transactions in both sessions acquire S locks to read from Table T1.
2. Session B releases its S lock with the COMMIT statement.
3. Session A can upgrade to an SI lock and insert into T1 since Session B released its
S lock.
4. Session A releases its SI lock with a COMMIT statement.
5. Both sessions can acquire S locks because Session A released its SI lock.
6. Session A cannot acquire an SI lock because Session B has not released its S lock.
SI locks are incompatible with S locks.
7. Session B releases its S lock with the COMMIT statement.
8. Session A can now upgrade to an SI lock and insert into T1.
9. Session B attempts to delete a row from T1 but cannot acquire an X lock because
Session A has not released its SI lock. SI locks are incompatible with X locks.
10. Session A continues to insert into T1.
11. Session A releases its SI lock.
12. Session B can now acquire an X lock and perform the delete.
This figure illustrates the previous steps:
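In vsql terms, the sequence looks roughly like this; the session labels, table
contents, and COMMIT placement are illustrative:

-- Steps 1-2: both sessions read T1 (each acquires an S lock); B commits
(A) => SELECT C1 FROM T1;
(B) => SELECT C1 FROM T1;
(B) => COMMIT;
-- Steps 3-4: A upgrades to SI, inserts, and commits
(A) => INSERT INTO T1 VALUES (1);
(A) => COMMIT;
-- Steps 5-8: both read again; A's INSERT waits until B releases its S lock
(A) => SELECT C1 FROM T1;
(B) => SELECT C1 FROM T1;
(A) => INSERT INTO T1 VALUES (2);   -- blocks: SI is incompatible with B's S lock
(B) => COMMIT;                      -- A's INSERT now proceeds
-- Steps 9-12: B's DELETE needs an X lock and waits until A releases its SI lock
(B) => DELETE FROM T1 WHERE C1 = 2; -- blocks: X is incompatible with A's SI lock
(A) => INSERT INTO T1 VALUES (3);   -- A continues to insert
(A) => COMMIT;                      -- releases the SI lock; B's DELETE proceeds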
See Also:
Vertica Database Locks
Lock Modes
Troubleshooting Locks
The LOCKS and LOCK_USAGE system tables can help identify problems you may
encounter with Vertica database locks.
This example shows one row from the LOCKS system table. From this table you can
see what types of locks are active on specific objects and nodes.
=> SELECT node_names, object_name, lock_mode, lock_scope FROM LOCKS;
node_names | object_name | lock_mode | lock_scope
-----------+---------------------------------+-----------+-----------
node0001 | Table:public.customer_dimension | X | TRANSACTION
This example shows two rows from the LOCK_USAGE system table. You can also
use this table to see what locks are in use on specific objects and nodes.
=> SELECT node_name, object_name, mode FROM LOCK_USAGE;
node_name | object_name | mode
----------+------------------+-------
node0001 | Cluster Topology | S
node0001 | Global Catalog | X
Management Console
Management Console (MC) is a user-friendly performance monitoring and management
tool that provides a unified view of your Vertica database operations. Using a browser,
you can create, import, manage, and monitor one or more databases and their
associated clusters. You can also create and manage MC users. You can then map the
MC users to a Vertica database and manage them through the MC interface.
What You Can Do with Management Console
Create...
A database cluster on hosts that do not have Vertica installed
Multiple Vertica databases on one or more clusters from a single point of control
MC users and grant them access to MC and databases managed by MC
Configure...
Database parameters and user settings dynamically
Resource pools
Monitor...
License usage and conformance
Dynamic metrics about your database cluster
Resource pools
User information and activity on MC
Alerts by accessing a single message box of alerts for all managed databases
Recent databases and clusters through a quick link
Multiple Vertica databases on one or more clusters from a single point of control
Import or Export...
Export all database messages or log/query details to a file
Import multiple Vertica databases on one or more clusters from a single point of
control
Troubleshoot...
MC-related issues through a browser
Management Console provides some, but not all, of the functionality that Administration
Tools provides. Management Console also includes extended functionality not
available in admintools, such as a graphical view of your Vertica database and
detailed monitoring charts and graphs. See Administration Tools and Management
Console in the Administrator's Guide for more information.
Getting MC
Download the Vertica server RPM and the MC package from the myVertica Portal. You then
have two options:
l Install Vertica and MC at the command line and import one or more Vertica database
clusters into the MC interface
l Install Vertica directly through MC
See the Installation Guide for details.
What You Need to Know
If you plan to use MC, review the following topics in the Administrator's Guide:
If you want to ...                                                         See ...
Create a new, empty Vertica database                                       Create a Database on a Cluster
Import an existing Vertica database cluster into MC                        Managing Database Clusters
Understand how MC users differ from database users                         About MC Users
Read about the MC privilege model                                          About MC Privileges and Roles
Create new MC users                                                        Creating an MC User
Grant MC users privileges on one or more Vertica databases managed by MC   Granting Database Access to MC Users
Use Vertica functionality through the MC interface                         Using Management Console
Monitor MC and Vertica databases managed by MC                             Monitoring Vertica Using Management Console
Monitor and configure resource pools                                       Monitoring Resource Pools in Management Console
Management Console Architecture
MC accepts HTTP requests from a client web browser, gathers information from the
Vertica database cluster, and returns that information to the browser for monitoring.
MC Components
The primary components that drive Management Console are an application/web server
and agents that get installed on each node in the Vertica cluster.
The following diagram is a logical representation of MC, the MC user's interface, and the
database cluster nodes.
Application/web Server
The application server hosts MC's web application and uses port 5450 for node-to-MC
communication and to perform the following:
l Manage one or more Vertica database clusters
l Send rapid updates from MC to the web browser
l Store and report MC metadata, such as alerts and events, current node state, and MC
users, in a lightweight, embedded (Derby) database
l Retain workload history
MC Agents
MC agents are internal daemon processes that run on each Vertica cluster node. The
default agent port, 5444, must be available for MC-to-node and node-to-node
communications. Agents monitor MC-managed Vertica database clusters and
communicate with MC to provide the following functionality:
l Provide local access, command, and control over database instances on a given
node, using functionality similar to Administration Tools.
l Report log-level data from the Administration Tools and Vertica log files.
l Cache details from long-running jobs—such as create/start/stop database
operations—that you can view through your browser.
l Track changes to data-collection and monitoring utilities and communicate updates
to MC.
l Communicate between all cluster nodes and MC through a webhook subscription,
which automates information sharing and reports on cluster-specific issues like node
state, alerts, and events.
See Also
l Monitoring Vertica Using MC
Management Console Security
The Management Console (MC) manages multiple Vertica clusters, all of which might
have different levels and types of security, such as user names and passwords and
LDAP authentication. You can also manage MC users who have varying levels of
access across these components.
Open Authorization and SSL
Management Console (MC) uses a combination of OAuth (Open Authorization), Secure
Socket Layer (SSL), and locally-encrypted passwords to secure HTTPS requests
between a user's browser and MC, and between MC and the agents. Authentication
occurs through MC and between agents within the cluster. Agents also authenticate and
authorize jobs.
The MC configuration process sets up SSL automatically, but you must have the
openssl package installed on your Linux environment first.
See the following topics in the Administrator's Guide for more information:
l SSL Overview
l SSL Authentication
l Generating Certificates and Keys for MC
l Importing a New Certificate to MC
User Authentication and Access
MC provides two user authentication methods: LDAP or MC. You can use only one
method at a time. For example, if you choose LDAP, all MC users are authenticated
against your organization's LDAP server.
You set up LDAP authentication through MC Settings > Authentication on the MC
interface.
Note: MC uses LDAP data for authentication purposes only. It does not modify user
information in the LDAP repository.
The MC authentication method stores MC user information internally and encrypts
passwords. These MC users are not system (Linux) users. They are accounts that have
access to MC and, optionally, to one or more MC-managed Vertica databases through
the MC interface.
Management Console also has rules for what users can see when they sign in to MC
from a client browser. These rules are governed by access levels, each of which is
made up of a set of roles.
See Also
l About MC Users
l About MC Privileges and Roles
l Creating an MC User
Management Console Home Page
The MC Home page is the entry point to all MC-managed Vertica database clusters and
MC users. User access levels determine what a user can see on the MC Home page.
Layout and navigation are described in Using Management Console.
Administration Tools
The Vertica Administration Tools allow you to perform administrative tasks easily. You
can perform most Vertica database administration tasks with Administration Tools.
Run Administration Tools using the Database Administrator account on the
Administration host, if possible. Make sure that no other Administration Tools processes
are running.
If the Administration host is unresponsive, run Administration Tools on a different node
in the cluster. That node permanently takes over the role of Administration host.
Any user can view the man page available for admintools. Enter the following:
man admintools
Running Administration Tools
As the dbadmin user, you can run Administration Tools from the command line. The syntax follows:
/opt/vertica/bin/admintools [
     { -h | --help }
   | { -a | --help_all }
   | [ --debug ]
   | { -t | --tool } name_of_tool [options]
]
Options

-h | --help
    Outputs abbreviated help.

-a | --help_all
    Outputs verbose help, which lists all command-line sub-commands and options
    as shown in the Tools section below.

--debug
    If you include the debug option, Vertica logs debug information.
    Note: You can specify the debug option with or without naming a specific
    tool. If you specify debug with a specific tool, Vertica logs debug
    information during tool execution. If you do not specify a tool, Vertica
    logs debug information when you run tools through the admintools user
    interface.

{ -t | --tool } name_of_tool [options]
    Specifies the tool to run, where name_of_tool is one of the tools described
    in the help output, and options are one or more comma-delimited tool
    arguments.
    Note: Enter admintools -h to see the list of tools available. Enter
    admintools -t name_of_tool --help to review a specific tool's options.
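For example, to list the available tools and then review the options of one of
them (the tool name below is illustrative):

admintools -h
admintools -t view_cluster --help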
An unqualified admintools command displays the Main Menu dialog box.
If you are unfamiliar with this type of interface, read Using the Administration Tools
Interface.
First Login as Database Administrator
The first time you log in as the Database Administrator and run the Administration Tools,
the user interface displays.
1. In the end-user license agreement (EULA) window, type accept to proceed.
A window displays, requesting the location of the license key file you downloaded
from the HPE Web site. The default path is /tmp/vlicense.dat.
2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and
click OK.
Between Dialogs
While the Administration Tools are working, you see the command line processing in a
window similar to the one shown below. Do not interrupt the processing.
SQL in Vertica
Vertica offers a robust set of SQL elements that allow you to manage and analyze
massive volumes of data quickly and reliably. Vertica uses the following:
SQL Language Elements including:
l Keywords and Reserved Words
l Identifiers
l Literals
l Operators
l Expressions
l Predicates
l Hints
SQL Data Types including:
l Binary
l Boolean
l Character
l Date/Time
l Long
l Numeric
SQL Functions, including Vertica-specific functions that take advantage of Vertica's
unique column-store architecture. For example, use the ANALYZE_HISTOGRAM
function to collect and aggregate a variable amount of sample data for statistical
analysis.
SQL Statements that allow you to write robust queries to quickly return large volumes of
data.
About Query Execution
When you submit a query, the initiator quickly chooses the projections to use, optimizes
and plans the query execution, and logs the SQL statement to its log. This planning
results in an Explain Plan. The Explain Plan maps out the steps the query performs.
You can view it in the Management Console.
The optimizer breaks down the Explain Plan into smaller plans, which it distributes to
executor nodes.
In the final stages of query plan execution, the initiator node does the following:
l Combines results in a grouping operation
l Merges multiple sorted partial result sets from all the executors
l Formats the results to return to the client
For detailed information about writing and executing queries, see Queries in Analyzing
Data.
Snapshot Isolation Mode
Vertica can run any SQL query in snapshot isolation mode in order to obtain the fastest
possible execution. To be precise, snapshot isolation mode is actually a form of
historical query. The syntax is:
AT EPOCH LATEST SELECT...
The command queries all data in the database up to but not including the current epoch
without holding a lock or blocking write operations, which could cause the query to miss
rows loaded by other users up to (but no more than) a specific number of minutes before
execution.
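For example, the following query (table name illustrative) reads all committed
data without blocking concurrent loads:

=> AT EPOCH LATEST SELECT COUNT(*) FROM customer_dimension;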
Historical Queries
Vertica can run a query from a snapshot of the database taken at a specific date and
time or at a specific epoch. The syntax is:
AT TIME 'timestamp' SELECT...
AT EPOCH epoch_number SELECT...
AT EPOCH LATEST SELECT...
The command queries all data in the database up to and including the specified epoch
or the epoch representing the specified date and time, without holding a lock or blocking
write operations. The specified TIMESTAMP and epoch_number values must be
greater than or equal to the Ancient History Mark epoch.
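For example, the following queries return data as of a past point in time; the
timestamp, epoch number, and table name are illustrative:

=> AT TIME '2016-04-01 12:00:00' SELECT * FROM customer_dimension;
=> AT EPOCH 1000 SELECT * FROM customer_dimension;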
Historical queries are useful because they access data in past epochs only. Historical
queries do not need to hold table locks or block write operations because they do not
return the absolute latest data. Their content is private to the transaction and valid only
for the length of the transaction.
Historical queries behave in the same manner regardless of transaction isolation level.
Historical queries observe only committed data, even excluding updates made by the
current transaction, unless those updates are to a temporary table.
Be aware that there is only one version of the logical schema. This means that any
changes you make to the schema are reflected across all epochs. If, for example, you
add a new column to a table and you specify a default value for the column, all historical
epochs display the new column and its default value.
The DELETE command in Vertica does not actually delete data; it marks records as
deleted. (The UPDATE command is actually a combined INSERT and a DELETE.)
Thus, you can control how much deleted data is stored on disk. For more information,
see Managing Disk Space in the Administrator's Guide.
Transactions
When transactions in multiple user sessions concurrently access the same data,
session-scoped isolation levels determine what data each transaction can access.
A transaction retains its isolation level until it completes, even if the session's isolation
level changes during the transaction. Vertica internal processes (such as the Tuple
Mover and refresh operations) and DDL operations always run at the SERIALIZABLE
isolation level to ensure consistency.
The Vertica query parser supports standard ANSI SQL-92 isolation levels as follows:
l READ UNCOMMITTED: Automatically interpreted as READ COMMITTED.
l READ COMMITTED (default)
l REPEATABLE READ: Automatically interpreted as SERIALIZABLE.
l SERIALIZABLE
Transaction isolation levels READ COMMITTED and SERIALIZABLE differ as follows:

Isolation level    Dirty read      Non-repeatable read    Phantom read
READ COMMITTED     Not Possible    Possible               Possible
SERIALIZABLE       Not Possible    Not Possible           Not Possible
You can set separate isolation levels for the database and individual transactions.
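For example, a minimal sketch of raising the isolation level for the current
session (see SET SESSION CHARACTERISTICS in the SQL Reference Manual):

=> SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;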
Implementation Details
Vertica supports conventional SQL transactions with standard ACID properties:
l ANSI SQL-92 style implicit transactions. You do not need to run a BEGIN or START
TRANSACTION command.
l No redo/undo log or two-phase commits.
l The COPY command automatically commits itself and any current transaction (except
when loading temporary tables). It is generally good practice to commit or roll back
the current transaction before you use COPY. This step is optional for DDL statements,
which are auto-committed.
Rollback
Transaction rollbacks restore a database to an earlier state by discarding changes
made by that transaction. Statement-level rollbacks discard only the changes initiated
by the reverted statements. Transaction-level rollbacks discard all changes made by the
transaction.
With a ROLLBACK statement, you can explicitly roll back to a named savepoint within the
transaction, or discard the entire transaction. Vertica can also initiate automatic
rollbacks in two cases:
l An individual statement returns an ERROR message. In this case, Vertica rolls back
the statement.
l DDL errors, systemic failures, deadlocks, and resource constraints return a
ROLLBACK message. In this case, Vertica rolls back the entire transaction.
Explicit and automatic rollbacks always release any locks that the transaction holds.
Savepoints
A savepoint is a special marker inside a transaction that allows commands that execute
after the savepoint to be rolled back. The transaction is restored to the state that
preceded the savepoint.
Vertica supports two types of savepoints:
l An implicit savepoint is automatically established after each successful command
within a transaction. This savepoint is used to roll back the next statement if it returns
an error. A transaction maintains one implicit savepoint, which it rolls forward with
each successful command. Implicit savepoints are available to Vertica only and
cannot be referenced directly.
l Named savepoints are labeled markers within a transaction that you set through
SAVEPOINT statements. A named savepoint can later be referenced in the same
transaction through RELEASE SAVEPOINT, which destroys it, and ROLLBACK TO
SAVEPOINT, which rolls back all operations that followed the savepoint. Named
savepoints can be especially useful in nested transactions: a nested transaction that
begins with a savepoint can be rolled back entirely, if necessary.
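The following sketch shows a named savepoint in use; the table and values are
illustrative:

=> INSERT INTO tx_demo VALUES (1);
=> SAVEPOINT sp1;
=> INSERT INTO tx_demo VALUES (2);
=> ROLLBACK TO SAVEPOINT sp1;  -- discards only the second INSERT
=> COMMIT;                     -- commits the first INSERT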
READ COMMITTED Isolation
When you use the isolation level READ COMMITTED, a SELECT query obtains a snapshot of
committed data at the transaction's start. Subsequent queries during the current
transaction also see the results of uncommitted updates that already executed in the
same transaction.
When you use DML statements, your query acquires write locks to prevent other READ
COMMITTED transactions from modifying the same data. However, be aware that SELECT
statements do not acquire locks, so concurrent transactions can obtain read and write
access to the same selection.
READ COMMITTED is the default isolation level. For most queries, this isolation level
balances database consistency and concurrency. However, this isolation level can
allow one transaction to change the data that another transaction is in the process of
accessing. Such changes can yield nonrepeatable and phantom reads. You may have
applications with complex queries and updates that require a more consistent view of
the database. If so, use SERIALIZABLE isolation instead.
The following figure shows how READ COMMITTED isolation might control how
concurrent transactions read and write the same data:
READ COMMITTED isolation maintains exclusive write locks until a transaction ends, as
shown in the following graphic:
See Also
l Vertica Database Locks
l LOCKS
l SET SESSION CHARACTERISTICS
l Configuration Parameters
SERIALIZABLE Isolation
SERIALIZABLE is the strictest SQL transaction isolation level. While this isolation level
permits transactions to run concurrently, it creates the effect that transactions are
running in serial order. Transactions acquire locks for read and write operations. Thus,
successive SELECT commands within a single transaction always produce the same
results. Because SERIALIZABLE isolation provides a consistent view of data, it is useful
for applications that require complex queries and updates. However, serializable
isolation reduces concurrency. For example, it blocks queries during a bulk load.
SERIALIZABLE isolation establishes the following locks:
l Table-level read locks: Vertica acquires table-level read locks on selected tables and
releases them when the transaction ends. This behavior prevents one transaction
from modifying rows while they are being read by another transaction.
l Table-level write lock: Vertica acquires table-level write locks on update and
releases them when the transaction ends. This behavior prevents one transaction
from reading another transaction's changes to rows before those changes are
committed.
At the start of a transaction, a SELECT statement obtains a snapshot of the selection's
committed data. The transaction also sees the results of updates that are run within the
transaction before they are committed.
The following figure shows how concurrent transactions that both have SERIALIZABLE
isolation levels handle locking:
Applications that use SERIALIZABLE must be prepared to retry transactions due to
serialization failures. Such failures often result from deadlocks. When a deadlock
occurs, any transaction awaiting a lock automatically times out after 5 minutes. The
following figure shows how deadlock might occur and how Vertica handles it:
Note: SERIALIZABLE isolation does not apply to temporary tables. No locks are
required for these tables because they are isolated by their transaction scope.
See Also
l Vertica Database Locks
l LOCKS
Extending Vertica
Vertica lets you extend its capabilities to perform new operations or handle new data
types, using the following:
l User-Defined SQL Functions allow you to store frequently used SQL statements.
l User-Defined Extensions and User-Defined Functions allow you to develop analytic
or data-loading tools in the C++, Java, and R programming languages.
l External Procedures allow you to run external scripts installed on your database
cluster.
User-Defined SQL Functions
User-defined SQL functions allow you to create and store commonly-used SQL
statements. You can use a user-defined SQL function anywhere in a query where an
ordinary SQL statement can be used. You need USAGE privileges on the schema and
EXECUTE privileges on the function to run a user-defined SQL function.
For information on creating and managing user-defined SQL functions see Using User-
Defined SQL Functions.
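For example, a sketch of a user-defined SQL function that substitutes zero for
NULL; the function name and body are illustrative:

=> CREATE FUNCTION myzeroifnull(x INT) RETURN INT
   AS BEGIN
   RETURN (CASE WHEN (x IS NOT NULL) THEN x ELSE 0 END);
   END;
=> SELECT myzeroifnull(NULL);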
User-Defined Extensions and User-Defined
Functions
User-Defined Extension (UDx) refers to all extensions to Vertica developed using the
APIs in the Vertica SDK. UDxs encompass functions such as User-Defined Scalar
Functions (UDSFs), and utilities such as the User-Defined Load (UDL) feature that let
you create custom data load routines.
Thanks to their tight integration with Vertica, UDxs usually perform better than
user-defined SQL functions or external procedures.
User-Defined Functions (UDFs) are a specific type of UDx. You use them in SQL
statements to process data similarly to Vertica's own built-in functions. They give you
the power to create your own functions that run only slightly slower than Vertica's
built-in functions.
The Vertica SDK uses the term UDx extensively, even for APIs that deal exclusively
with developing UDFs.
External Procedures
External procedures allow you to call a script or executable program stored in your
database cluster. You can pass literal values to this external procedure as arguments.
The external procedure cannot communicate back to Vertica.
For information on creating and using external procedures see Using External
Procedures.
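As a sketch, assuming a script named helloplanet.sh has already been created and
installed on the cluster, you might create and invoke an external procedure as
follows:

=> CREATE PROCEDURE helloplanet(arg1 VARCHAR) AS 'helloplanet.sh'
   LANGUAGE 'external' USER 'dbadmin';
=> SELECT helloplanet('world');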
International Languages and Character
Sets
This section describes how Vertica handles internationalization and character sets.
Unicode Character Encoding
UTF-8 is an abbreviation for Unicode Transformation Format-8 (where 8 equals 8-bit)
and is a variable-length character encoding for Unicode created by Ken Thompson and
Rob Pike. UTF-8 can represent any universal character in the Unicode standard, and its
initial byte codes and character assignments coincide with ASCII, so software that
handles ASCII but preserves other byte values typically requires little or no change.
All input data received by the database server is expected to be in UTF-8, and all data
output by Vertica is in UTF-8. The ODBC API operates on data in UCS-2 on Windows
systems, and normally UTF-8 on Linux systems. JDBC and ADO.NET APIs operate on
data in UTF-16. The client drivers automatically convert data to and from UTF-8 when
sending to and receiving data from Vertica using API calls. The drivers do not transform
data loaded by executing a COPY or COPY LOCAL statement.
See Implement Locales for International Data Sets in the Administrator's Guide for
details.
Locales
The locale is a parameter that defines the user's language, country, and any special
variant preferences, such as collation. Vertica uses the locale to determine the behavior
of certain string functions. The locale also determines the collation for various SQL
commands that require ordering and comparison, such as GROUP BY, ORDER BY,
joins, and the analytic ORDER BY clause.
By default, the locale for your Vertica database is en_US@collation=binary (English
US). You can define a new default locale that is used for all sessions on the database.
You can also override the locale for individual sessions. However, projections are
always collated using the default en_US@collation=binary collation, regardless of
the session collation. Any locale-specific collation is applied at query time.
You can set the locale through ODBC, JDBC, and ADO.NET.
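For example, to override the locale for the current session (the locale value is
illustrative):

=> SET LOCALE TO 'en_GB';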
See the following topics in the Administrator's Guide for details:
l Implement Locales for International Data Sets
l Supported Locales in the Appendix
String Functions
Vertica provides string functions to support internationalization. Unless otherwise
specified, these string functions can optionally specify whether VARCHAR arguments
should be interpreted as octet (byte) sequences, or as (locale-aware) sequences of
characters. This is accomplished by adding "USING OCTETS" and "USING
CHARACTERS" (default) as a parameter to the function.
See String Functions for details.
Character String Literals
By default, string literals ('...') treat backslashes literally, as specified in the SQL
standard.
Tip: If you have used previous releases of Vertica and you do not want string literals
to treat backslashes literally (for example, you are using a backslash as part of an
escape sequence), you can turn off the StandardConformingStrings
configuration parameter. See Internationalization Parameters in the Administrator's
Guide. You can also use the EscapeStringWarning parameter to locate backslashes
that have been incorporated into string literals so that you can remove them.
See Character String Literals for details.
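For example, with StandardConformingStrings enabled (the default), a backslash in
a literal is just a backslash:

=> SELECT 'a\nb' AS literal;
 literal
---------
 a\nb
(1 row)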
Administrator's Guide
Administration Overview
This document describes the functions performed by a Vertica database administrator
(DBA). Perform these tasks using only the dedicated database administrator account
that was created when you installed Vertica. The examples in this documentation set
assume that the administrative account name is dbadmin.
l To perform certain cluster configuration and administration tasks, the DBA (users of
the administrative account) must be able to supply the root password for those hosts.
If this requirement conflicts with your organization's security policies, these functions
must be performed by your IT staff.
l If you perform administrative functions using a different account from the account
provided during installation, Vertica encounters file ownership problems.
l If you share the administrative account password, make sure that only one user runs
the Administration Tools at any time. Otherwise, automatic configuration propagation
does not work correctly.
l The Administration Tools require that the calling user's shell be /bin/bash. Other
shells give unexpected results and are not supported.
Managing Licenses
You must license Vertica in order to use it. Hewlett Packard Enterprise supplies your
license in the form of one or more license files, which encode the terms of your license.
The licenses that are available include:
l vlicense.dat, for columnar tables.
l vlicense_565_bytes.dat, for data stored in a Hadoop environment with Vertica for
SQL on Hadoop.
To prevent introducing special characters that invalidate the license, do not open the
license files in an editor or email client. Opening the file in this way can introduce
special characters, such as line endings and file terminators, that may not be visible
within the editor. Whether visible or not, these characters invalidate the license.
Applying License Files
For ease of Vertica Premium Edition and SQL on Hadoop installation, HPE
recommends that you copy the license file to /tmp/vlicense.dat on the
Administration host.
Be careful not to change the license key file in any way when copying the file between
Windows and Linux, or to any other location. To help prevent applications from trying to
alter the file, enclose the license file in an archive file (such as a .zip or .tar file).
After copying the license file from one location to another, check that the copied file size
is identical to that of the one you received from Vertica.
Obtaining a License Key File
To obtain a license key (for example, for Premium Edition or SQL on Hadoop), contact
Vertica at: http://www.vertica.com/about/contact-us/
Your Vertica Community Edition download package includes the Community Edition
license, which allows three nodes and 1TB of data. The Vertica Community Edition
license does not expire.
Understanding Vertica Licenses
Vertica has flexible licensing terms. It can be licensed on the following bases:
l Term-based (valid until a specific date)
l Raw data size based (valid to store up to a specified amount of raw data)
l Both term- and data-size based
l Unlimited duration and data storage
l Node-based with an unlimited number of CPUs and users (one node is a server
acting as a single computer system, whether physical or virtual)
Vertica Community Edition licenses include a 1 terabyte data limit and a 3-node limit.
Vertica for SQL on Hadoop is a separate product with its own license. This
documentation covers both products.
Your license key has your licensing bases encoded into it. If you are unsure of your
current license, you can view your license information from within Vertica.
HPE Vertica Analytics Platform License Types
HPE Vertica Analytics Platform is a full-featured offering with all analytical functions
described in this documentation. It is best used for advanced analytics and enterprise
data warehousing. There are two editions, Community Edition and Premium Edition. To
run Vertica in a Hadoop environment, you purchase a separate Hadoop License.
Vertica Community Edition. You can download and start using Community Edition for
free. The Community Edition license allows customers the following:
l 3 node limit
l 1 terabyte data limit
Community Edition licenses cannot be installed co-located in a Hadoop infrastructure
and used to query data stored in Hadoop formats.
Vertica Premium Edition. You can purchase the Premium Edition license. The
Premium Edition license entitles customers to:
l No node limit
l Data amount as specified by the license
l Query data stored in HDFS using the HCatalog Connector or HDFS Connector, and
back up Vertica data to HDFS
Premium Edition licenses cannot be installed co-located in a Hadoop infrastructure and
used to query data stored in Hadoop formats.
Note: Vertica does not support license downgrades.
Vertica for SQL on Hadoop License
Vertica for SQL on Hadoop is a license for running Vertica on a Hadoop environment.
This allows users to run Vertica on data that is in a shared storage environment. It is
best used for exploring data in a Hadoop data lake. It can be used only in co-located
Hadoop environments to query data stored in Hadoop (Hortonworks, MapR, or
Cloudera).
Customers can purchase this term-based SQL on Hadoop license according to the number
of nodes they plan to use in their Hadoop environment. Vertica then audits the
number of nodes in use for compliance.
Installing or Upgrading a License Key
The steps you follow to apply your Vertica license key vary, depending on the type of
license you are applying and whether you are upgrading your license. This section
describes the following:
l New Vertica License Installations
l Vertica License Renewals or Upgrades
New Vertica License Installations
1. Copy the license key file to your Administration Host.
2. Ensure the license key's file permissions are set to 400 (read permissions).
3. Install Vertica as described in the Installing Vertica if you have not already done so.
The interface prompts you for the license key file.
4. To install Community Edition, leave the default path blank and click OK. To apply
your evaluation or Premium Edition license, enter the absolute path of the license
key file you downloaded to your Administration Host and click OK. The first time
you log in as the Database Administrator and run the Administration Tools, the
interface prompts you to accept the End-User License Agreement (EULA).
Note: If you installed Management Console, the MC administrator can point to
the location of the license key during Management Console configuration.
5. Choose View EULA.
6. Exit the EULA and choose Accept EULA to officially accept the EULA and continue
installing the license, or choose Reject EULA to reject the EULA and return to the
Advanced Menu.
Vertica License Renewals or Upgrades
If your license is expiring or you want your database to grow beyond your licensed data
size, you must renew or upgrade your license. Once you have obtained your renewal or
upgraded license key file, you can install it using Administration Tools or Management
Console.
Uploading or Upgrading a License Key Using Administration Tools
1. Copy the license key file to your Administration Host.
2. Ensure the license key's file permissions are set to 400 (read permissions).
3. Start your database, if it is not already running.
4. In the Administration Tools, select Advanced > Upgrade License Key and click OK.
5. Enter the absolute path to your new license key file and click OK. The interface
prompts you to accept the End-User License Agreement (EULA).
6. Choose View EULA.
7. Exit the EULA and choose Accept EULA to officially accept the EULA and continue
installing the license, or choose Reject EULA to reject the EULA and return to the
Advanced Tools menu.
Uploading or Upgrading a License Key Using Management Console
1. From your database's Overview page in Management Console, click the License
tab. The License page displays. You can view your installed licenses on this page.
2. Click Install New License at the top of the License page.
3. Browse to the location of the license key from your local computer and upload the
file.
4. Click Apply at the top of the page. Management Console prompts you to accept the
End-User License Agreement (EULA).
5. Select the check box to officially accept the EULA and continue installing the
license, or click Cancel to exit.
Note: As soon as you renew or upgrade your license key from either your
Administration Host or Management Console, Vertica applies the license update. No
further warnings appear.
Viewing Your License Status
You can use several functions to display your license terms and current status.
Examining Your License Key
Use the DISPLAY_LICENSE SQL function described in the SQL Reference Manual to
display the license information. This function displays the dates for which your license is
valid (or Perpetual if your license does not expire) and any raw data allowance. For
example:
=> SELECT DISPLAY_LICENSE();
DISPLAY_LICENSE
----------------------------------------------------
Vertica Systems, Inc.
1/1/2011
12/31/2011
30
50TB
(1 row)
You can also query the LICENSES system table to view information about your installed
licenses. This table displays your license types, the dates for which your licenses are
valid, and the size and node limits your licenses impose.
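For example, to list all installed licenses:

=> SELECT * FROM licenses;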
Alternatively, view this information in Management Console. On your database's
Overview page, click the License tab to view information about your installed licenses.
Viewing Your License Compliance
If your license includes a raw data size allowance, Vertica periodically audits your
database's size to ensure it remains compliant with the license agreement. If your
license has a term limit, Vertica also periodically checks to see if the license has
expired. You can see the result of the latest audits using the GET_COMPLIANCE_
STATUS function.
=> select GET_COMPLIANCE_STATUS();
GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------
Raw Data Size: 2.00GB +/- 0.003GB
License Size : 4.000GB
Utilization : 50%
Audit Time : 2011-03-09 09:54:09.538704+00
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 04/06/2011
Days Remaining: 28.59
(1 row)
Viewing Your License Status Through MC
Information about license usage is on the Settings page. See Monitoring Database Size
for License Compliance.
Calculating the Database Size
You can use your Vertica software until your columnar data reaches the maximum raw
data size that the license agreement provides. This section describes when data is
monitored, what data is included in the estimate, and the general methodology used to
produce an estimate. For more information about monitoring for data size, see
Monitoring Database Size for License Compliance.
How Vertica Estimates Raw Data Size
Vertica uses statistical sampling to calculate an accurate estimate of the raw data size of
the database. In this context, raw data means the uncompressed data stored in a single
Vertica database. For the purpose of license size audit and enforcement, Vertica
evaluates the raw data size as if the data had been exported from the database in text
format, rather than as compressed data.
Vertica conducts your database size audit using statistical sampling. This method
allows Vertica to estimate the size of the database without significantly impacting
database performance. The trade-off between accuracy and impact on performance is a
small margin of error, inherent in statistical sampling. Reports on your database size
include the margin of error, so you can assess the accuracy of the estimate. To learn
more about simple random sampling, see Simple Random Sampling.
Excluding Data From Raw Data Size Estimate
Not all data in the Vertica database is evaluated as part of the raw data size.
Specifically, Vertica excludes the following data:
l Multiple projections (underlying physical copies) of data from a logical database
entity (table). Data appearing in multiple projections of the same table is counted only
once.
l Data stored in temporary tables.
l Data accessible through external table definitions.
l Data that has been deleted, but that remains in the database. To understand more
about deleting and purging data, see Purging Deleted Data.
l Data stored in the WOS.
l Data stored in system and work tables such as monitoring tables, Data Collector
tables, and Database Designer tables.
l Delimiter characters.
Evaluating Data Type Footprint Size
Vertica treats the data sampled for the estimate as if it had been exported from the
database in text format (such as printed from vsql). This means that Vertica evaluates
the data type footprint sizes as follows:
l Strings and binary types (CHAR, VARCHAR, BINARY, VARBINARY) are counted
as their actual size in bytes using UTF-8 encoding.
l Numeric data types are counted as if they had been printed. Each digit counts as a
byte, as does any decimal point, sign, or scientific notation. For example, -123.456
counts as eight bytes (six digits plus the decimal point and minus sign).
l Date/time data types are counted as if they had been converted to text, including any
hyphens or other separators. For example, a timestamp column containing the value
for noon on July 4th, 2011 would be 19 bytes. As text, vsql would print the value as
2011-07-04 12:00:00, which is 19 characters, including the space between the date
and the time.
Using AUDIT to Estimate Database Size
To supply a more accurate database size estimate than statistical sampling can provide,
use the AUDIT function to perform a full audit. This function has parameters to set both
the error_tolerance and confidence_level. Using one or both of these parameters
increases or decreases the function's performance impact.
For instance, lowering the error_tolerance to zero (0) and raising the confidence_
level to 100, provides the most accurate size estimate, and increases the performance
impact of calling the AUDIT function. During a detailed, low error-tolerant audit, all of the
data in the database is dumped to a raw format to calculate its size. Since performing a
stringent audit can significantly impact database performance, never perform a full audit
of a production database. See AUDIT for details.
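For example, a sketch of the strictest possible audit of the entire database,
assuming the tolerance and confidence arguments follow the AUDIT reference entry:

=> SELECT AUDIT('', 0, 100);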
Note: Unlike estimating raw data size using statistical sampling, a full audit performs
SQL queries on the full database contents, including the contents of the WOS.
Monitoring Database Size for License Compliance
Your Vertica license can include a data storage allowance. The allowance can consist
of data in columnar tables, flex tables, or both types of data. The AUDIT() function
estimates the columnar table data size and any flex table materialized columns. The
AUDIT_FLEX() function estimates the amount of __raw__ column data in flex or
columnar tables. With regard to license data limits, data in __raw__ columns is calculated
at 1/10th the size of structured data. Monitoring data sizes for columnar and flex tables
lets you plan either to schedule deleting old data to keep your database in compliance
with your license, or to consider a license upgrade for additional data storage.
Note: An audit of columnar data includes flex table real and materialized columns,
but not __raw__ column data.
Viewing Your License Compliance Status
Vertica periodically runs an audit of the columnar data size to verify that your database
is compliant with your license terms. You can view the results of the most recent audit by
calling the GET_COMPLIANCE_STATUS function.
=> select GET_COMPLIANCE_STATUS();
GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------
Raw Data Size: 2.00GB +/- 0.003GB
License Size : 4.000GB
Utilization : 50%
Audit Time : 2011-03-09 09:54:09.538704+00
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 04/06/2011
Days Remaining: 28.59
(1 row)
Periodically running GET_COMPLIANCE_STATUS to monitor your database's license
status is usually enough to ensure that your database remains compliant with your
license. If your database begins to near its columnar data allowance, you can use the
other auditing functions described below to determine where your database is growing
and how recent deletes affect the database size.
Manually Auditing Columnar Data Usage
You can manually check license compliance for all columnar data in your database
using the AUDIT_LICENSE_SIZE function. This function performs the same audit that
Vertica periodically performs automatically. The AUDIT_LICENSE_SIZE check runs in
the background, so the function returns immediately. You can then query the results
using GET_COMPLIANCE_STATUS.
Note: When you audit columnar data, the results include any flex table real and
materialized columns, but not data in the __raw__ column. Materialized columns are
virtual columns that you have promoted to real columns. Columns that you define
when creating a flex table, or which you add with ALTER TABLE...ADD COLUMN
statements are real columns. All __raw__ columns are real columns. However,
since they consist of unstructured or semi-structured data, they are audited
separately.
An alternative to AUDIT_LICENSE_SIZE is to use the AUDIT function to audit the size
of the columnar tables in your entire database by passing an empty string to the
function. This function operates synchronously, returning when it has estimated the size
of the database.
=> SELECT AUDIT('');
AUDIT
----------
76376696
(1 row)
The size of the database is reported in bytes. The AUDIT function also allows you to
control the accuracy of the estimated database size using additional parameters. See
the entry for the AUDIT function in the SQL Reference Manual for full details. Vertica
does not count the AUDIT function results as an official audit. It takes no license
compliance actions based on the results.
Note: The results of the AUDIT function do not include flex table data in __raw__
columns. Use the AUDIT_FLEX function to monitor data usage in flex tables.
Manually Auditing __raw__ Column Data
You can use the AUDIT_FLEX function to manually audit data usage for flex or
columnar tables with a __raw__ column. The function calculates the encoded,
compressed data stored in ROS containers for any __raw__ columns. Materialized
columns in flex tables are calculated by the AUDIT function. The AUDIT_FLEX results
do not include data in the __raw__ columns of temporary flex tables.
Targeted Auditing
If audits determine that the columnar table estimates are unexpectedly large, consider
which schemas, tables, or partitions are using the most storage. You can use the AUDIT
function to perform targeted audits of schemas, tables, or partitions by supplying the
name of the entity whose size you want to find. For example, to find the size of the
online_sales schema in the VMart example database, run the following command:
VMart=> SELECT AUDIT('online_sales');
AUDIT
----------
35716504
(1 row)
You can also change the granularity of an audit to report the size of each object in a
larger entity (for example, each table in a schema) by using the granularity argument of
the AUDIT function. See the AUDIT function in the SQL Reference Manual.
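For example, assuming 'table' is a supported granularity value, the following
sketch reports the size of each table in the online_sales schema:

=> SELECT AUDIT('online_sales', 'table');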
Using Management Console to Monitor License Compliance
You can also get information about data storage of columnar data (for columnar tables
and for materialized columns in flex tables) through the Management Console. This
information is available in the database Overview page, which displays a grid view of
the database's overall health.
l The needle in the license meter adjusts to reflect the amount used in megabytes.
l The grace period represents the term portion of the license.
l The Audit button returns the same information as the AUDIT() function in a graphical
representation.
l The Details link within the License grid (next to the Audit button) provides historical
information about license usage. This page also shows a progress meter of percent
used toward your license limit.
Managing License Warnings and Limits
Term License Warnings and Expiration
The term portion of a Vertica license is easy to manage: you are licensed to use
Vertica until a specific date. If the term of your license expires, Vertica alerts you with
messages appearing in the Administration Tools and vsql. For example:
=> CREATE TABLE T (A INT);
NOTICE: Vertica license is in its grace period
HINT: Renew at http://www.vertica.com/
CREATE TABLE
Contact Vertica at http://www.vertica.com/about/contact-us/ as soon as possible to
renew your license, and then install the new license. After the grace period expires,
Vertica stops processing queries.
Data Size License Warnings and Remedies
If your Vertica columnar license includes a raw data size allowance, Vertica periodically
audits the size of your database to ensure it remains compliant with the license
agreement. For details of this audit, see Calculating the Database Size. You should also
monitor your database size to know when it will approach licensed usage. Monitoring
the database size helps you plan to either upgrade your license to allow for continued
database growth or delete data from the database so you remain compliant with your
license. See Monitoring Database Size for License Compliance for details.
If your database's size approaches your licensed usage allowance (above 75% of
license limits), you will see warnings in the Administration Tools, vsql, and
Management Console. You have two options to eliminate these warnings:
l Upgrade your license to a larger data size allowance.
l Delete data from your database to remain under your licensed raw data size
allowance. The warnings disappear after Vertica's next audit of the database size
shows that it is no longer close to or over the licensed amount. You can also
manually run a database audit (see Monitoring Database Size for License
Compliance for details).
If your database continues to grow after you receive warnings that its size is
approaching your licensed size allowance, Vertica displays additional warnings in more
parts of the system after a grace period passes.
If Your Vertica Premium Edition Database Size Exceeds Your
Licensed Limits
If your Premium Edition database size exceeds your licensed data allowance, all
successful queries from ODBC and JDBC clients return with a status of SUCCESS_
WITH_INFO instead of the usual SUCCESS. The message sent with the results
contains a warning about the database size. Your ODBC and JDBC clients should be
prepared to handle these messages instead of assuming that successful requests
always return SUCCESS.
Note: These warnings for Premium Edition are in addition to any warnings you see
in Administration Tools, vsql, and Management Console.
If Your Vertica Community Edition Database Size Exceeds 1 Terabyte
If your Community Edition database size exceeds the limit of 1 terabyte, you will no
longer be able to load or modify data in your database. In addition, you will not be able
to delete data from your database.
To bring your database under compliance, you can choose to:
l Drop database tables. You can also consider truncating a table or dropping a
partition. See TRUNCATE TABLE or DROP_PARTITION.
l Upgrade to Vertica Premium Edition (or an evaluation license)
Exporting License Audit Results to CSV
You can use admintools to audit a database for license compliance and export the
results in CSV format, as follows:
admintools -t license_audit [--password=password] [--database=database] [--file=csv-file] [--quiet]
where:
l database must be a running database. If the database is password protected, you
must also supply the password.
l --file csv-file directs output to the specified file. If csv-file already exists, the
tool returns an error message. If this option is unspecified, output is directed to
stdout.
l --quiet specifies that the tool should run in quiet mode; if unspecified, status
messages are sent to stdout.
Running the license_audit tool is equivalent to invoking the following SQL
statements:
select audit('');
select audit_flex('');
select * from dc_features_used;
select * from vcatalog.license_audits;
select * from vcatalog.user_audits;
Audit results include the following information:
l Log of used Vertica features
l Estimated database size
l Raw data size allowed by your Vertica license
l Percentage of licensed allowance that the database currently uses
l Audit timestamps
The following truncated example shows the raw CSV output that license_audit
generates:
FEATURES_USED
features_used,feature,date,sum
features_used,metafunction::get_compliance_status,2014-08-04,1
features_used,metafunction::bootstrap_license,2014-08-04,1
...
LICENSE_AUDITS
license_audits,database_size_bytes,license_size_bytes,usage_percent,audit_start_timestamp,audit_
end_timestamp,confidence_level_percent,error_tolerance_percent,used_sampling,confidence_interval_
lower_bound_bytes,confidence_interval_upper_bound_bytes,sample_count,cell_count,license_name
license_audits,808117909,536870912000,0.00150523690320551,2014-08-04 23:59:00.024874-04,2014-08-04
23:59:00.578419-04,99,5,t,785472097,830763721,10000,174754646,vertica
...
USER_AUDITS
user_audits,size_bytes,user_id,user_name,object_id,object_type,object_schema,object_name,audit_
start_timestamp,audit_end_timestamp,confidence_level_percent,error_tolerance_percent,used_
sampling,confidence_interval_lower_bound_bytes,confidence_interval_upper_bound_bytes,sample_
count,cell_count
user_audits,812489249,45035996273704962,dbadmin,45035996273704974,DATABASE,,VMart,2014-10-14
11:50:13.230669-04,2014-10-14 11:50:14.069057-04,99,5,t,789022736,835955762,10000,174755178
AUDIT_SIZE_BYTES
audit_size_bytes,now,audit
audit_size_bytes,2014-10-14 11:52:14.015231-04,810584417
FLEX_SIZE_BYTES
flex_size_bytes,now,audit_flex
flex_size_bytes,2014-10-14 11:52:15.117036-04,11850
Configuring the Database
This section provides information about:
l The Configuration Procedure
l Configuration Parameters
l Designing a logical schema
l Creating the physical schema
You'll also want to set up a security scheme. See Implementing Security.
See also Implement Locales for International Data Sets.
Note: Before you begin this section, HPE strongly recommends that you follow the
Tutorial in Getting Started to quickly familiarize yourself with creating and
configuring a fully-functioning example database.
Configuration Procedure
This section describes the tasks required to set up a Vertica database. It assumes that
you have obtained a valid license key file, installed the Vertica rpm package, and run
the installation script as described in Installing Vertica.
You'll complete the configuration procedure using:
l Administration Tools
If you are unfamiliar with Dialog-based user interfaces, read Using the Administration
Tools Interface before you begin. See also the Administration Tools Reference for
details.
l vsql interactive interface
l Database Designer, described in Creating a Database Design
Note: You can also perform certain tasks using Management Console. Those tasks
point to the appropriate topic.
l Follow the configuration procedure in the order presented in this book.
l HPE strongly recommends that you first use the Tutorial in Getting Started to
experiment with creating and configuring a database.
l Although you may create more than one database (for example, one for production
and one for testing), you may create only one active database for each installation of
Vertica Analytics Platform.
l The generic configuration procedure described here can be used several times
during the development process and modified each time to fit changing goals. You
can omit steps such as preparing actual data files and sample queries, and run the
Database Designer without optimizing for queries. For example, you can create, load,
and query a database several times for development and testing purposes, then one
final time to create and load the production database.
Prepare Disk Storage Locations
You must create and specify directories in which to store your catalog and data files
(physical schema). You can specify these locations when you install or configure the
database, or later during database operations. Both the catalog and data directories
must be owned by the database administrator.
The directory you specify for database catalog files (the catalog path) is used across all
nodes in the cluster. For example, if you specify /home/catalog as the catalog directory,
Vertica uses that catalog path on all nodes. The catalog directory should always be
separate from any data file directories.
Note: Do not use a shared directory for more than one node. Data and catalog
directories must be distinct for each node. Multiple nodes must not be allowed to
write to the same data or catalog directory.
The data path you designate is also used across all nodes in the cluster. For example, if
you specify /home/data as the data directory, Vertica uses this path on all database nodes.
Do not use a single directory to contain both catalog and data files. You can store the
catalog and data directories on different drives, which can be either on drives local to
the host (recommended for the catalog directory) or on a shared storage location, such
as an external disk enclosure or a SAN.
Before you specify a catalog or data path, be sure the parent directory exists on all
nodes of your database. Creating a database in admintools also creates the catalog and
data directories, but the parent directory must exist on each node.
You do not need to specify a disk storage location during installation. However, you can
do so by using the --data-dir parameter to the install_vertica script. See
Specifying Disk Storage Location During Installation.
See Also
l Specifying Disk Storage Location on MC
l Specifying Disk Storage Location During Database Creation
l Configuring Disk Usage to Optimize Performance
l Using Shared Storage With Vertica
Specifying Disk Storage Location During Installation
There are three ways to specify the disk storage location. You can specify the location
when you:
l Install Vertica
l Create a database using the Administration Tools
l Install and configure Management Console
To Specify the Disk Storage Location When You Install:
When you install Vertica, the --data-dir parameter in the install_vertica script
(see Installing Vertica with the install_vertica Script) lets you specify a directory to
contain database data and catalog files. The script defaults to the database
administrator's default home directory: /home/dbadmin.
You should replace this default with a directory that has adequate space to hold your
data and catalog files.
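For example, a minimal sketch of an installation that places the data and catalog
directory under /vertica (the host names and path are illustrative):
# /opt/vertica/sbin/install_vertica -s host01,host02,host03 --data-dir /vertica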
Before you create a database, verify that the data and catalog directory exists on each
node in the cluster. Also verify that the directory on each node is owned by the database
administrator.
Notes
l Catalog and data path names must contain only alphanumeric characters and cannot
have leading space characters. Failure to comply with these restrictions will result in
database creation failure.
l Vertica refuses to overwrite a directory if it appears to be in use by another database.
Therefore, if you created a database for evaluation purposes, dropped the database,
and want to reuse the database name, make sure that the disk storage location
previously used has been completely cleaned up. See Managing Storage Locations
for details.
Specifying Disk Storage Location During Database Creation
When you invoke the Create Database command in the Administration Tools, a dialog
box allows you to specify the catalog and data locations. These locations must exist on
each host in the cluster and must be owned by the database administrator.
When you click OK, Vertica automatically creates the following subdirectories:
catalog-pathname/database-name/node-name_catalog/
data-pathname/database-name/node-name_data/
For example, if you use the default value (the database administrator's home directory)
of /home/dbadmin for the Stock Exchange example database, the catalog and data
directories are created on each node in the cluster as follows:
/home/dbadmin/Stock_Schema/stock_schema_node1_host01_catalog
/home/dbadmin/Stock_Schema/stock_schema_node1_host01_data
Notes
l Catalog and data path names must contain only alphanumeric characters and cannot
have leading space characters. Failure to comply with these restrictions will result in
database creation failure.
l Vertica refuses to overwrite a directory if it appears to be in use by another database.
Therefore, if you created a database for evaluation purposes, dropped the database,
and want to reuse the database name, make sure that the disk storage location
previously used has been completely cleaned up. See Managing Storage Locations
for details.
Specifying Disk Storage Location on MC
You can use the MC interface to specify where you want to store database metadata on
the cluster in the following ways:
l When you configure MC the first time
l When you create new databases using MC
See Configuring Management Console.
Configuring Disk Usage to Optimize Performance
Once you have created your initial storage location, you can add additional storage
locations to the database later. Not only does this provide additional space, it lets you
control disk usage and increase I/O performance by isolating files that have different I/O
or access patterns. For example, consider:
l Isolating execution engine temporary files from data files by creating a separate
storage location for temp space.
l Creating labeled storage locations and storage policies, in which selected database
objects are stored on different storage locations based on measured performance
statistics or predicted access patterns.
See Managing Storage Locations for details.
Using Shared Storage With Vertica
If using shared SAN storage, ensure there is no contention among the nodes for disk
space or bandwidth.
l Each host must have its own catalog and data locations. Hosts cannot share catalog
or data locations.
l Configure the storage so that there is enough I/O bandwidth for each node to access
the storage independently.
Viewing Database Storage Information
You can view node-specific information on your Vertica cluster through the Management
Console. See Monitoring Vertica Using MC for details.
Disk Space Requirements for Vertica
In addition to actual data stored in the database, Vertica requires disk space for several
data reorganization operations, such as mergeout and managing nodes in the cluster.
For best results, HPE recommends that disk utilization per node be no more than sixty
percent (60%) for a K-Safe=1 database to allow such operations to proceed.
In addition, disk space is temporarily required by certain query execution operators,
such as hash joins and sorts, in the case when they cannot be completed in memory
(RAM). Such operators might be encountered during queries, recovery, refreshing
projections, and so on. The amount of disk space needed (known as temp space)
depends on the nature of the queries, amount of data on the node and number of
concurrent users on the system. By default, any unused disk space on the data disk can
be used as temp space. However, HPE recommends provisioning temp space separate
from data disk space. See Configuring Disk Usage to Optimize Performance.
Disk Space Requirements for Management Console
You can install MC on any node in the cluster, so there are no special disk requirements
for MC—other than disk space you would normally allocate for your database cluster.
See Disk Space Requirements for Vertica.
Prepare the Logical Schema Script
Designing a logical schema for a Vertica database is no different from designing one
for any other SQL database. Details are described more fully in Designing a Logical
Schema.
To create your logical schema, prepare a SQL script (plain text file, typically with an
extension of .sql) that:
1. Creates additional schemas (as necessary). See Using Multiple Schemas.
2. Creates the tables and column constraints in your database using the CREATE
TABLE command.
3. Defines the necessary table constraints using the ALTER TABLE command.
4. Defines any views on the table using the CREATE VIEW command.
You can generate a script file using:
l A schema designer application.
l A schema extracted from an existing database.
l A text editor.
l One of the example database example-name_define_schema.sql scripts as a
template. (See the example database directories in /opt/vertica/examples.)
In your script file, make sure that:
l Each statement ends with a semicolon.
l You use data types supported by Vertica, as described in the SQL Reference
Manual.
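For example, a minimal sketch of a logical schema script that follows these steps (all
schema, table, column, and view names are illustrative):
CREATE SCHEMA store;
CREATE TABLE store.store_dimension (
    store_key INTEGER NOT NULL,
    store_name VARCHAR(64)
);
ALTER TABLE store.store_dimension
    ADD CONSTRAINT pk_store PRIMARY KEY (store_key);
CREATE VIEW store.store_names AS
    SELECT store_key, store_name FROM store.store_dimension;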
Once you have created a database, you can test your schema script by executing it as
described in Create the Logical Schema. If you encounter errors, drop all tables, correct
the errors, and run the script again.
Prepare Data Files
Prepare two sets of data files:
l Test data files. Use test files to test the database after the partial data load. If
possible, use part of the actual data files to prepare the test data files.
l Actual data files. Once the database has been tested and optimized, use your data
files for the initial bulk load; see Bulk Loading Data.
How to Name Data Files
Name each data file to match the corresponding table in the logical schema. Case does
not matter.
Use the extension .tbl or whatever you prefer. For example, if a table is named
Stock_Dimension, name the corresponding data file stock_dimension.tbl. When
using multiple data files, append _nnn (where nnn is a positive integer in the range 001
to 999) to the file name. For example, stock_dimension.tbl_001, stock_
dimension.tbl_002, and so on.
Prepare Load Scripts
Note: You can postpone this step if your goal is to test a logical schema design for
validity.
Prepare SQL scripts to load data directly into physical storage using the
COPY...DIRECT statement from vsql, or through ODBC as described in Connecting to
Vertica.
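For example, a minimal sketch of such a load script (the table name, file path, and
delimiter are illustrative):
=> COPY Stock_Dimension FROM '/data/stock_dimension.tbl_001' DELIMITER '|' DIRECT;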
You need scripts that load the:
l Large tables
l Small tables
HPE recommends that you load large tables using multiple files. To test the load
process, use files of 10GB to 50GB in size. This size provides several advantages:
l You can use one of the data files as a sample data file for the Database Designer.
l You can load just enough data to Perform a Partial Data Load before you load the
remainder.
l If a single load fails and rolls back, you do not lose an excessive amount of time.
l Once the load process is tested, for multi-terabyte tables, break up the full load in file
sizes of 250–500GB.
See Bulk Loading Data and the following additional topics for details:
l Bulk Loading Data
l Using Load Scripts
l Using Parallel Load Streams
l Loading Data into Pre-Join Projections
l Enforcing Constraints
l About Load Errors
Tip: You can use the load scripts included in the example databases in Getting
Started as templates.
Create an Optional Sample Query Script
The purpose of a sample query script is to test your schema and load scripts for errors.
Include a sample of queries your users are likely to run against the database. If you don't
have any real queries, just write simple SQL that collects counts on each of your tables.
Alternatively, you can skip this step.
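For example, a minimal sample query script (the table names are illustrative):
SELECT COUNT(*) FROM store.store_dimension;
SELECT COUNT(*) FROM stock_dimension;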
Create an Empty Database
Two options are available for creating an empty database:
l Using the Management Console
l Using Administration Tools
Although you can create more than one database (for example, one for production and
one for testing), there can be only one active database for each installation of Vertica
Analytics Platform.
Creating a Database Name and Password
Database names must conform to the following rules:
l Be between 1 and 30 characters
l Begin with a letter
l Follow with any combination of letters (upper and lowercase), numbers, and/or
underscores.
Database names are case sensitive; however, HPE strongly recommends that you do
not create databases with names that differ only in case. For example, do not create a
database called mydatabase and another called MyDataBase.
Database Passwords
Database passwords can contain letters, digits, and special characters listed in the next
table.
Passwords cannot include space characters or any non-ASCII Unicode characters. A
database password must be from 8 to 100 characters long.
You use Profiles to specify and control password definitions. For instance, a profile can
define the maximum length, reuse time, and the minimum number of required digits for a
password, as well as other details. You can also change password definitions using
ALTER PROFILE.
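For example, a hypothetical profile that enforces a minimum password length and digit
count, then assigns it to a user (the profile name, limits, and user name are illustrative):
=> CREATE PROFILE strict_profile LIMIT PASSWORD_MIN_LENGTH 12 PASSWORD_MIN_DIGITS 2;
=> ALTER USER some_user PROFILE strict_profile;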
The following table lists special (ASCII) characters that Vertica permits in database
passwords. Special characters can appear anywhere within a password string; for
example, mypas$word or $mypassword or mypassword$ are all permitted.
Caution: Using special characters in database passwords that are not listed in the
following table could cause database instability.
Character Description
#         pound sign
!         exclamation point
+         plus sign
*         asterisk
?         question mark
,         comma
.         period
/         forward slash
=         equals sign
~         tilde
-         minus sign
$         dollar sign
_         underscore
:         colon
(space)   space
"         double quote
'         single quote
%         percent sign
&         ampersand
(         left parenthesis
)         right parenthesis
;         semicolon
<         less than sign
>         greater than sign
@         at sign
`         back quote
[         left square bracket
]         right square bracket
\         backslash
^         caret
|         vertical bar
{         left curly bracket
}         right curly bracket
See Also
l Password Guidelines
l ALTER PROFILE
l CREATE PROFILE
l DROP PROFILE
Create an Empty Database Using MC
You can create a new database on an existing Vertica cluster through the Management
Console interface.
Database creation can be a long-running process, lasting from minutes to hours,
depending on the size of the target database. You can close the web browser during the
process and sign back in to MC later; the creation process continues unless an
unexpected error occurs. See the Notes section below the procedure on this page.
You currently need to use command line scripts to define the database schema and
load data. Refer to the topics in Configuration Procedure. You should also run the
Database Designer, which you access through the Administration Tools, to create either
a comprehensive or incremental design. Consider using the Tutorial in Getting Started
to create a sample database you can start monitoring immediately.
How to Create an Empty Database on an MC-managed Cluster
1. If you are already on the Databases and Clusters page, skip to the next step;
otherwise:
a. Connect to MC and sign in as an MC administrator.
b. On the Home page, click Existing Infrastructure to view the Databases and
Clusters page.
2. If no databases exist on the cluster, continue to the next step; otherwise:
a. If a database is running on the cluster on which you want to add a new database,
select the database and click Stop.
b. Wait for the running database to have a status of Stopped.
3. Click the cluster on which you want to create the new database and click Create
Database.
4. The Create Database wizard opens. Provide the following information:
n Database name and password. See Creating a Database Name and Password
for rules.
n Optionally click Advanced to open the advanced settings and change the port
and catalog, data, and temporary data paths. By default the MC application/web
server port is 5450 and paths are /home/dbadmin, or whatever you defined for
the paths when you ran the Cluster Creation Wizard or the install_vertica
script. Do not use the default agent port 5444 as a new setting for the MC port.
See MC Settings > Configuration for port values.
5. Click Continue.
6. Select nodes to include in the database.
The Database Configuration window opens with the options you provided and a
graphical representation of the nodes appears on the page. By default, all nodes are
selected to be part of this database (denoted by a green check mark). You can
optionally click each node and clear Include host in new database to exclude that
node from the database. Excluded nodes are gray. If you change your mind, click
the node and select the Include check box.
7. Click Create in the Database Configuration window to create the database on the
nodes.
The creation process takes a few moments, after which the database starts and a
Success message appears on the interface.
8. Click OK to close the success message.
The Manage page opens and displays the database nodes. Nodes not included in the
database are colored gray, which means they are standby nodes you can include later.
To add nodes to or remove nodes from your Vertica cluster (as opposed to including
standby nodes), you must run the install_vertica script.
Notes
l If warnings occur during database creation, nodes will be marked on the UI with an
Alert icon and a message.
n Warnings do not prevent the database from being created, but you should address
warnings after the database creation process completes by viewing the database
Message Center from the MC Home page.
n Failure messages display on the database Manage page with a link to more
detailed information and a hint with an actionable task that you must complete
before you can continue. Problem nodes are colored red for quick identification.
n To view more detailed information about a node in the cluster, double-click the
node from the Manage page, which opens the Node Details page.
l To create MC users and grant them access to an MC-managed database, see About
MC Users and Creating an MC User.
See Also
l Creating a Cluster Using MC
l Troubleshooting with MC Diagnostics
l Restarting MC
Create a Database Using Administration Tools
1. Run the Administration Tools from your Administration Host as follows:
$ /opt/vertica/bin/admintools
If you are using a remote terminal application, such as PuTTY or a Cygwin bash
shell, see Notes for Remote Terminal Users.
2. Accept the license agreement and specify the location of your license file. For more
information, see Managing Licenses.
This step is necessary only the first time you run the Administration Tools.
3. On the Main Menu, click Configuration Menu, and click OK.
4. On the Configuration Menu, click Create Database, and click OK.
5. Enter the name of the database and an optional comment, and click OK. See
Creating a Database Name and Password for naming guidelines and restrictions.
6. Establish the superuser password for your database.
n To provide a password enter the password and click OK. Confirm the password
by entering it again, and then click OK.
n If you don't want to provide the password, leave it blank and click OK. If you don't
set a password, Vertica prompts you to verify that you truly do not want to
establish a superuser password for this database. Click Yes to create the
database without a password or No to establish the password.
Caution: If you do not enter a password at this point, the superuser password is
set to empty. Unless the database is for evaluation or academic purposes, HPE
strongly recommends that you enter a superuser password. See Creating a
Database Name and Password for guidelines.
7. Select the hosts to include in the database from the list of hosts specified when
Vertica was installed (install_vertica -s), and click OK.
8. Specify the directories in which to store the data and catalog files, and click OK.
Note: Do not use a shared directory for more than one node. Data and catalog
directories must be distinct for each node. Multiple nodes must not be allowed to
write to the same data or catalog directory.
9. Catalog and data path names must contain only alphanumeric characters and
cannot have leading spaces. Failure to comply with these restrictions results in
database creation failure.
For example:
Catalog pathname: /home/dbadmin
Data Pathname: /home/dbadmin
10. Review the Current Database Definition screen to verify that it represents the
database you want to create, and then click Yes to proceed or No to modify the
database definition.
11. If you click Yes, Vertica creates the database you defined and then displays a
message to indicate that the database was successfully created.
Note: For databases created with 3 or more nodes, Vertica automatically sets
K-safety to 1 to ensure that the database is fault tolerant in case a node fails. For
more information, see Failure Recovery in the Administrator's Guide and
MARK_DESIGN_KSAFE.
12. Click OK to acknowledge the message.
Create the Logical Schema
1. Connect to the database.
In the Administration Tools Main Menu, click Connect to Database and click OK.
See Connecting to the Database for details.
The vsql welcome script appears:
Welcome to vsql, the Vertica Analytic Database interactive terminal.
Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit
=>
2. Run the logical schema script.
Use the \i meta-command in vsql to run the SQL logical schema script that you
prepared earlier.
3. Disconnect from the database.
Use the \q meta-command in vsql to return to the Administration Tools.
Perform a Partial Data Load
HPE recommends that for large tables, you perform a partial data load and then test
your database before completing a full data load. This load should load a representative
amount of data.
1. Load the small tables.
Load the small table data files using the SQL load scripts and data files you
prepared earlier.
2. Partially load the large tables.
Load 10GB to 50GB of table data for each table using the SQL load scripts and data
files that you prepared earlier.
For more information about projections, see Physical Schema in Vertica Concepts.
Test the Database
Test the database to verify that it is running as expected.
Check queries for syntax errors and execution times.
1. Use the vsql \timing meta-command to enable the display of query execution time
in milliseconds.
2. Execute the SQL sample query script that you prepared earlier.
3. Execute several ad hoc queries.
Optimize Query Performance
Optimizing the database consists of optimizing for compression and tuning for queries.
(See Creating a Database Design.)
To optimize the database, use the Database Designer to create and deploy a design for
optimizing the database. See the Tutorial in Getting Started for an example of using the
Database Designer to create a Comprehensive Design.
After you have run the Database Designer, use the techniques described in Optimizing
Query Performance in Analyzing Data to improve the performance of certain types of
queries.
Note: The database response time depends on factors such as type and size of the
application query, database design, data size and data types stored, available
computational power, and network bandwidth. Adding nodes to a database cluster
does not necessarily improve the system response time for every query, especially if
the response time is already short (for example, less than 10 seconds) or the response time
is not hardware bound.
Complete the Data Load
To complete the load:
1. Monitor system resource usage
Continue to run the top, free, and df utilities and watch them while your load
scripts are running (as described in Monitoring Linux Resource Usage). You can do
this on any or all nodes in the cluster. Make sure that the system is not swapping
excessively (watch kswapd in top) or running out of swap space (watch for a large
amount of used swap space in free).
Note: Vertica requires a dedicated server. If your loader or other processes take
up significant amounts of RAM, it can result in swapping.
2. Complete the large table loads
Run the remainder of the large table load scripts.
Test the Optimized Database
Check query execution times to test your optimized design:
1. Use the vsql \timing meta-command to enable the display of query execution
time in milliseconds.
Execute a SQL sample query script to test your schema and load scripts for errors.
Note: Include a sample of queries your users are likely to run against the
database. If you don't have any real queries, just write simple SQL that collects
counts on each of your tables. Alternatively, you can skip this step.
2. Execute several ad hoc queries
a. Run Administration Tools and select Connect to Database.
b. Use the \i meta-command to execute the query script; for example:
vmartdb=> \i vmart_query_03.sql
 customer_name | annual_income
------------------+---------------
James M. McNulty | 999979
Emily G. Vogel | 999998
(2 rows)
Time: First fetch (2 rows): 58.411 ms. All rows formatted: 58.448 ms
vmartdb=> \i vmart_query_06.sql
store_key | order_number | date_ordered
-----------+--------------+--------------
45 | 202416 | 2004-01-04
113 | 66017 | 2004-01-04
121 | 251417 | 2004-01-04
24 | 250295 | 2004-01-04
9 | 188567 | 2004-01-04
166 | 36008 | 2004-01-04
27 | 150241 | 2004-01-04
148 | 182207 | 2004-01-04
198 | 75716 | 2004-01-04
(9 rows)
Time: First fetch (9 rows): 25.342 ms. All rows formatted: 25.383 ms
Once the database is optimized, it should run queries efficiently. If you discover queries
that you want to optimize, you can modify and update the design. See Incremental
Design in the Administrator's Guide.
Set Up Incremental (Trickle) Loads
Once you have a working database, you can use trickle loading to load new data while
concurrent queries are running.
Trickle load is accomplished by using the COPY command (without the DIRECT
keyword) to load 10,000 to 100,000 rows per transaction into the WOS. This allows
Vertica to batch multiple loads when it writes data to disk. While the COPY command
defaults to loading into the WOS, it writes data to ROS if the WOS is full.
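For example, a minimal sketch of a trickle load statement (the table name, file path, and
delimiter are illustrative); omitting the DIRECT keyword loads the rows into the WOS:
=> COPY sales_fact FROM '/data/sales_incremental.tbl' DELIMITER '|';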
See Trickle Loading Data for details.
See Also
l COPY
l Loading Data Through ODBC
Implement Locales for International Data Sets
The locale is a parameter that defines the user's language, country, and any special
variant preferences, such as collation. Vertica uses the locale to determine the behavior
of certain string functions. The locale also determines the collation for various SQL
commands that require ordering and comparison, such as GROUP BY, ORDER BY,
joins, and the analytic ORDER BY clause.
By default, the locale for your Vertica database is en_US@collation=binary (English
US). You can define a new default locale that is used for all sessions on the database.
You can also override the locale for individual sessions. However, projections are
always collated using the default en_US@collation=binary collation, regardless of
the session collation. Any locale-specific collation is applied at query time.
You can set the locale through ODBC, JDBC, and ADO.NET.
ICU Locale Support
Vertica uses the ICU library for locale support; you must specify locale using the ICU
locale syntax. The locale used by the database session is not derived from the
operating system (through the LANG variable), so Hewlett Packard Enterprise
recommends that you set the LANG for each node running vsql, as described in the next
section.
While ICU library services can specify collation, currency, and calendar preferences,
Vertica supports only the collation component. Any keywords not relating to collation are
rejected. Projections are always collated using the en_US@collation=binary collation
regardless of the session collation. Any locale-specific collation is applied at query time.
The SET DATESTYLE TO ... command provides some aspects of the calendar, but
Vertica supports only dollars as currency.
Changing DB Locale for a Session
This example sets the session locale to Thai.
1. At the operating-system level for each node running vsql, set the LANG variable to
the locale language as follows:
export LANG=th_TH.UTF-8
Note: If setting the LANG variable as shown does not work, the operating system support
for locales may not be installed.
2. For each Vertica session (from ODBC/JDBC or vsql) set the language locale.
From vsql:
\locale th_TH
3. From ODBC/JDBC:
"SET LOCALE TO th_TH;"
4. In PuTTY (or your ssh terminal), change the settings as follows:
settings > window > translation > UTF-8
5. Click Apply and then click Save.
All data loaded must be in UTF-8 format, not an ISO format, as described in Loading
UTF-8 Format Data. Character sets like ISO 8859-1 (Latin1), which are incompatible
with UTF-8, are not supported, so functions like SUBSTRING do not work correctly for
multibyte characters. Consequently, ISO locale settings do not work correctly. If the translation
setting ISO-8859-11:2001 (Latin/Thai) appears to work, the data was loaded incorrectly. To convert
data correctly, use a utility program such as Linux iconv.
Note: The maximum length parameter for VARCHAR and CHAR data type refers to
the number of octets (bytes) that can be stored in that field, not the number of
characters. When using multi-byte UTF-8 characters, make sure to size fields to
accommodate from 1 to 4 bytes per character, depending on the data.
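For example, a VARCHAR(24) column can store 24 single-byte ASCII characters, but as
few as six 4-byte UTF-8 characters.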
See Also
l Supported Locales
l About Locales
l SET LOCALE
l ICU User Guide
Specify the Default Locale for the Database
After you start the database, the default locale configuration parameter,
DefaultSessionLocale, sets the initial locale. You can override this value for
individual sessions.
To set the locale for the database, use the configuration parameter as follows:
=> ALTER DATABASE mydb SET DefaultSessionLocale = 'ICU-locale-identifier';
For example:
=> ALTER DATABASE mydb SET DefaultSessionLocale = 'en_GB';
Override the Default Locale for a Session
To override the default locale for a specific session, use one of the following commands:
l The vsql meta-command
\locale <ICU-locale-identifier>
For example:
=> \locale en_GB
INFO 2567: Canonical locale: 'en_GB'
Standard collation: 'LEN'
English (United Kingdom)
l The statement SET LOCALE TO <ICU-locale-identifier>.
=> SET LOCALE TO en_GB;
INFO 2567: Canonical locale: 'en_GB'
Standard collation: 'LEN'
English (United Kingdom)
You can also use the Short Form of a locale in either of these commands:
=> SET LOCALE TO LEN;
INFO 2567: Canonical locale: 'en'
Standard collation: 'LEN'
English
=> \locale LEN
INFO 2567: Canonical locale: 'en'
Standard collation: 'LEN'
English
You can use these commands to override the locale as many times as needed during a
database session. The session locale setting applies to any subsequent commands
issued in the session.
See Also
l SET LOCALE
Best Practices for Working with Locales
It is important to understand the distinction between the locale settings on the database
server and locale settings at the client application level. The server locale settings
impact only the collation behavior for server-side query processing. The client
application is responsible for verifying that the correct locale is set in order to display the
characters correctly. Hewlett Packard Enterprise recommends the following best
practices to ensure predictable results:
Server Locale
The server session locale should be set as described in Specify the Default Locale for
the Database. If you are using different locales in different sessions, set the server
locale at the start of each session from your client.
vsql Client
l If the database does not have a default session locale, set the server locale for the
session to the desired locale, as described in Override the Default Locale for a
Session.
l The locale setting in the terminal emulator where the vsql client runs should be set to
match the session locale setting on the server side (ICU locale). By doing so,
the data is collated correctly on the server and displayed correctly on the client.
l All input data for vsql should be in UTF-8, and all output data is encoded in UTF-8.
l Vertica does not support non-UTF-8 encodings or associated locale values.
l For instructions on setting locale and encoding, refer to your terminal emulator
documentation.
ODBC Clients
l ODBC applications can be either in ANSI or Unicode mode. If the user application is
Unicode, the encoding used by ODBC is UCS-2. If the user application is ANSI, the
data must be in single-byte ASCII, which is compatible with UTF-8 used on the
database server. The ODBC driver converts UCS-2 to UTF-8 when passing to the
Vertica server and converts data sent by the Vertica server from UTF-8 to UCS-2.
l If the user application is not already in UCS-2, the application must convert the input
data to UCS-2, or unexpected results could occur. For example:
n For non-UCS-2 data passed to ODBC APIs, when it is interpreted as UCS-2, it
could result in an invalid UCS-2 symbol being passed to the APIs, resulting in
errors.
n The symbol provided in the alternate encoding could be a valid UCS-2 symbol. If
this occurs, incorrect data is inserted into the database.
l If the database does not have a default session locale, ODBC applications should set
the desired server session locale using SQLSetConnectAttr (if different from
database wide setting). By doing so, you get the expected collation and string
functions behavior on the server.
JDBC and ADO.NET Clients
l JDBC and ADO.NET applications use a UTF-16 character set encoding and are
responsible for converting any non-UTF-16 encoded data to UTF-16. The same
cautions apply as for ODBC if this encoding is violated.
l The JDBC and ADO.NET drivers convert UTF-16 data to UTF-8 when passing to the
Vertica server and convert data sent by Vertica server from UTF-8 to UTF-16.
l If there is no default session locale at the database level, JDBC and ADO.NET
applications should set the correct server session locale by executing the SET
LOCALE TO command in order to get the expected collation and string functions
behavior on the server. For more information, see SET LOCALE.
Usage Considerations
Session related:
l The locale setting is session scoped and applies only to queries (no DML/DDL)
executed in that session. You cannot specify a locale for an individual query.
l You can set the default locale for new sessions using the DefaultSessionLocale
configuration parameter.
Query related:
The following restrictions apply when queries are run with locale other than the default
en_US@collation=binary:
l Multicolumn NOT IN subqueries are not supported when one or more of the left-side
NOT IN columns is of CHAR or VARCHAR type. For example:
=> CREATE TABLE test (x VARCHAR(10), y INT);
=> SELECT ... FROM test WHERE (x,y) NOT IN (SELECT ...);
ERROR: Multi-expression NOT IN subquery is not supported because a left hand expression could
be NULL
Note: Even if columns test.x and test.y have a NOT NULL constraint, an
error occurs.
l If the outer query contains a GROUP BY on a CHAR or a VARCHAR column,
correlated HAVING clause subqueries are not supported. In the following example,
the GROUP BY x in the outer query causes the error:
=> DROP TABLE test CASCADE;
=> CREATE TABLE test (x VARCHAR(10));
=> SELECT COUNT(*) FROM test t GROUP BY x HAVING x
IN (SELECT x FROM test WHERE t.x||'a' = test.x||'a' );
ERROR: subquery uses ungrouped column "t.x" from outer query
l Subqueries that use analytic functions in the HAVING clause are not supported. For
example:
=> DROP TABLE test CASCADE;
=> CREATE TABLE test (x VARCHAR(10));
=> SELECT MAX(x) OVER(PARTITION BY 1 ORDER BY 1)
FROM test GROUP BY x HAVING x IN (SELECT MAX(x) FROM test);
ERROR: Analytics query with having clause expression that involves aggregates and subquery
is not supported
DML/DDL related:
l SQL identifiers (such as table names and column names) can use UTF-8 Unicode
characters. For example, the following CREATE TABLE statement uses the ß
(German eszett) in the table name:
=> CREATE TABLE straße(x int, y int);
CREATE TABLE
l Projection sort orders are made according to the default en_US@collation=binary
collation. Thus, regardless of the session setting, issuing the following command
creates a projection sorted by col1 according to the binary collation:
=> CREATE PROJECTION p1 AS SELECT * FROM table1 ORDER BY col1;
In such cases, straße and strasse are not stored near each other on disk.
Sorting by binary collation also means that sort optimizations do not work in locales
other than binary. Vertica returns the following warning if you create tables or
projections in a non-binary locale:
WARNING: Projections are always created and persisted in the default Vertica locale. The
current locale is de_DE
l When creating pre-join projections, the projection definition query does not respect
the locale or collation setting. When you insert data into the fact table of a pre-join
projection, referential integrity checks are not locale or collation aware.
For example:
=> \locale LDE_S1 -- German
=> CREATE TABLE dim (col1 varchar(20) primary key);
=> CREATE TABLE fact (col1 varchar(20) references dim(col1));
=> CREATE PROJECTION pj AS SELECT * FROM fact JOIN dim
ON fact.col1 = dim.col1 UNSEGMENTED ALL NODES;
=> INSERT INTO dim VALUES('ß');
=> COMMIT;
The following INSERT statement fails with a "nonexistent FK" error even though 'ß' is
in the dim table, and in the German locale 'SS' and 'ß' refer to the same character.
=> INSERT INTO fact VALUES('SS');
ERROR: Nonexistent foreign key value detected in FK-PK join (fact x dim)
using subquery and dim_node0001; value SS
=> ROLLBACK;
=> DROP TABLE dim, fact CASCADE;
l When the locale is non-binary, Vertica uses the COLLATION function to transform the
input to a binary string that sorts in the proper order.
This transformation increases the number of bytes required for the input according to
this formula:
result_column_width = input_octet_width * CollationExpansion + 4
The default value of the CollationExpansion configuration parameter is 5.
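For example, with the default setting of 5, a 20-octet input requires 20 * 5 + 4 = 104
bytes after transformation.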
l CHAR fields are displayed as fixed length, including any trailing spaces. When
CHAR fields are processed internally, they are first stripped of trailing spaces. For
VARCHAR fields, trailing spaces are usually treated as significant characters;
however, trailing spaces are ignored when sorting or comparing either type of
character string field using a non-BINARY locale.
Change Transaction Isolation Levels
By default, Vertica uses the READ COMMITTED isolation level for every session. If you
prefer, you can change the default isolation level for the database or for a specific
session.
To change the isolation level for a specific session, use the SET SESSION
CHARACTERISTICS command.
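For example, to switch the current session to the SERIALIZABLE isolation level:
=> SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;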
To change the isolation level for the database, use the TransactionIsolationLevel
configuration parameter. Once modified, Vertica uses the new transaction level for every
new session.
To set the isolation level for the database to SERIALIZABLE or READ COMMITTED:
=> ALTER DATABASE mydb SET TransactionIsolationLevel = 'SERIALIZABLE';
=> ALTER DATABASE mydb SET TransactionIsolationLevel = 'READ COMMITTED';
To see the value of the isolation level:
=> SHOW TRANSACTION_ISOLATION;
A change to isolation level only applies to future sessions. Existing sessions and their
transactions continue to use the original isolation level.
A transaction retains its isolation level until it completes, even if the session's isolation
level changes during the transaction. Vertica internal processes (such as the Tuple
Mover and refresh operations) and DDL operations always run at the SERIALIZABLE
isolation level to ensure consistency.
See Also
l Transactions
l Configuration Parameters
l SET SESSION CHARACTERISTICS
l SHOW
Configuration Parameters
Configuration parameters are settings that affect database behavior. You can use
configuration parameters to enable, disable, or tune features related to different
database aspects like Tuple Mover, security, Database Designer, or projections.
Configuration parameters have default values, stored in the Vertica database.
You can modify certain parameters to configure your Vertica database in two ways:
l Management Console browser-based interface
l VSQL statements
Before you modify a database parameter, review all documentation about the parameter
to determine the context under which you can change it. Some parameter changes
require a database restart to take effect. The CHANGE_REQUIRES_RESTART column in
the system table CONFIGURATION_PARAMETERS indicates whether a parameter requires
a restart.
Managing Configuration Parameters: Management Console
To change database settings for any MC-managed database, click the Settings tab at
the bottom of the Overview, Activity, or Manage pages. The database must be running.
The Settings page defaults to parameters in the General category. To change other
parameters, click an option from the tab panel on the left.
Some settings require you to restart the database, and MC prompts you to do so. You
can ignore the prompt, but those changes take effect only after the database restarts.
Some settings are specific to Management Console, such as changing MC or agent port
assignments. For more information, see Managing MC Settings in Using Management
Console.
Managing Configuration Parameters: VSQL
You can configure all parameters at database scope. Some parameters can also be set
and cleared at node and session scopes.
Caution: Vertica is designed to operate with minimal configuration changes. Be
careful to set and change configuration parameters according to documented
guidelines.
For detailed information about managing configuration parameters, see:
l Viewing Configuration Parameter Values
l Setting Configuration Parameter Values
l Clearing Configuration Parameters
Viewing Configuration Parameter Values
You can view active configuration parameter values in two ways:
l SHOW statements
l Query related system tables
SHOW Statements
Use the following SHOW statements to view active configuration parameters:
l SHOW CURRENT: Returns settings of active configuration parameter values. Vertica
checks settings at all levels, in the following ascending order of precedence:
n session
n node
n database
If no values are set at any scope, SHOW CURRENT returns the parameter's default
value.
l SHOW DATABASE: Displays configuration parameter values set for the database.
l SHOW SESSION: Displays configuration parameter values set for the current session.
l SHOW NODE: Displays configuration parameter values set for a node.
If a configuration parameter requires a restart to take effect, the values in a
SHOW CURRENT statement might differ from values in other SHOW statements. To see
which parameters require restart, query the CONFIGURATION_PARAMETERS system table.
System Tables
You can query two system tables for configuration parameters:
l SESSION_PARAMETERS returns session-scope parameters.
l CONFIGURATION_PARAMETERS returns parameters for all scopes: database, node,
and session.
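For example, a sketch showing both approaches for a single parameter
(MaxClientSessions is used for illustration):
=> SHOW CURRENT MaxClientSessions;
=> SELECT parameter_name, current_value, default_value, change_requires_restart
   FROM CONFIGURATION_PARAMETERS
   WHERE parameter_name = 'MaxClientSessions';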
Setting Configuration Parameter Values
You can set configuration parameters at three scopes:
l Database
l Node
l Session
Database Scope
You can set one or more parameter values at the database scope with
ALTER DATABASE..SET:
ALTER DATABASE dbname SET parameter-name = value[,...];
For example:
ALTER DATABASE mydb SET AnalyzeRowCountInterval = 3600, FailoverToStandbyAfter = '5 minutes';
Node Scope
You can set one or more parameter values at the node scope with ALTER NODE..SET:
ALTER NODE node-name SET parameter-name = value[,...];
For example, to prevent clients from connecting to node01, set the MaxClientSessions
configuration parameter to 0:
=> ALTER NODE node01 SET MaxClientSessions = 0;
Session Scope
You can set one or more parameter values at the session scope with
ALTER SESSION..SET:
ALTER SESSION SET parameter-name = value[,...];
For example:
=> ALTER SESSION SET ForceUDxFencedMode = 1;
Clearing Configuration Parameters
You can clear configuration parameters at three scopes:
l Database
l Node
l Session
Database Scope
You can clear one or more parameter values at the database scope with
ALTER DATABASE..CLEAR. Vertica resets the parameter to its default value:
ALTER DATABASE dbname CLEAR parameter-name[,...];
For example:
ALTER DATABASE mydb CLEAR AnalyzeRowCountInterval, FailoverToStandbyAfter;
Node Scope
You can clear one or more parameter values at the node scope with
ALTER NODE..CLEAR. Vertica resets the parameter to its database setting, if any. If the
parameter is not set at the database scope, Vertica resets the parameter to its default
value.
ALTER NODE node-name CLEAR parameter-name[,...];
The following example clears MaxClientSessions on node node01:
ALTER NODE node01 CLEAR MaxClientSessions;
Session Scope
You can clear one or more parameter values at the session scope with
ALTER SESSION..CLEAR. Vertica resets the parameter to its node or database setting, if
any. If the parameter is not set at either scope, Vertica resets the parameter to its default
value.
ALTER SESSION CLEAR parameter-name[,...];
For example:
=> ALTER SESSION CLEAR ForceUDxFencedMode;
Configuration Parameter Categories
Vertica configuration parameters are grouped into the following categories:
General Parameters
Tuple Mover Parameters
Projection Parameters
Epoch Management Parameters
Monitoring Parameters
Profiling Parameters
Security Parameters
Database Designer Parameters
Internationalization Parameters
Data Collector Parameters
Text Search Parameters
Kerberos Authentication Parameters
HCatalog Connector Parameters
User-Defined Session Parameters
Constraint Enforcement Parameters
General Parameters
You use these general parameters to configure Vertica.
Parameter Description
AnalyzeRowCountInterval Specifies how often Vertica checks the number
of projection rows and whether the threshold
set by ARCCommitPercentage has been
crossed.
For more information, see Collecting Statistics.
Default Value: 60 seconds
ARCCommitPercentage Sets the threshold percentage of WOS to ROS
rows, which determines when to aggregate
projection row counts and commit the result to
the catalog.
Default Value: 3 (percent)
CompressCatalogOnDisk Compresses the size of the catalog on disk
when enabled (value set to 1 or 2).
Default Value: 0
Valid Values:
l 1: Compress checkpoints, but not logs
l 2: Compress checkpoints and logs
Consider enabling this parameter if the catalog
disk partition is small (<50 GB) and the
metadata is large (hundreds of tables,
partitions, or nodes).
CompressNetworkData Compresses all data sent over the internal
network when enabled (value set to 1). This
compression speeds up network traffic at the
expense of added CPU load. If the network is
throttling database performance, enable
compression to correct the issue.
Default Value: 0
DatabaseHeartBeatInterval Determines the interval (in seconds) at which
each node performs a health check and
communicates a heartbeat. If a node does not
receive a message within five times the
specified interval, the node is evicted from the
cluster. Setting the interval to 0 disables the
feature.
Default Value: 120
See Automatic Eviction of Unhealthy Nodes.
EnableCooperativeParse Implements multi-threaded parsing capabilities
on a node. You can use this parameter for both
delimited and fixed-width loads. Enabled by
default.
Default Value: 1
EnableDataTargetParallelism Enables multiple threads for sorting and writing
data to ROS, improving data loading
performance. Enabled by default.
Default Value: 1
EnableForceOuter Determines whether Vertica uses a table's
force_outer value to implement a join. For
more information, see Controlling Join Inputs.
Default Value: 0 (forced join inputs disabled)
EnableResourcePoolCPUAffinity Aligns queries to the resource pool of the
processing CPU. When disabled (value is set
to 0), queries run on any CPU, regardless of the
CPU_AFFINITY_SET of the resource pool.
Enabled by default.
Default Value: 1
EnableStorageBundling Enables storing multiple storage container
ROSes as a single file. Each ROS must be less
than the size specified in
MaxBundleableROSSizeKB. In environments
with many small storage files, bundling
improves the performance of any file-intensive
operations, including backups, restores,
mergeouts and moveouts.
Default Value: 1
EnableUniquenessOptimization Enables query optimization that is based on
guaranteed uniqueness of column values.
Columns that can be guaranteed to include
unique values include:
l Columns that are defined with AUTO_
INCREMENT or IDENTITY constraints
l Primary key columns where key constraints
are enforced
l Columns that are constrained to unique
values, either individually or as a set
Default Value: 1 (enabled)
EnableWithClauseMaterialization Enables materialization of WITH clause results.
When materialization is enabled, Vertica
evaluates each WITH clause once and stores
results in a temporary table. This parameter can
only be set at session level.
For more information, see WITH Clauses in
SELECT in Analyzing Data.
Default Value: 0 (disabled)
ExternalTablesExceptionsLimit Determines the maximum number of COPY
exceptions and rejections allowed when a
SELECT statement references an external table.
Set to -1 to remove any exceptions limit. See
Validating External Tables.
Default Value: 100
FailoverToStandbyAfter Specifies the length of time that an active
standby node waits before taking the place of a
failed node.
This parameter takes Interval Values.
Default Value: None
FencedUDxMemoryLimitMB Sets the maximum amount of memory, in
megabytes (MB), that a fenced-mode UDF can
use. If a UDF attempts to allocate more memory
than this limit, that attempt triggers an
exception. For more information, see Fenced
Mode in Extending Vertica.
Default Value: -1 (no limit)
FlexTableDataTypeGuessMultiplier Specifies the multiplier to use for a key value
when creating a view for a flex keys table.
Default Value: 2.0
See Setting Flex Table Parameters.
FlexTableRawSize Defines the default size (in bytes) of the __raw__
column of a flex table. The maximum value is
32000000. See Setting Flex Table Parameters.
Default Value: 130000
JavaBinaryForUDx Sets the full path to the Java executable that
Vertica uses to run Java UDxs. See Installing
Java on Hosts in Extending Vertica.
JavaClassPathForUDx Sets the Java classpath for the JVM that
executes Java UDxs.
Default Value: ${vertica_home}
/packages/hcat/lib/*
Required Values: Must list all directories
containing JAR files that Java UDxs import.
See Handling Java UDx Dependencies in
Extending Vertica.
MaxAutoSegColumns Specifies the number of columns (0–1024) to
segment automatically when creating auto-
projections from COPY and INSERT INTO
statements. Setting this parameter to zero (0)
uses all columns in the hash segmentation
expression.
Default Value: 32
MaxBundleableROSSizeKB Specifies the minimum size, in kilobytes, of an
independent ROS file. When
EnableStorageBundling is true, Vertica bundles
storage container ROS files below this size into
a single file. Bundling improves the
performance of any file-intensive operations,
including backups, restores, mergeouts and
moveouts.
If you enable storage bundling and specify this
parameter with a value of 0, Vertica bundles
.fdb and .pidx files without bundling other
storage container files.
Default Value: 1024
MaxClientSessions Determines the maximum number of client
sessions that can run on a single node of the
database. The default value allows for five
additional administrative logins. These logins
prevent DBAs from being locked out of the
system if non-dbadmin users reach the login
limit.
Default Value: 50 user logins and 5 additional
administrative logins
Tip: Setting this parameter to 0 prevents new
client sessions from being opened while you
are shutting down the database. Restore the
parameter to its original setting after you restart
the database. See the section "Interrupting and
Closing Sessions" in Managing Sessions.
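Example (a sketch of the shutdown procedure described in the tip; the second
statement restores the default shown above):
ALTER DATABASE mydb SET MaxClientSessions = 0;
-- after the database restarts:
ALTER DATABASE mydb SET MaxClientSessions = 50;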
PatternMatchAllocator Overrides the heap memory allocator for the
pattern-match library when set to 1. The Perl
Compatible Regular Expressions (PCRE)
pattern-match library evaluates regular
expressions. Restart the database for this
parameter to take effect. For more information,
see Regular Expression Functions.
Default Value: 0
PatternMatchingUseJit Enables just-in-time compilation (to machine
code) of regular expression pattern matching
functions used in queries. Using this parameter
can usually improve pattern matching
performance on large tables. The Perl
Compatible Regular Expressions (PCRE)
pattern-match library evaluates regular
expressions. Restart the database for this
parameter to take effect. For more information,
see Regular Expression Functions.
Default Value: 1
PcreJitStackMaxSizeScaleFactor Determines the maximum size of the Perl
Compatible Regular Expressions (PCRE) just-
in-time stack. The maximum stack size will be
PcreJitStackMaxSizeScaleFactor * 1024 * 1024
bytes.
Default Value: 32
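For example, with the default scale factor of 32, the maximum JIT stack size is
32 * 1024 * 1024 = 33,554,432 bytes (32 MB).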
PatternMatchStackAllocator Overrides the stack memory allocator for the
pattern-match library when set to 1. The Perl
Compatible Regular Expressions (PCRE)
pattern-match library evaluates regular
expressions. Restart the database for this
parameter to take effect. For more information,
see Regular Expression Functions.
Default Value: 1
SegmentAutoProjection Determines whether auto-projections are
segmented by default. Set to 0 to disable.
Default Value: 1
TerraceRoutingFactor Determines when terrace routing is enabled. The
default value is large enough that terrace routing is
effectively disabled, even for the largest clusters.
Use the Terrace Routing equation to find an
appropriate value for your cluster. For more
information, see Terrace Routing.
Default Value: 1000.0
TransactionIsolationLevel Changes the isolation level for the database.
After modification, Vertica uses the new
transaction level for every new session.
Existing sessions and their transactions
continue to use the original isolation level. See
Change Transaction Isolation Levels.
Default Value: READ COMMITTED
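Example (a sketch; SERIALIZABLE is shown as an alternative to the default):
ALTER DATABASE mydb SET TransactionIsolationLevel = 'SERIALIZABLE';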
TransactionMode Specifies whether transactions are read/write or
read-only. Read/write is the default. Existing
sessions and their transactions continue to use
the original mode.
Default Value: READ WRITE
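Example (a sketch; the value shown is illustrative):
ALTER DATABASE mydb SET TransactionMode = 'READ ONLY';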
Tuple Mover Parameters
These parameters control how the Tuple Mover operates.
Parameters Description
ActivePartitionCount Sets the number of partitions, called active partitions, that are
currently being loaded. For information about how the Tuple
Mover treats active (and inactive) partitions during a
mergeout operation, see Understanding the Tuple Mover.
Default Value: 1
Example:
ALTER DATABASE mydb SET ActivePartitionCount = 2;
MergeOutInterval The number of seconds the Tuple Mover waits between
checks for new ROS files to merge out. If ROS containers are
added frequently, you may need to decrease this value.
Default Value: 600
Example:
ALTER DATABASE mydb SET MergeOutInterval = 1200;
MoveOutInterval The number of seconds the Tuple Mover waits between
checks for new data in the WOS to move to ROS.
Default Value: 300
Example:
ALTER DATABASE mydb SET MoveOutInterval = 600;
MoveOutMaxAgeTime Specifies the interval (in seconds) after which the Tuple
Mover is forced to write the WOS to disk. The default
interval is 30 minutes.
Tip: If you had been running the force_moveout.sh script
in previous releases, you no longer need to run it.
Default Value: 1800
Example:
ALTER DATABASE mydb SET MoveOutMaxAgeTime = 1200;
MoveOutSizePct The percentage of the WOS that can be filled with data
before the Tuple Mover performs a moveout operation.
Default Value: 0
Example:
ALTER DATABASE mydb SET MoveOutSizePct = 50;
Projection Parameters
The following configuration parameters help you manage projections.
Parameters Description
AnalyzeRowCountInterval Specifies how often Vertica checks the
number of projection rows and whether the
threshold set by ARCCommitPercentage
has been crossed.
For more information, see Collecting
Statistics.
Default Value: 60 seconds
ARCCommitPercentage Sets the threshold percentage of WOS to
ROS rows, which determines when to
aggregate projection row counts and commit
the result to the catalog.
Default Value: 3 (percent)
ContainersPerProjectionLimit Specifies how many ROS containers Vertica
creates per projection before ROS pushback
occurs.
Default Value: 1024
Caution: Increasing this parameter's
value can cause serious degradation of
database performance. Vertica strongly
recommends that you not modify this
parameter without first consulting with
Customer Support professionals.
EnableGroupByProjections When set to 1, you can create live aggregate
projections. For more information, see Live
Aggregate Projections.
Default Value: 1
EnableExprsInProjections When set to 1, you can create projections
that use expressions to calculate column
values. For more information, see
Aggregating Data Through Expressions.
Default Value: 1
EnableSingleNamedUnsegProjections Determines whether replicas of an
unsegmented projection all map to a single
name, or conform to pre-Vertica 7.2
behavior. Set this variable to one of the
following values:
l 0: Projection names conform to pre-7.2
behavior, where each projection instance
on a given node has a unique identifier
that conforms to this convention:
projection-basename_nodeID
l 1: All instances of a new unsegmented
projection map to a single name. For
more information, see Creating
Unsegmented Projections.
Default Value: 1
EnableTopKProjections When set to 1, you can create Top-K
projections that let you retrieve Top-K data
quickly. For more information, see Top-K
Projections.
Default Value: 1
MaxAutoSegColumns Specifies the number of columns (0–1024)
to segment automatically when creating
auto-projections from COPY and INSERT
INTO statements.
Set to 0 to use all columns in the hash
segmentation expression.
Default Value: 32
SegmentAutoProjection Determines whether auto-projections are
segmented by default. Set to 0 to disable.
Default Value: 1
Epoch Management Parameters
The following table describes the epoch management parameters for configuring
Vertica.
Parameters Description
AdvanceAHMInterval Determines how frequently (in seconds) Vertica checks the
history retention status.
Note: AdvanceAHMInterval cannot be set to a value that
is less than the EpochMapInterval.
Default Value: 180 (3 minutes)
Example:
ALTER DATABASE mydb SET AdvanceAHMInterval = '3600';
EpochMapInterval Determines the granularity of mapping between epochs
and time available to historical queries. When a historical
query's AT TIME T request is issued, Vertica maps it to an
epoch within a granularity of EpochMapInterval seconds. It
similarly affects the time reported for Last Good Epoch
during Failure Recovery. Note that it does not affect
internal precision of epochs themselves.
Tip: Decreasing this interval increases the number of
epochs saved on disk. Therefore, consider reducing the
HistoryRetentionTime parameter to limit the number of
history epochs that Vertica retains.
Default Value: 180 (3 minutes)
Example:
ALTER DATABASE mydb SET EpochMapInterval = '300';
HistoryRetentionTime Determines how long deleted data is saved (in seconds)
as an historical reference. When the specified time since
the deletion has passed, you can purge the data. Use the -1
setting if you prefer to use HistoryRetentionEpochs to
determine which deleted data can be purged.
Note: The default setting of 0 effectively prevents the use
of the Administration Tools 'Roll Back Database to Last
Good Epoch' option because the AHM remains close to the
current epoch and a rollback is not permitted to an epoch
prior to the AHM.
Tip: If you rely on the Roll Back option to remove recently
loaded data, consider setting a day-wide window to
remove loaded data. For example:
ALTER DATABASE mydb SET HistoryRetentionTime = 86400;
Default Value: 0 (Data saved when nodes are down.)
Example:
ALTER DATABASE mydb SET HistoryRetentionTime = '240';
HistoryRetentionEpochs Specifies the number of historical epochs to save, and
therefore, the amount of deleted data.
Unless you have a reason to limit the number of epochs,
HPE recommends that you specify the time over which
deleted data is saved.
If you specify both History parameters,
HistoryRetentionTime takes precedence. Setting both
parameters to -1 preserves all historical data.
See Setting a Purge Policy.
Default Value: -1 (Disabled)
Example:
ALTER DATABASE mydb SET HistoryRetentionEpochs = '40';
Monitoring Parameters
The following table describes the monitoring parameters for configuring Vertica.
Parameters Description
SnmpTrapDestinationsList
Defines where Vertica sends traps for SNMP. See
Configuring Reporting for SNMP.
Default Value: none
Example:
ALTER DATABASE mydb SET SnmpTrapDestinationsList = 'localhost 162
public';
SnmpTrapsEnabled Enables event trapping for SNMP. See Configuring
Reporting for SNMP.
Default Value: 0
Example:
ALTER DATABASE mydb SET SnmpTrapsEnabled = 1;
SnmpTrapEvents Defines which events Vertica traps through SNMP. See
Configuring Reporting for SNMP.
Default Value: Low Disk Space, Read Only File System,
Loss of K Safety, Current Fault Tolerance at Critical Level,
Too Many ROS Containers, WOS Over Flow, Node State
Change, Recovery Failure, and Stale Checkpoint
Example:
ALTER DATABASE mydb SET SnmpTrapEvents = 'Low Disk
Space, Recovery Failure';
SyslogEnabled Enables event trapping for syslog. See Configuring
Reporting for Syslog.
Default Value: 0
Example:
ALTER DATABASE mydb SET SyslogEnabled = 1;
SyslogEvents Defines events that generate a syslog entry. See
Configuring Reporting for Syslog.
Default Value: none
Example:
ALTER DATABASE mydb SET SyslogEvents = 'Low Disk
Space, Recovery Failure';
SyslogFacility Defines which SyslogFacility Vertica uses. See
Configuring Reporting for Syslog.
Default Value: user
Example:
ALTER DATABASE mydb SET SyslogFacility = 'ftp';
Profiling Parameters
The following table describes the profiling parameters for configuring Vertica. See
Profiling Database Performance for more information on profiling queries.
Parameters Description
GlobalEEProfiling Enables profiling for query execution runs in all sessions on
all nodes.
Default Value: 0
Example:
ALTER DATABASE mydb SET GlobalEEProfiling = 1;
GlobalQueryProfiling Enables query profiling for all sessions on all nodes.
Default Value: 0
Example:
ALTER DATABASE mydb SET GlobalQueryProfiling = 1;
GlobalSessionProfiling Enables session profiling for all sessions on all nodes.
Default Value: 0
Example:
ALTER DATABASE mydb SET GlobalSessionProfiling = 1;
Security Parameters
Use these client authentication configuration parameters and general security
parameters to configure security.
Parameters Description
EnableAllRolesOnLogin
Automatically enables all roles granted to a user once that
user logs in. Enabling this eliminates the need for the user to
run SET ROLE <rolenames>. Valid values are:
0 - does not automatically enable roles
1 - automatically enables roles
Default Value: 0
EnabledCipherSuites
Indicates which SSL cipher-suites to use for secure client-
server communication.
Default Value: ALL:!ADH:!LOW:!EXP:!MD5:!RC4:@STRENGTH
This setting excludes weaker cipher suites.
Find a complete mapping of cipher suite names from JSSE to
OpenSSL at openssl.org.
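Example (a sketch; the cipher string shown is illustrative, not a
recommendation):
ALTER DATABASE mydb SET EnabledCipherSuites = 'HIGH:!aNULL:!MD5';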
EnableSSL Enables SSL for the server. See Implementing SSL.
Default Value: 0
Example:
ALTER DATABASE mydb SET EnableSSL = '1';
RestrictSystemTables
Prohibits non-database administrator users from viewing
sensitive information in system tables. Valid values are:
0 - Allows all users to access system tables
1 - Limits access to system tables to database administrator
users
Default Value: 0
See System Table Restriction.
SecurityAlgorithm Sets the algorithm for the hash function that hash
authentication uses: MD5 or SHA-512.
Default Value: 'NONE'
Example:
ALTER DATABASE mydb SET SecurityAlgorithm = 'MD5';
ALTER DATABASE mydb SET SecurityAlgorithm = 'SHA512';
SSLCA Sets the SSL certificate authority.
Default Value: No default value
Example:
ALTER DATABASE mydb SET SSLCA = '<contents of certificate authority
root.crt file>';
Include the contents of the certificate authority file,
root.crt, but do not include the file name.
SSLCertificate Sets the SSL certificate. If your SSL certificate is a certificate
chain, cut and paste only the top-most certificate of the
certificate chain to set this value.
Default Value: No default value
Example:
ALTER DATABASE mydb SET SSLCertificate = '<contents of server.crt
file>';
Include the contents of the server.crt file, but do not include
the file name.
Note: This parameter gets set automatically during
upgrade to 7.1 if you set EnableSSL=1 prior to the
upgrade.
SSLPrivateKey Specifies the server's private key. The value of this parameter
is visible only to dbadmin users.
Default Value: No default value
Example:
ALTER DATABASE mydb SET SSLPrivateKey = '<contents of server.key
file>';
Include the contents of the server.key file, but do not include
the file name.
Note: This parameter gets set automatically during
upgrade to 7.1 if you set EnableSSL=1 prior to the
upgrade.
View parameter values with the SHOW DATABASE statement. You must be a database
superuser to view the values:
SHOW DATABASE mydb SSLCertificate;
See Also
Kerberos Authentication Parameters
Configuring SSL
Database Designer Parameters
The following table describes the parameters for configuring the Vertica Database
Designer.
Parameter Description
DBDCorrelationSampleRowCount
Minimum number of table rows at which
Database Designer discovers and records
correlated columns.
Default Value: 4000
Example:
ALTER DATABASE mydb SET DBDCorrelationSampleRowCount =
3000;
DBDLogInternalDesignProcess Enables or disables Database Designer logging.
Default value: False
Examples:
ALTER DATABASE mydb SET DBDLogInternalDesignProcess = '1';
ALTER DATABASE mydb SET DBDLogInternalDesignProcess = '0';
Internationalization Parameters
The following table describes the internationalization parameters for configuring Vertica.
Parameters Description
DefaultIntervalStyle Sets the default interval style to use. If set to 0 (default),
the interval is in PLAIN style (the SQL standard), with no
interval units on output. If set to 1, the interval is in
UNITS style on output. This parameter does not take effect
until the database is restarted.
Default Value: 0
Example:
ALTER DATABASE mydb SET DefaultIntervalStyle = 1;
DefaultSessionLocale Sets the default session startup locale for the database.
This parameter does not take effect until the database
is restarted.
Default Value: en_US@collation=binary
Example:
ALTER DATABASE mydb SET DefaultSessionLocale = 'en_GB';
EscapeStringWarning Issues a warning when backslashes are used in a
string literal. This is provided to help locate
backslashes that are being treated as escape
characters so they can be fixed to follow the
standard-conforming string syntax instead.
Default Value: 1
Example:
ALTER DATABASE mydb SET EscapeStringWarning = '1';
StandardConformingStrings Determines whether ordinary string literals ('...') treat
backslashes (\) as string literals or escape characters.
When set to '1', backslashes are treated as string
literals; when set to '0', backslashes are treated as
escape characters.
Tip: To treat backslashes as escape characters, use
the Extended string syntax:
(E'...');
See String Literals (Character) in the SQL Reference
Manual.
Default Value: 1
Example:
ALTER DATABASE mydb SET StandardConformingStrings = '0';
Data Collector Parameters
The following table lists the Data Collector parameter for configuring Vertica.
Parameter Description
EnableDataCollector Enables and disables the Data Collector, which is the
Workload Analyzer's internal diagnostics utility. Affects all
sessions on all nodes. Use 0 to turn off data collection.
Default value: 1 (Enabled)
Example:
ALTER DATABASE mydb SET EnableDataCollector = 0;
For more information, see the following topics in the SQL Reference Manual:
l Data Collector Functions
l ANALYZE_WORKLOAD
l V_MONITOR.DATA_COLLECTOR
l V_MONITOR.TUNING_RECOMMENDATIONS
See also the following topics in the Administrator's Guide:
l Retaining Monitoring Information
l Analyzing Workloads
l Tuning Recommendations
l Analyzing Workloads Through Management Console and Through an API
Text Search Parameters
You can configure Vertica for text search using these parameters.
Parameters Description
TextIndexMaxTokenLength
Controls the maximum size of a token in a text index. If
the parameter is set to a value greater than 65000
characters, then the tokenizer truncates the token at
65000 characters.
You should avoid setting this parameter near 65000 (the
maximum value). Doing so can result in a significant
decrease in performance. For optimal performance, the
parameter should be set to the maximum token value of
your tokenizer.
Default Value: 128 characters
Example:
ALTER DATABASE mydb SET TextIndexMaxTokenLength = 760;
Kerberos Authentication Parameters
The following parameters let you configure the Vertica principal for Kerberos
authentication and specify the location of the Kerberos keytab file.
Parameter Description
KerberosServiceName
Provides the service name portion of the Vertica Kerberos
principal. By default, this parameter is 'vertica'. For example:
vertica/host@EXAMPLE.COM.
KerberosHostname [Optional] Provides the instance or host name portion of the
Vertica Kerberos principal. For example:
vertica/host@EXAMPLE.COM
If you omit the optional KerberosHostname parameter,
Vertica uses the return value from the gethostname()
function. Assuming each cluster node has a different host
name, those nodes will each have a different principal, which
you must manage in that node's keytab file.
KerberosRealm Provides the realm portion of the Vertica Kerberos principal. A
realm is the authentication administrative domain and is
usually formed in uppercase letters; for example:
vertica/host@EXAMPLE.COM.
KerberosKeytabFile Provides the location of the keytab file that contains
credentials for the Vertica Kerberos principal. By default, this
file is located in /etc. For example:
KerberosKeytabFile=/etc/krb5.keytab.
Notes:
l The principal must take the form
KerberosServiceName/KerberosHostName@KerberosRealm
l The keytab file must be readable by the file owner who is
running the process (typically the Linux dbadmin user
assigned file permissions 0600).
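For example, the following sketch sets all four parameters so that the resulting
principal is vertica/host.example.com@EXAMPLE.COM. It follows the ALTER DATABASE
pattern used for other parameters in this guide; all names shown are illustrative:
ALTER DATABASE mydb SET KerberosServiceName = 'vertica';
ALTER DATABASE mydb SET KerberosHostname = 'host.example.com';
ALTER DATABASE mydb SET KerberosRealm = 'EXAMPLE.COM';
ALTER DATABASE mydb SET KerberosKeytabFile = '/etc/krb5.keytab';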
HCatalog Connector Parameters
The following table describes the parameters for configuring the HCatalog Connector.
See Using the HCatalog Connector in Integrating with Hadoop for more information.
Parameter Description
HCatConnectionTimeout The number of seconds the HCatalog Connector waits for
a successful connection to the WebHCat server before
returning a timeout error.
Default Value: 0 (Wait indefinitely)
Requires Restart: No
Example:
ALTER DATABASE mydb SET HCatConnectionTimeout = 30;
HCatSlowTransferLimit The lowest transfer speed (in bytes per second) that the
HCatalog Connector allows when retrieving data from the
WebHCat server. In some cases, the data transfer rate
from the WebHCat server to Vertica is below this
threshold. In such cases, after the number of seconds
specified in the HCatSlowTransferTime parameter pass,
the HCatalog Connector cancels the query and closes the
connection.
Default Value: 65536
Requires Restart: No
Example:
ALTER DATABASE mydb SET HCatSlowTransferLimit = 32000;
HCatSlowTransferTime The number of seconds the HCatalog Connector waits
before testing whether the data transfer from the WebHCat
server is too slow. See the HCatSlowTransferLimit
parameter.
Default Value: 60
Requires Restart: No
Example:
ALTER DATABASE mydb SET HCatSlowTransferTime = 90;
Note: You can override these configuration parameters when creating an HCatalog
schema. See CREATE HCATALOG SCHEMA in the SQL Reference Manual for an
explanation.
User-Defined Session Parameters
Use the following parameter to configure the maximum length of a value in user-
defined session parameters.
Parameter Description
MaxSessionUDParameterSize Sets the maximum length of a value in a user-
defined session parameter.
Default Value: 1000
Example:
=> ALTER SESSION SET MaxSessionUDParameterSize = 2000;
Related Topics
l User-Defined Session Parameters
Constraint Enforcement Parameters
The following configuration parameters enforce PRIMARY and UNIQUE key
constraints.
Use the ALTER DATABASE statement to set these parameters. You do not need to
restart your database after setting them.
l The parameter settings apply for any PRIMARY or UNIQUE key constraint that you
have not explicitly enabled or disabled within a CREATE TABLE or ALTER TABLE
statement.
l Any new PRIMARY or UNIQUE key constraint that you create or alter is set
according to the value of the corresponding parameter unless you specifically
enabled or disabled the constraint.
Important: Setting a constraint as enabled or disabled when you create or alter it
using CREATE TABLE or ALTER TABLE overrides the parameter setting.
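Before enabling enforcement by default, you can check existing data for violations
with the ANALYZE_CONSTRAINTS function, for example (the table name is illustrative):
=> SELECT ANALYZE_CONSTRAINTS('public.customer_dimension');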
Parameters Description
EnableNewPrimaryKeysByDefault
Set to 1 to automatically enable newly created
PRIMARY KEY constraints that you specified
through CREATE TABLE or ALTER TABLE
statements. However, if you have explicitly
disabled a constraint when you created or altered
it, it is not enforced.
Default Value: 0 (Disabled)
Example:
ALTER DATABASE mydb SET EnableNewPrimaryKeysByDefault
= 1;
EnableNewUniqueKeysByDefault Set to 1 to automatically enable newly created
UNIQUE constraints that you specified through
CREATE TABLE or ALTER TABLE statements.
However, if you have explicitly disabled a
constraint when you created or altered it, it is not
enforced.
Default Value: 0 (Disabled)
Example:
ALTER DATABASE mydb SET EnableNewUniqueKeysByDefault
= 1;
Note: Vertica recommends enabling primary key enforcement if you have enabled
unique key enforcement.
Vertica Library for Amazon Web Services Parameters
Use these parameters to configure the Vertica Library for Amazon Web Services (AWS).
All parameters listed are case sensitive.
Parameter Description
aws_id Your AWS access key ID.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='<YOUR AWS ID>';
aws_secret Your AWS secret access key.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='<YOUR AWS KEY>';
aws_region The region where your S3 bucket is located. aws_region can only be
configured with one region at a time. If you need to access buckets in
multiple regions, you must re-set the parameter each time you change
regions.
Default value: us-east-1
You can find more information about AWS regions in the Amazon
Documentation.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_region='<REGION IDENTIFIER>';
For more information, see the following topics in the Administrator's Guide:
l AWS Library
l Configuring Vertica AWS Library
l Export AWS Library
l Import AWS Library
Designing a Logical Schema
Designing a logical schema for a Vertica database is the same as designing for any
other SQL database. A logical schema consists of objects such as schemas, tables,
views, and referential integrity constraints that are visible to SQL users. Vertica supports
any relational schema design that you choose.
Using Multiple Schemas
Using a single schema is effective if there is only one database user or if a few users
cooperate in sharing the database. In many cases, however, it makes sense to use
additional schemas to allow users and their applications to create and access tables in
separate namespaces. For example, using additional schemas allows:
l Many users to access the database without interfering with one another.
Individual schemas can be configured to grant specific users access to the schema
and its tables while restricting others.
l Third-party applications to create tables that have the same name in different
schemas, preventing table collisions.
Unlike in other RDBMSs, a schema in a Vertica database is not a collection of objects
bound to one user.
Multiple Schema Examples
This section provides examples of when and how you might want to use multiple
schemas to separate database users. These examples fall into two categories: using
multiple private schemas and using a combination of private schemas (i.e. schemas
limited to a single user) and shared schemas (i.e. schemas shared across multiple
users).
Using Multiple Private Schemas
Using multiple private schemas is an effective way of separating database users from
one another when sensitive information is involved. Typically a user is granted access
to only one schema and its contents, thus providing database security at the schema
level. Database users can be running different applications, multiple copies of the same
application, or even multiple instances of the same application. This enables you to
consolidate applications on one database to reduce management overhead and use
resources more effectively. The following examples highlight using multiple private
schemas.
l Using Multiple Schemas to Separate Users and Their Unique Applications
In this example, both database users work for the same company. One user
(HRUser) uses a Human Resource (HR) application with access to sensitive
personal data, such as salaries, while another user (MedUser) accesses information
regarding company healthcare costs through a healthcare management application.
HRUser should not be able to access company healthcare cost information and
MedUser should not be able to view personal employee data.
To grant these users access to data they need while restricting them from data they
should not see, two schemas are created with appropriate user access, as follows:
n HRSchema—A schema owned by HRUser that is accessed by the HR
application.
n HealthSchema—A schema owned by MedUser that is accessed by the healthcare
management application.
l Using Multiple Schemas to Support Multitenancy
This example is similar to the last example in that access to sensitive data is limited
by separating users into different schemas. In this case, however, each user is using
a virtual instance of the same application.
An example of this is a retail marketing analytics company that provides data and
software as a service (SaaS) to large retailers to help them determine which
promotional methods they use are most effective at driving customer sales.
In this example, each database user equates to a retailer, and each user only has
access to its own schema. The retail marketing analytics company provides a virtual
instance of the same application to each retail customer, and each instance points to
the user’s specific schema in which to create and update tables. The tables in these
schemas use the same names because they are created by instances of the same
application, but they do not conflict because they are in separate schemas.
Examples of schemas in this database could be:
n MartSchema—A schema owned by MartUser, a large department store chain.
n PharmSchema—A schema owned by PharmUser, a large drug store chain.
l Using Multiple Schemas to Migrate to a Newer Version of an Application
Using multiple schemas is an effective way of migrating to a new version of a
software application. In this case, a new schema is created to support the new
version of the software, and the old schema is kept as long as necessary to support
the original version of the software. This is called a “rolling application upgrade.”
For example, a company might use a HR application to store employee data. The
following schemas could be used for the original and updated versions of the
software:
n HRSchema—A schema owned by HRUser, the schema user for the original HR
application.
n V2HRSchema—A schema owned by V2HRUser, the schema user for the new
version of the HR application.
Using Combinations of Private and Shared Schemas
The previous examples illustrate cases in which all schemas in the database are private
and no information is shared between users. However, users might want to share
common data. In the retail case, for example, MartUser and PharmUser might want to
compare their per store sales of a particular product against the industry per store sales
average. Since this information is an industry average and is not specific to any retail
chain, it can be placed in a schema on which both users are granted USAGE privileges.
(For more information about schema privileges, see Schema Privileges.)
Examples of schemas in this database could be:
l MartSchema—A schema owned by MartUser, a large department store chain.
l PharmSchema—A schema owned by PharmUser, a large drug store chain.
l IndustrySchema—A schema owned by DBUser (from the retail marketing analytics
company) on which both MartUser and PharmUser have USAGE privileges. It is
unlikely that retailers would be given any privileges beyond USAGE on the schema
and SELECT on one or more of its tables.
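A minimal sketch of the grants behind this arrangement might look like the
following (the table name is illustrative):
=> GRANT USAGE ON SCHEMA IndustrySchema TO MartUser, PharmUser;
=> GRANT SELECT ON IndustrySchema.StoreAverages TO MartUser, PharmUser;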
Creating Schemas
You can create as many schemas as necessary for your database. For example, you
could create a schema for each database user. However, schemas and users are not
synonymous as they are in Oracle.
By default, only a superuser can create a schema or give a user the right to create a
schema. (See GRANT (Database) in the SQL Reference Manual.)
To create a schema use the CREATE SCHEMA statement, as described in the SQL
Reference Manual.
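For example, the following sketch creates a schema and lets another user create
objects in it (the names are illustrative):
=> CREATE SCHEMA Schema1;
=> GRANT CREATE ON SCHEMA Schema1 TO ExampleUser;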
Specifying Objects in Multiple Schemas
Once you create two or more schemas, each SQL statement or function must identify the
schema associated with the object you are referencing. You can specify an object's
schema by:
l Qualifying the object name by using the schema name and object name separated by
a dot. For example, to specify MyTable, located in Schema1, qualify the name as
Schema1.MyTable.
l Using a search path that includes the desired schemas when a referenced object is
unqualified. By setting search paths (see Setting Search Paths), Vertica automatically
searches the specified schemas to find the object.
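For example, both of the following approaches resolve MyTable to Schema1 (a
sketch using the illustrative names above):
=> SELECT * FROM Schema1.MyTable;
=> SET SEARCH_PATH TO Schema1, public;
=> SELECT * FROM MyTable;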
Setting Search Paths
The search path is a list of schemas where Vertica looks for tables and User Defined
Functions (UDFs) that are referenced without a schema name. For example, if a
statement references a table named Customers without naming the schema that
contains the table, and the search path is public, Schema1, and Schema2, Vertica first
searches the public schema for a table named Customers. If it does not find a table
named Customers in public, it searches Schema1 and then Schema2.
Vertica uses the first table or UDF it finds that matches the unqualified reference. If the
table or UDF is not found in any schema in the search path, Vertica reports an error.
Note: Vertica only searches for tables and UDFs in schemas to which the user has
access privileges. If the user does not have access to a schema in the search path,
Vertica silently skips the schema. It does not report an error or warning if the user's
search path contains one or more schemas to which the user does not have access
privileges. Any schemas in the search path that do not exist (for example, schemas
that have been deleted since being added to the search path) are also silently
ignored.
The first schema in the search path to which the user has access is called the current
schema. This is the schema where Vertica creates tables if a CREATE TABLE
statement does not specify a schema name.
The default schema search path is "$user", public, v_catalog, v_monitor, v_internal.
=> SHOW SEARCH_PATH;
name | setting
-------------+---------------------------------------------------
search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)
The $user entry in the search path is a placeholder that resolves to the current user
name, and public references the public schema. The v_catalog and v_monitor
schemas contain Vertica system tables, and the v_internal schema is for Vertica's
internal use.
Note: Vertica always ensures that the v_catalog, v_monitor, and v_internal schemas
are part of the schema search path.
The default search path has Vertica search for unqualified tables first in the user’s
schema, assuming that a separate schema exists for each user and that the schema
uses their user name. If such a user schema does not exist, or if Vertica cannot find the
table there, Vertica next searches the public schema, and then the v_catalog and
v_monitor built-in schemas.
A database administrator can set a user's default search schema when creating the user
by using the SEARCH_PATH parameter of the CREATE USER statement. An
administrator or the user can change the user's default search path using the ALTER
USER statement's SEARCH_PATH parameter. Changes made to the default search
path using ALTER USER affect new user sessions—they do not affect any current
sessions.
A user can use the SET SEARCH_PATH statement to override the schema search path
for the current session.
Tip: The SET SEARCH_PATH statement is equivalent in function to the
CURRENT_SCHEMA statement found in some other databases.
To see the current search path, use the SHOW SEARCH_PATH statement. To view the
current schema, use SELECT CURRENT_SCHEMA(). The CURRENT_SCHEMA()
function also shows the resolved name of $user.
The following example demonstrates displaying and altering the schema search path for
the current user session:
=> SHOW SEARCH_PATH;
name | setting
-------------+---------------------------------------------------
search_path | "$user", PUBLIC, v_catalog, v_monitor, v_internal
(1 row)
=> SET SEARCH_PATH TO SchemaA, "$user", public;
SET
=> SHOW SEARCH_PATH;
name | setting
-------------+------------------------------------------------------------
search_path | SchemaA, "$user", public, v_catalog, v_monitor, v_internal
(1 row)
You can use the DEFAULT keyword to reset the schema search path to the default.
=> SET SEARCH_PATH TO DEFAULT;
SET
=> SHOW SEARCH_PATH;
name | setting
-------------+---------------------------------------------------
search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)
To view the default schema search path for a user, query the search_path column of the
V_CATALOG.USERS system table:
=> SELECT search_path from USERS WHERE user_name = 'ExampleUser';
search_path
---------------------------------------------------
"$user", public, v_catalog, v_monitor, v_internal
(1 row)
=> ALTER USER ExampleUser SEARCH_PATH SchemaA,"$user",public;
ALTER USER
=> SELECT search_path from USERS WHERE user_name = 'ExampleUser';
search_path
------------------------------------------------------------
SchemaA, "$user", public, v_catalog, v_monitor, v_internal
(1 row)
=> SHOW SEARCH_PATH;
name | setting
-------------+---------------------------------------------------
search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)
Note that changing the default search path has no effect on the user's current session.
Even using the SET SEARCH_PATH DEFAULT statement does not set the search
path to the newly defined default value. It only has an effect in new sessions.
See Also
l Vertica System Tables
Creating Objects That Span Multiple Schemas
Vertica supports views or pre-join projections that reference tables across multiple
schemas. For example, a user might need to compare employee salaries to industry
averages. In this case, the application would query a shared schema (IndustrySchema)
for salary averages in addition to its own private schema (HRSchema) for company-
specific salary information.
Best Practice: When creating objects that span schemas, use qualified table
names. This naming convention avoids confusion if the query path or table structure
within the schemas changes at a later date.
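For example, a view following this convention might join the two schemas with
fully qualified names (the table and column names are illustrative):
=> CREATE VIEW SalaryReview AS
SELECT e.employee_id, e.salary, i.industry_avg
FROM HRSchema.employee_salaries e
JOIN IndustrySchema.salary_averages i ON e.job_title = i.job_title;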
Tables in Schemas
In Vertica you can create base tables and temporary tables, depending on your
objective. For example, base tables are created in the Vertica logical schema, while
temporary tables are useful for dividing complex query processing into multiple steps.
For more information, see Creating Tables and Creating Temporary Tables.
About Base Tables
The CREATE TABLE statement creates a table in the Vertica logical schema. The
example databases described in Getting Started include sample SQL scripts that
demonstrate this procedure. For example:
CREATE TABLE vendor_dimension (
vendor_key INTEGER NOT NULL PRIMARY KEY,
vendor_name VARCHAR(64),
vendor_address VARCHAR(64),
vendor_city VARCHAR(64),
vendor_state CHAR(2),
vendor_region VARCHAR(32),
deal_size INTEGER,
last_deal_update DATE
);
Automatic Projection Creation
To get your database up and running quickly, Vertica automatically creates a default
projection for each table created through the CREATE TABLE and CREATE
TEMPORARY TABLE statements. Each projection created automatically (or manually)
includes a base projection name prefix. You must use the projection prefix when altering
or dropping a projection (ALTER PROJECTION RENAME, DROP PROJECTION).
How you use CREATE TABLE determines when the projection is created:
l If you create a table without providing projection-related clauses, Vertica
automatically creates a superprojection for the table when you load data into the
table for the first time with INSERT or COPY. The projection is created in the same
schema as the table. After Vertica creates the projection, it loads the data.
l If you use CREATE TABLE to create a table from the results of a query
(CREATE TABLE AS SELECT), the projection is created immediately after the table,
and uses some of the properties of the underlying SELECT query.
l (Advanced users only) If CREATE TABLE includes any of the following parameters,
the default projection is created immediately on table creation using the specified
properties:
n column-definition (ENCODING encoding-type and ACCESSRANK integer)
n ORDER BY table-column
n hash-segmentation-clause
n UNSEGMENTED { NODE node | ALL NODES }
n KSAFE
Note: Before you define a superprojection as described above, see Creating
Custom Designs in the Administrator's Guide.
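For example, the following sketch creates a table whose superprojection is built
immediately with the specified sort order, segmentation, and K-safety (the names
and clauses are illustrative):
=> CREATE TABLE sales (sale_id INT, store_id INT, amount NUMERIC(10,2))
ORDER BY store_id
SEGMENTED BY HASH(sale_id) ALL NODES KSAFE 1;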
See Also
l Creating Base Tables
l CREATE TABLE
About Temporary Tables
You create temporary tables with the CREATE TEMPORARY TABLE statement.
Temporary tables can be used to divide complex query processing into multiple steps.
Typically, a reporting tool holds intermediate results while reports are generated—for
example, the tool first gets a result set, then queries the result set, and so on. You can
also write Subqueries.
Note: By default, all temporary table data is discarded when a COMMIT statement
ends the current transaction. If CREATE TEMPORARY TABLE includes the parameter
ON COMMIT PRESERVE ROWS, table data is retained until the current session ends.
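For example, the following sketch creates a temporary table whose rows survive
commits for the life of the session (the table is illustrative):
=> CREATE TEMPORARY TABLE report_results (customer_id INT, total NUMERIC(12,2))
ON COMMIT PRESERVE ROWS;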
Global Temporary Tables
Vertica creates global temporary tables in the public schema, with the data contents
private to the transaction or session through which data is inserted.
Global temporary table definitions are accessible to all users and sessions, so that two
(or more) users can access the same global table concurrently. However, whenever a
user commits or rolls back a transaction, or ends the session, Vertica removes the
global temporary table data automatically, so users see only data specific to their own
transactions or session.
Global temporary table definitions persist in the database catalogs until they are
removed explicitly through a DROP TABLE statement.
Local Temporary Tables
Local temporary tables are created in the V_TEMP_SCHEMA namespace and inserted into
the user's search path transparently. Each local temporary table is visible only to the
user who creates it, and only for the duration of the session in which the table is created.
When the session ends, Vertica automatically drops the table definition from the
database catalogs. You cannot preserve non-empty, session-scoped temporary tables
using the ON COMMIT PRESERVE ROWS clause.
Creating local temporary tables is significantly faster than creating regular tables, so you
should make use of them whenever possible.
Automatic Projection Creation and Characteristics
Vertica creates auto-projections for temporary tables when you load or insert data. The
default auto-projection for a temporary table has the following characteristics:
l It is a superprojection.
l It uses the default encoding-type AUTO.
l It is automatically segmented on the table's first several columns.
l Unless the table specifies otherwise, the projection's KSAFE value is set at the current
system K-safety level.
Auto-projections are defined by the table properties and creation methods, as follows:
Table Characteristic | Sort Order | Segmented On
Created from input stream (COPY or INSERT INTO) | Same as input stream, if sorted | PK column (if any), then all FK columns (if any), then the first 31 configurable columns of the table
Created from CREATE TABLE AS SELECT query | Same as input stream, if sorted | Same segmentation columns, if query output is segmented; same as load, if query output is unsegmented or unknown
FK and PK constraints | FK columns first, then PK columns | PK columns
FK constraints only | FK columns first, then remaining columns | Small data type (< 8 byte) columns first, then large data type columns
PK constraints only | PK columns | PK columns
No FK or PK constraints | On all columns | Small data type (< 8 byte) columns first, then large data type columns
As an advanced user, you can modify the default projection created through the CREATE
TEMPORARY TABLE statement by setting one or more of the following parameters:
l column-definition (temp table) (ENCODING encoding-type and ACCESSRANK integer)
l ORDER BY table-column
l Hash-Segmentation-Clause
l UNSEGMENTED { NODE node | ALL NODES }
l NO PROJECTION
Note: Before you define the superprojection in this manner, read Creating Custom
Designs in the Administrator's Guide.
See Also
l Creating Temporary Tables
l CREATE TEMPORARY TABLE
Creating a Database Design
A design is a physical storage plan that optimizes query performance. Data in Vertica is
physically stored in projections. When you initially load data into a table using INSERT,
COPY (or COPY LOCAL), Vertica creates a default superprojection for the table. This
superprojection ensures that all of the data is available for queries. However, these
superprojections might not optimize database performance, resulting in slow query
performance and low data compression.
To improve performance, create a design for your Vertica database that optimizes query
performance and data compression. You can create a design in several ways:
l Use Database Designer, a tool that recommends a design for optimal performance.
l Manually create a design.
l Use Database Designer to create an initial design and then manually modify it.
Database Designer can help you minimize how much time you spend on manual
database tuning. You can also use Database Designer to redesign the database
incrementally as requirements such as workloads change over time.
Database Designer runs as a background process. This is useful if you have a large
design that you want to run overnight. An active SSH session is not required, so design
and deploy operations continue to run uninterrupted if the session ends.
Tip: HPE recommends that you first globally optimize your database using the
Comprehensive setting in Database Designer. If the performance of the
comprehensive design is not adequate, you can design custom projections using an
incremental design or manually, as described in Creating Custom Designs.
About Database Designer
Vertica Database Designer uses sophisticated strategies to create a design that
provides excellent performance for ad-hoc queries and specific queries while using disk
space efficiently.
During the design process, Database Designer analyzes the logical schema definition,
sample data, and sample queries, and creates a physical schema (projections) in the
form of a SQL script that you deploy automatically or manually. This script creates a
minimal set of superprojections to ensure K-safety.
In most cases, the projections that Database Designer creates provide excellent query
performance within physical constraints while using disk space efficiently.
General Design Options
When you run Database Designer, several general options are available:
l Create a comprehensive or incremental design.
l Optimize for query execution, load, or a balance of both.
l Require K-safety.
l Recommend unsegmented projections when feasible.
l Analyze statistics before creating the design.
Design Input
Database Designer bases its design on the following information that you provide:
l Design queries that you typically run during normal database operations.
l Design tables that contain sample data.
Output
Database Designer yields the following output:
l A design script that creates the projections for the design in a way that meets the
optimization objectives and distributes data uniformly across the cluster.
l A deployment script that creates and refreshes the projections for your design. For
comprehensive designs, the deployment script contains commands that remove non-
optimized projections. The deployment script includes the full design script.
l A backup script that contains SQL statements to deploy the design that existed on the
system before deployment. This file is useful in case you need to revert to the pre-
deployment design.
Design Restrictions
Database Designer-generated designs:
l Exclude live aggregate or Top-K projections. You must create these manually. See
CREATE PROJECTION (Live Aggregate Projections).
l Do not sort, segment, or partition projections on LONG VARBINARY and LONG
VARCHAR columns.
Post-Design Options
While running Database Designer, you can choose to deploy your design automatically
after the deployment script is created, or to deploy it manually, after you have reviewed
and tested the design. Vertica recommends that you test the design on a non-production
server before deploying the design to your production server.
How Database Designer Creates a Design
Design Recommendations
Database Designer-generated designs can include the following recommendations:
l Sort buddy projections in the same order, which can significantly improve load,
recovery, and site node performance. All buddy projections have the same base
name so that they can be identified as a group.
Note: If you manually create projections, Database Designer recommends a
buddy with the same sort order, if one does not already exist. By default,
Database Designer recommends both super and non-super segmented
projections with a buddy of the same sort order and segmentation.
l Accepts unlimited queries for a comprehensive design.
l Allows you to analyze column correlations. Correlation analysis typically only needs
to be performed once and only if the table has more than
DBDCorrelationSampleRowCount (default: 4000) rows.
By default, Database Designer does not analyze column correlations. To set the
correlation analysis mode, use DESIGNER_SET_ANALYZE_CORRELATIONS_MODE.
l Identifies similar design queries and assigns them a signature.
For queries with the same signature, Database Designer weights the queries,
depending on how many queries have that signature. It then considers the weighted
query when creating a design.
l Recommends and creates projections in a way that minimizes data skew by
distributing data uniformly across the cluster.
l Produces higher quality designs by considering UPDATE, DELETE, and
SELECT statements.
Who Can Run Database Designer
To use Administration Tools to run Database Designer and create an optimal database
design, you must be a DBADMIN user.
To run Database Designer programmatically or using Management Console, you must
be one of the following:
l A DBADMIN user
l A user who has been assigned the DBDUSER role and owns the database tables for
which you are creating a design
Granting and Enabling the DBDUSER Role
For a non-DBADMIN user to be able to run Database Designer using Management
Console, follow the steps described in Allowing the DBDUSER to Run Database
Designer Using Management Console.
For a non-DBADMIN user to be able to run Database Designer programmatically,
follow the steps described in Allowing the DBDUSER to Run Database Designer
Programmatically.
Important: When you grant the DBDUSER role, make sure to associate a resource
pool with that user to manage resources during Database Designer runs. (For
instructions about how to associate a resource pool with a user, see User Profiles.)
Multiple users can run Database Designer concurrently without interfering with each
other or using up all the cluster resources. When a user runs Database Designer,
either using the Management Console or programmatically, its execution is mostly
contained by the user's resource pool, but may spill over into system resource pools
for less-intensive tasks.
Allowing the DBDUSER to Run Database Designer Using Management Console
To allow a user with the DBDUSER role to run Database Designer using Management
Console, you first need to create the user on the Vertica server.
As DBADMIN, take these steps on the server:
1. Add a temporary folder to all cluster nodes.
=> CREATE LOCATION '/tmp/dbd' ALL NODES;
2. Create the user who needs access to Database Designer.
=> CREATE USER new_user;
3. Grant the user the privilege to create schemas on the database for which they want
to create a design.
=> GRANT CREATE ON DATABASE new_database TO new_user;
4. Grant the DBDUSER role to the new user.
=> GRANT DBDUSER TO new_user;
5. On all nodes in the cluster, grant the user access to the temporary folder.
=> GRANT ALL ON LOCATION '/tmp/dbd' TO new_user;
6. Grant the new user access to the database schema and its tables.
=> GRANT ALL ON SCHEMA user_schema TO new_user;
=> GRANT ALL ON ALL TABLES IN SCHEMA user_schema TO new_user;
After you have completed this task, you need to do the following to map the MC user to
the new_user you created in the previous steps:
1. Log in to Management Console as an MC Super user.
2. Click MC Settings.
3. Click User Management.
4. To create a new MC user, click Add. To use an existing MC user, select the user
and click Edit.
5. Next to the DB access level window, click Add.
6. In the Add Permissions window, do the following:
a. From the Choose a database drop-down list, select the database for which you
want the user to be able to create a design.
b. In the Database username field, enter the user name you created on the Vertica
server, new_user in this example.
c. In the Database password field, enter the password for the database you
selected in step a.
d. In the Restrict access drop-down list, select the level of MC user you want for
this user.
7. Click OK to save your changes.
8. Log out of the MC Super user account.
The MC user is now mapped to the user that you created on the Vertica server. Log in
as the MC user and use Database Designer to create an optimized design for your
database.
For more information about mapping MC users, see Mapping an MC User to a Database
User's Privileges.
Allowing the DBDUSER to Run Database Designer Programmatically
To allow a user with the DBDUSER role to run Database Designer programmatically,
take these steps:
1. The DBADMIN user must grant the DBDUSER role:
=> GRANT DBDUSER TO <username>;
This role persists until the DBADMIN user revokes it.
2. For a non-DBADMIN user to run the Database Designer programmatically or using
Management Console, one of the following two steps must happen first:
n If the user's default role is already DBDUSER, skip this step. Otherwise, the user
must enable the DBDUSER role:
=> SET ROLE DBDUSER;
n The DBADMIN must add DBDUSER as the default role for that user:
=> ALTER USER <username> DEFAULT ROLE DBDUSER;
DBDUSER Capabilities and Limitations
The DBDUSER role has the following capabilities and limitations:
l A DBDUSER cannot create a design with a K-safety less than the system K-safety. If
the design violates the current K-safety by not having enough buddy projections for the
tables, the design does not complete.
l A DBDUSER cannot explicitly change the ancient history mark (AHM), even during
deployment of their design.
When you create a design, you automatically have privileges to manipulate the design.
Other tasks may require that the DBDUSER have additional privileges:
To... DBDUSER must have...
Add design tables l USAGE privilege on the design table schema
l OWNER privilege on the design table
Add a single design query l Privilege to execute the design query
Add a file of design queries l Read privilege on the storage location that
contains the query file
l Privilege to execute all the queries in the file
Add design queries from the
result of a user query
l Privilege to execute the user query
l Privilege to execute each design query retrieved
from the results of the user query
Create the design and
deployment scripts
l WRITE privilege on the storage location of the
design script
l WRITE privilege on the storage location of the
deployment script
Workflow for Running Database Designer
Vertica provides three ways to run Database Designer:
l Using Management Console to Create a Design
l Using Administration Tools to Create a Design
l About Running Database Designer Programmatically
The following workflow is common to all these ways to run Database Designer: specify
the design parameters, build the design, and then deploy it. These steps are described
in the sections that follow.
Logging Projection Data for Database Designer
When you run Database Designer, the Optimizer proposes a set of ideal projections
based on the options that you specify. When you deploy the design, Database Designer
creates the design based on these projections. However, space or budget constraints
may prevent Database Designer from creating all the proposed projections. In addition,
Database Designer may not be able to implement the projections using ideal criteria.
To get information about the projections, first enable the Database Designer logging
capability. When enabled, Database Designer stores information about the proposed
projections in two Data Collector tables. After Database Designer deploys the design,
these logs contain information about which proposed projections were actually created.
After deployment, the logs contain information about:
l Projections that the Optimizer proposed
l Projections that Database Designer actually created when the design was deployed
l Projections that Database Designer created, but not with the ideal criteria that the
Optimizer identified.
l The DDL used to create all the projections
l Column optimizations
If you do not deploy the design immediately, review the log to determine if you want to
make any changes. If the design has been deployed, you can still manually create some
of the projections that Database Designer did not create.
To enable the Database Designer logging capability, see Enabling Logging for
Database Designer
To view the logged information, see Viewing Database Designer Logs.
Enabling Logging for Database Designer
By default, Database Designer does not log information about the projections that
the Optimizer proposes and Database Designer deploys.
To enable Database Designer logging, enter the following command:
=> ALTER DATABASE mydb SET DBDLogInternalDesignProcess = 1;
To disable Database Designer logging, enter the following command:
=> ALTER DATABASE mydb SET DBDLogInternalDesignProcess = 0;
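To verify the current setting, you can query the CONFIGURATION_PARAMETERS
system table; a minimal sketch, filtering on the parameter set above:
=> SELECT parameter_name, current_value
   FROM configuration_parameters
   WHERE parameter_name = 'DBDLogInternalDesignProcess';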
For more information about logging, see:
l Logging Projection Data for Database Designer
l Viewing Database Designer Logs
Viewing Database Designer Logs
You can find data about the projections that Database Designer considered and
deployed in two Data Collector tables:
l DC_DESIGN_PROJECTION_CANDIDATES
l DC_DESIGN_QUERY_PROJECTION_CANDIDATES
DC_DESIGN_PROJECTION_CANDIDATES
The DC_DESIGN_PROJECTION_CANDIDATES table contains information about all
the projections that the Optimizer proposed. This table also includes the DDL that
creates them. The is_a_winner field indicates if that projection was part of the actual
deployed design. To view the DC_DESIGN_PROJECTION_CANDIDATES table,
enter:
=> SELECT * FROM DC_DESIGN_PROJECTION_CANDIDATES;
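If you only want to see which proposed projections made it into the deployed design,
you can restrict the query to the relevant columns, for example:
=> SELECT projection_name, is_a_winner
   FROM dc_design_projection_candidates;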
DC_DESIGN_QUERY_PROJECTION_CANDIDATES
The DC_DESIGN_QUERY_PROJECTION_CANDIDATES table lists plan features for
all design queries.
Possible features are:
l FULLY DISTRIBUTED JOIN
l MERGE JOIN
l GROUPBY PIPE
l FULLY DISTRIBUTED GROUPBY
l RLE PREDICATE
l VALUE INDEX PREDICATE
l LATE MATERIALIZATION
For all design queries, the DC_DESIGN_QUERY_PROJECTION_CANDIDATES table
includes the following plan feature information:
l Optimizer path cost.
l Database Designer benefits.
l Ideal plan feature and its description, which identifies how the referenced projection
should be optimized.
l If the design was deployed, the actual plan feature and its description is included in
the table. This information identifies how the referenced projection was actually
optimized.
Because most projections have multiple optimizations, each projection usually has
multiple rows. To view the DC_DESIGN_QUERY_PROJECTION_CANDIDATES table,
enter:
=> SELECT * FROM DC_DESIGN_QUERY_PROJECTION_CANDIDATES;
To see example data from these tables, see Database Designer Logs: Example Data.
Database Designer Logs: Example Data
In the following example, Database Designer created the logs after creating a
comprehensive design for the VMart sample database. The output shows two records
from the DC_DESIGN_PROJECTION_CANDIDATES table.
The first record contains information about the customer_dimension_dbd_1_sort_
$customer_gender$__$annual_income$ projection. The record includes the
CREATE PROJECTION statement that Database Designer used to create the
projection. The is_a_winner column is t, indicating that Database Designer created
this projection when it deployed the design.
The second record contains information about the product_dimension_dbd_2_sort_
$product_version$__$product_key$ projection. For this projection, the is_a_winner
column is f. The Optimizer recommended that Database Designer create this projection
as part of the design. However, Database Designer did not create the projection when it
deployed the design. The log includes the DDL for the CREATE PROJECTION
statement. If you want to add the projection manually, you can use that DDL. For more
information, see Creating a Design Manually.
=> SELECT * FROM dc_design_projection_candidates;
-[ RECORD 1 ]--------+---------------------------------------------------------------
time | 2014-04-11 06:30:17.918764-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_table_id | 45035996273720620
projection_id | 45035996273726626
iteration_number | 1
projection_name | customer_dimension_dbd_1_sort_$customer_gender$__$annual_income$
projection_statement | CREATE PROJECTION v_dbd_sarahtest_sarahtest."customer_dimension_dbd_1_
sort_$customer_gender$__$annual_income$"
(
customer_key ENCODING AUTO,
customer_type ENCODING AUTO,
customer_name ENCODING AUTO,
customer_gender ENCODING RLE,
title ENCODING AUTO,
household_id ENCODING AUTO,
customer_address ENCODING AUTO,
customer_city ENCODING AUTO,
customer_state ENCODING AUTO,
customer_region ENCODING AUTO,
marital_status ENCODING AUTO,
customer_age ENCODING AUTO,
number_of_children ENCODING AUTO,
annual_income ENCODING AUTO,
occupation ENCODING AUTO,
largest_bill_amount ENCODING AUTO,
store_membership_card ENCODING AUTO,
customer_since ENCODING AUTO,
deal_stage ENCODING AUTO,
deal_size ENCODING AUTO,
last_deal_update ENCODING AUTO
)
AS
SELECT customer_key,
customer_type,
customer_name,
customer_gender,
title,
household_id,
customer_address,
customer_city,
customer_state,
customer_region,
marital_status,
customer_age,
number_of_children,
annual_income,
occupation,
largest_bill_amount,
store_membership_card,
customer_since,
deal_stage,
deal_size,
last_deal_update
FROM public.customer_dimension
ORDER BY customer_gender,
annual_income
UNSEGMENTED ALL NODES;
is_a_winner | t
-[ RECORD 2 ]--------+-------------------------------------------------------------
time | 2014-04-11 06:30:17.961324-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_table_id | 45035996273720624
projection_id | 45035996273726714
iteration_number | 1
projection_name | product_dimension_dbd_2_sort_$product_version$__$product_key$
projection_statement | CREATE PROJECTION v_dbd_sarahtest_sarahtest."product_dimension_dbd_2_
sort_$product_version$__$product_key$"
(
product_key ENCODING AUTO,
product_version ENCODING RLE,
product_description ENCODING AUTO,
sku_number ENCODING AUTO,
category_description ENCODING AUTO,
department_description ENCODING AUTO,
package_type_description ENCODING AUTO,
package_size ENCODING AUTO,
fat_content ENCODING AUTO,
diet_type ENCODING AUTO,
weight ENCODING AUTO,
weight_units_of_measure ENCODING AUTO,
shelf_width ENCODING AUTO,
shelf_height ENCODING AUTO,
shelf_depth ENCODING AUTO,
product_price ENCODING AUTO,
product_cost ENCODING AUTO,
lowest_competitor_price ENCODING AUTO,
highest_competitor_price ENCODING AUTO,
average_competitor_price ENCODING AUTO,
discontinued_flag ENCODING AUTO
)
AS
SELECT product_key,
product_version,
product_description,
sku_number,
category_description,
department_description,
package_type_description,
package_size,
fat_content,
diet_type,
weight,
weight_units_of_measure,
shelf_width,
shelf_height,
shelf_depth,
product_price,
product_cost,
lowest_competitor_price,
highest_competitor_price,
average_competitor_price,
discontinued_flag
FROM public.product_dimension
ORDER BY product_version,
product_key
UNSEGMENTED ALL NODES;
is_a_winner | f
.
.
.
The next example shows the contents of two records in the DC_DESIGN_QUERY_
PROJECTION_CANDIDATES table. Both of these rows apply to projection ID
45035996273726626.
In the first record, the Optimizer recommends that Database Designer optimize the
customer_gender column for the GROUPBY PIPE algorithm.
In the second record, the Optimizer recommends that Database Designer optimize the
public.customer_dimension table for late materialization. Late materialization can
improve the performance of joins that might spill to disk.
=> SELECT * FROM dc_design_query_projection_candidates;
-[ RECORD 1 ]-----------------+------------------------------------------------------------
time | 2014-04-11 06:30:17.482377-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_query_id | 3
iteration_number | 1
design_table_id | 45035996273720620
projection_id | 45035996273726626
ideal_plan_feature | GROUP BY PIPE
ideal_plan_feature_description | Group-by pipelined on column(s) customer_gender
dbd_benefits | 5
opt_path_cost | 211
-[ RECORD 2 ]-----------------+------------------------------------------------------------
time | 2014-04-11 06:30:17.48276-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_query_id | 3
iteration_number | 1
design_table_id | 45035996273720620
projection_id | 45035996273726626
ideal_plan_feature | LATE MATERIALIZATION
ideal_plan_feature_description | Late materialization on table public.customer_dimension
dbd_benefits | 4
opt_path_cost | 669
.
.
.
You can view the actual plan features that Database Designer implemented for the
projections it created. To do so, query the V_INTERNAL.DC_DESIGN_QUERY_
PROJECTIONS table:
=> select * from v_internal.dc_design_query_projections;
-[ RECORD 1 ]-------------------+-------------------------------------------------------------
time | 2014-04-11 06:31:41.19199-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_query_id | 1
projection_id | 2
design_table_id | 45035996273720624
actual_plan_feature | RLE PREDICATE
actual_plan_feature_description | RLE on predicate column(s) department_description
dbd_benefits | 2
opt_path_cost | 141
-[ RECORD 2 ]-------------------+-------------------------------------------------------------
time | 2014-04-11 06:31:41.192292-07
node_name | v_vmart_node0001
session_id | localhost.localdoma-931:0x1b7
user_id | 45035996273704962
user_name | dbadmin
design_id | 45035996273705182
design_query_id | 1
projection_id | 2
design_table_id | 45035996273720624
actual_plan_feature | GROUP BY PIPE
actual_plan_feature_description | Group-by pipelined on column(s) fat_content
dbd_benefits | 5
opt_path_cost | 155
Specifying Parameters for Database Designer
Before you run Database Designer to create a design, provide information that allows
Database Designer to create the optimal physical schema:
l Design Name
l Design Types
l Optimization Objectives
l Design Tables with Sample Data
l Design Queries
l K-safety
l Replicated and Segmented Projections
l Statistics Analysis
Design Name
All designs that Database Designer creates must have a name that you specify. The
design name can contain only alphanumeric characters and underscores (_), and can
be no more than 32 characters long. (Administrative Tools and Management Console
limit the design name to 16 characters.)
The design name becomes part of the files that Database Designer generates, including
the deployment script, allowing the files to be easily associated with a particular
Database Designer run.
Design Types
The Database Designer can create two distinct design types. The design you choose
depends on what you are trying to accomplish:
l Comprehensive Design
l Incremental Design
Comprehensive Design
A comprehensive design creates an initial or replacement design for all the tables in the
specified schemas. Create a comprehensive design when you are creating a new
database.
To help Database Designer create an efficient design, load representative data into the
tables before you begin the design process. When you load data into a table, Vertica
creates an unoptimized superprojection so that Database Designer has projections to
optimize. If a table has no data, Database Designer cannot optimize it.
Optionally, supply Database Designer with representative queries that you plan to use
so Database Designer can optimize the design for them. If you do not supply any
queries, Database Designer creates a generic optimization of the superprojections that
minimizes storage, with no query-specific projections.
During a comprehensive design, Database Designer creates deployment scripts that:
l Create new projections to optimize query performance, but only when they do not
already exist.
l Create replacement buddy projections when Database Designer changes the
encoding of pre-existing projections that it has decided to keep.
Incremental Design
An incremental design creates an enhanced design with additional projections, if
required, that are optimized specifically for the queries that you provide. Create an
incremental design when you have one or more queries that you want to optimize.
Optimization Objectives
When creating a design, Database Designer can optimize the design for one of three
objectives:
l Load: Database Designer creates a design that is optimized for loads, minimizing
database size, potentially at the expense of query performance.
l Performance: Database Designer creates a design that is optimized for fast query
performance. Because this objective prioritizes query performance, the design might
recommend more projections than the Load or Balanced objectives, potentially
resulting in a larger database storage size.
NOTE: A fully optimized query has an optimization ratio of 0.99. The optimization
ratio is the ratio of a query's benefits achieved in the design produced by Database
Designer to those achieved in the ideal plan. Check the optimization ratio with the
OptRatio parameter in designer.log.
l Balanced: Database Designer creates a design whose objectives are balanced
between database size and query performance.
Design Tables with Sample Data
You must specify one or more design tables for Database Designer to deploy a design.
If your schema is empty, it does not appear as a design table option.
When you specify design tables, consider the following:
l To create the most efficient projections for your database, load a moderate amount of
representative data into tables before running Database Designer. Database
Designer considers the data in these tables when creating the design.
l If your design tables have a large amount of data, the Database Designer run takes a
long time; if your tables have too little data, the design is not optimized. Vertica
recommends about 10 GB of sample data, which is usually sufficient for creating an
optimal design.
l If you submit a design table with no data, Database Designer ignores it.
l If one of your design tables has been dropped, you cannot build or deploy your
design.
Design Queries
If you supply representative queries that you run on your database to Database
Designer, it optimizes the performance of those queries.
Database Designer checks the validity of all queries when you add them to your design
and again when it builds the design. If a query is invalid, Database Designer ignores it.
The query file can contain up to 100 queries. You can assign each query a weight that
indicates its relative importance so that Database Designer can prioritize it when
creating the design. Database Designer groups queries that affect the design in the
same way, and treats each group as one weighted query when creating the design.
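When you add a single query programmatically, the weight is the third argument to
DESIGNER_ADD_DESIGN_QUERY. A minimal sketch, assuming a design named
my_design; the query text and weight are illustrative:
=> SELECT DESIGNER_ADD_DESIGN_QUERY(
   'my_design',
   'SELECT customer_name FROM public.customer_dimension;',
   2.0
);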
The following options apply, depending on whether you create an incremental or
comprehensive design:
l Design queries are required for incremental designs.
l Design queries are optional for comprehensive designs. If you do not provide design
queries, Database Designer recommends a generic design that does not consider
specific queries.
Query Repository
Using Management Console, you can submit design queries from the QUERY_
REQUESTS system table. This is called the query repository.
The QUERY_REQUESTS table contains queries that users have run recently. For a
comprehensive design, you can submit up to 200 queries from the QUERY_
REQUESTS table to Database Designer to be considered when creating the design.
For an incremental design, you can submit up to 100 queries from the QUERY_
REQUESTS table.
Replicated and Segmented Projections
When creating a comprehensive design, Database Designer creates projections based
on data statistics and queries. It also reviews the submitted design tables to decide
whether projections should be segmented (distributed across the cluster nodes) or
replicated (duplicated on all cluster nodes).
For detailed information, see the following sections:
l Replicated Projections
l Segmented Projections
Replicated Projections
Replication occurs when Vertica stores identical copies of data across all nodes in a
cluster.
If you are running on a single-node database, all projections are replicated because
segmentation is not possible in a single-node database.
Assuming that largest-row-count equals the number of rows in the design table with the
largest number of rows, Database Designer recommends that a projection be replicated
if any of the following conditions is true:
l largest-row-count < 1,000,000 and the number of rows in the table <= 10% of
largest-row-count
l largest-row-count >= 10,000,000 and the number of rows in the table <= 1% of
largest-row-count
l The number of rows in the table <= 100,000
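For example, if the largest design table contains 800,000 rows, the first condition
applies: any design table with 80,000 rows (10% of largest-row-count) or fewer is a
candidate for replication.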
For more information about replication, see High Availability With Projections in Vertica
Concepts.
Segmented Projections
Segmentation occurs when Vertica distributes data evenly across multiple database
nodes so that all nodes participate in query execution. Projection segmentation provides
high availability and recovery, and optimizes query execution.
When running Database Designer programmatically or using Management Console,
you can specify to allow Database Designer to recommend unsegmented projections in
the design. If you do not specify this, Database Designer recommends only segmented
projections.
Database Designer recommends segmented superprojections for large tables when
deploying to multiple node clusters, and recommends replicated superprojections for
smaller tables.
Database Designer does not segment projections on:
l Single-node clusters
l LONG VARCHAR and LONG VARBINARY columns
For more information about segmentation, see High Availability With Projections in
Vertica Concepts.
Statistics Analysis
By default, Database Designer analyzes statistics for the design tables when adding
them to the design. Statistics analysis is optional, but Vertica recommends it because
accurate statistics help Database Designer optimize compression and query
performance.
Analyzing statistics takes time and resources. If you know that the current statistics for
the design tables are up to date, you can skip this step. When in doubt, analyze the
statistics to make sure they are current.
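You can also refresh statistics yourself before a run with the ANALYZE_STATISTICS
function; a minimal sketch against one of the VMart tables used elsewhere in this guide:
=> SELECT ANALYZE_STATISTICS('public.customer_dimension');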
For more information, see Collecting Statistics.
Building a Design
After you have created design tables and loaded data into them, and then specified the
parameters you want Database Designer to use when creating the physical schema,
direct Database Designer to create the scripts necessary to build the design.
Note: You cannot stop a running database if Database Designer is building a
database design.
When you build a database design, Vertica generates two scripts:
l Deployment script—<design_name>_deploy.sql—Contains the SQL statements
that create projections for the design you are deploying, deploy the design, and drop
unused projections. When the deployment script runs, it creates the optimized
design. For details about how to run this script and deploy the design, see Deploying
a Design.
l Design script—<design_name>_design.sql—Contains the
CREATE PROJECTION statements that Database Designer uses to create the
design. Review this script to make sure you are happy with the design.
The design script is a subset of the deployment script. It serves as a backup of the
DDL for the projections that the deployment script creates.
If you run Database Designer from Administrative Tools, Vertica also creates a backup
script named <design_name>_projection_backup_<unique id #>.sql. This script
contains SQL statements to deploy the design that existed on the system before
deployment. This file is useful in case you need to revert to the pre-deployment design.
When you create a design using Management Console:
l If you submit a large number of queries to your design and build it immediately, a
timing issue could cause the queries not to load before deployment starts. If this
occurs, you may see one of the following errors:
n No queries to optimize for
n No tables to design projections for
To accommodate this timing issue, you may need to reset the design, check the
Queries tab to make sure the queries have been loaded, and then rebuild the design.
Detailed instructions are in:
n Using the Wizard to Create a Design
n Creating a Design Manually
l The scripts are deleted when deployment completes. To save a copy of the
deployment script after the design is built but before the deployment completes, go to
the Output window and copy and paste the SQL statements to a file.
Resetting a Design
You must reset a design when:
l You build a design and the output scripts described in Building a Design are not
created.
l You build a design but Database Designer cannot complete the design because the
queries it expects are not loaded.
Resetting a design discards all the run-specific information of the previous Database
Designer build, but retains its configuration (design type, optimization objectives, K-
safety, etc.) and tables and queries.
After you reset a design, review the design to see what changes you need to make. For
example, you can fix errors, change parameters, or check for and add additional tables
or queries. Then you can rebuild the design.
You can only reset a design in Management Console or by using the DESIGNER_
RESET_DESIGN function.
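A minimal sketch of resetting a design programmatically, assuming a design named
my_design:
=> SELECT DESIGNER_RESET_DESIGN('my_design');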
Deploying a Design
After running Database Designer to generate a deployment script, Vertica recommends
that you test your design on a non-production server before you deploy it to your
production server.
Both the design and deployment processes run in the background. This is useful if you
have a large design that you want to run overnight. Because an active SSH session is
not required, the design/deploy operations continue to run uninterrupted, even if the
session is terminated.
Note: You cannot stop a running database if Database Designer is building or
deploying a database design.
Database Designer runs as a background process. Multiple users can run Database
Designer concurrently without interfering with each other or using up all the cluster
resources. However, if multiple users are deploying a design on the same tables at the
same time, Database Designer may not be able to complete the deployment. To avoid
problems, consider the following:
l Schedule potentially conflicting Database Designer processes to run sequentially
overnight so that there are no concurrency problems.
l Avoid scheduling Database Designer runs on the same set of tables at the same
time.
There are two ways to deploy your design:
l Deploying Designs Using Database Designer
l Deploying Designs Manually
Deploying Designs Using Database Designer
HPE recommends that you run Database Designer and deploy optimized projections
right after loading your tables with sample data because Database Designer provides
projections optimized for the current state of your database.
If you choose to allow Database Designer to automatically deploy your script during a
comprehensive design and are running Administrative Tools, Database Designer
creates a backup script of your database's current design. This script helps you re-
create the design of projections that may have been dropped by the new design. The
backup script is located in the output directory you specified during the design process.
If you choose not to have Database Designer automatically run the deployment script
(for example, if you want to maintain projections from a pre-existing deployment), you
can manually run the deployment script later. See Deploying Designs Manually.
To deploy a design while running Database Designer, do one of the following:
l In Management Console, select the design and click Deploy Design.
l In the Administration Tools, select Deploy design in the Design Options window.
If you are running Database Designer programmatically, use
DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY and set the deploy parameter
to 'true'.
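A minimal sketch of deploying at design time, assuming a design named my_design
and output paths you can write to. The parameter order follows the example in
Workflow for Running Database Designer Programmatically: analyze statistics, deploy,
drop the design after deployment, continue after errors:
=> SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY
   ('my_design',
    '/tmp/my_design_projections.sql',
    '/tmp/my_design_deploy.sql',
    'true',  -- analyze statistics
    'true',  -- deploy the design
    'false', -- do not drop the design after deployment
    'false'  -- stop if an error occurs
   );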
Once you have deployed your design, query the DEPLOY_STATUS system table to
see the steps that the deployment took:
vmartdb=> SELECT * FROM V_MONITOR.DEPLOY_STATUS;
Deploying Designs Manually
If you chose not to have Database Designer deploy your design at design time, you can
deploy the design later using the deployment script:
1. Make sure that you have a database that contains the same tables and projections
as the database on which you ran Database Designer. The database should also
contain sample data.
2. To deploy the projections to a test or production environment, use the following vsql
command to execute the deployment script, where <design_name> is the name of
the database design:
=> \i <design_name>_deploy.sql
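You can also run the deployment script non-interactively from the shell; a sketch
assuming a database named mydb and a design named my_design:
$ vsql -d mydb -f my_design_deploy.sql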
How to Create a Design
There are three ways to create a design using Database Designer:
l From Management Console, open a database and select the Design page at the
bottom of the window.
For details about using Management Console to create a design, see Using
Management Console to Create a Design.
l Programmatically, using the techniques described in About Running Database
Designer Programmatically in Analyzing Data. To run Database Designer
programmatically, you must be a DBADMIN or have been granted the DBDUSER
role and enabled that role.
l From the Administration Tools menu, by selecting Configuration Menu > Run
Database Designer. You must be a DBADMIN user to run Database Designer from
the Administration Tools.
For details about using Administration Tools to create a design, see Using
Administration Tools to Create a Design.
The following table shows what Database Designer capabilities are available in each
tool:
Database Designer
Capability
Management
Console
Running
Database
Designer
Programmatically
Administrative
Tools
Create design Yes Yes Yes
Design name length
(# of characters)
16 32 16
Build design (create
design and
deployment scripts)
Yes Yes Yes
Create backup script Yes
Set design type
(comprehensive or
incremental)
Yes Yes Yes
Set optimization
objective
Yes Yes Yes
Vertica Documentation
HPE Vertica Analytics Platform (7.2.x) Page 453 of 5309
Database Designer
Capability
Management
Console
Running
Database
Designer
Programmatically
Administrative
Tools
Add design tables Yes Yes Yes
Add design queries
file
Yes Yes Yes
Add single design
query
Yes
Use query repository Yes Yes
Set K-safety Yes Yes Yes
Analyze statistics Yes Yes Yes
Require all
unsegmented
projections
Yes Yes
View event history Yes Yes
Set correlation
analysis mode
(Default = 0)
Yes
Using Management Console to Create a Design
To use Management Console to create an optimized design for your database, you must
be a DBADMIN user or have been assigned the DBDUSER role.
Management Console provides two ways to create a design:
l Wizard—This option walks you through the process of configuring a new design.
Click Back and Next to navigate through the Wizard steps, or Cancel to cancel
creating a new design.
To learn how to use the Wizard to create a design, see Using the Wizard to Create a
Design.
l Manual—This option creates and saves a design with the default parameters.
To learn how to create a design manually, see Creating a Design Manually.
Tip: If you have many design tables that you want Database Designer to consider, it
might be easier to use the Wizard to create your design. In the Wizard, you can
submit all the tables in a schema at once; creating a design manually requires that
you submit the design tables one at a time.
Using the Wizard to Create a Design
Take these steps to create a design using the Management Console's Wizard:
1. Log in to Management Console, select and start your database, and click Design at
the bottom of the window. The Database Designer window appears. If there are no
existing designs, the New Design window appears.
The left side of the Database Designer window lists the database designs you own,
with the most recent design you worked on selected. That pane also lists the current
status of the design.
The main pane contains details about the selected design.
2. To create a new design, click New Design.
3. Enter a name for your design, and click Wizard.
For more information, see Design Name.
4. Navigate through the Wizard using the Back and Next buttons.
5. To build the design immediately after exiting the Wizard, on the Execution Options
window, select Auto-build.
Important: Hewlett Packard Enterprise does not recommend that you auto-
deploy the design from the Wizard. There may be a delay in adding the queries
to the design, so if the design is deployed but the queries have not yet loaded,
deployment may fail. If this happens, reset the design, check the Queries tab to
make sure the queries have been loaded, and deploy the design.
6. When you have entered all the information, the Wizard displays a summary of your
choices. Click Submit Design to build your design.
Creating a Design Manually
To create a design using Management Console and specify the configuration, take
these steps.
1. Log in to Management Console, select and start your database, and click Design at
the bottom of the window. The Database Designer window appears.
The left side of the Database Designer window lists the database designs you own,
with the most recent design you worked on highlighted. That pane also lists the
current status of the design.
The main pane contains details about the selected design.
2. To create a new design, click New Design.
3. Enter a name for your design and select Manual.
After a few seconds, the main Database Design window opens, displaying the
default design parameters. Vertica has created and saved a design with the name
you specified, and assigned it the default parameters.
For more information, see Design Name.
4. On the General window, modify the design type, optimization objectives, K-safety,
Analyze Correlations Mode, and the setting that allows Database Designer to create
unsegmented projections.
If you choose Incremental, the design automatically optimizes for the desired
queries, and the K-safety defaults to the value of the cluster K-safety; you cannot
change these values for an incremental design.
Analyze Correlations Mode determines if Database Designer analyzes and
considers column correlations when creating the design. For more information, see
DESIGNER_SET_ANALYZE_CORRELATIONS_MODE.
5. Click the Tables tab. You must submit tables to your design.
6. To add tables of sample data to your design, click Add Tables. A list of available
tables appears; select the tables you want and click Save. If you want to remove
tables from your design, click the tables you want to remove, and click Remove
Selected.
If a design table has been dropped from the database, a red circle with a white
exclamation point appears next to the table name. Before you can build or deploy
the design, you must remove any dropped tables from the design. To do this, select
the dropped tables and click Remove Selected. You cannot build or deploy a
design if any of the design tables have been dropped.
7. Click the Queries tab. To add queries to your design, do one of the following:
n To add queries from the QUERY_REQUESTS system table, click Query
Repository, select the desired queries and click Save. All valid queries that you
selected appear in the Queries window.
n To add queries from a file, select Choose File. All valid queries in the file that you
select are added to the design and appear in the Queries window.
Database Designer checks the validity of the queries when you add the queries to
the design and again when you build the design. If it finds invalid queries, it ignores
them.
If you have a large number of queries, it may take time to load them. Make sure that
all the queries you want Database Designer to consider when creating the design
are listed in the Queries window.
8. Once you have specified all the parameters for your design, you should build the
design. To do this, select your design and click Build Design.
9. Select Analyze Statistics if you want Database Designer to analyze the statistics
before building the design.
For more information, see Statistics Analysis.
10. If you do not need to review the design before deploying it, select Deploy
Immediately. Otherwise, leave that option unselected.
11. Click Start. On the left-hand pane, the status of your design displays as Building
until it is complete.
12. To follow the progress of a build, click Event History. Status messages appear in
this window and you can see the current phase of the build operation. The
information in the Event History tab comes from the OUTPUT_EVENT_HISTORY
system table.
13. When the build completes, the left-hand pane displays Built. To view the
deployment script, select your design and click Output.
14. After you deploy the design using Management Console, the deployment script is
deleted. To keep a permanent copy of the deployment script, copy and paste the
SQL commands from the Output window to a file.
15. Once you have reviewed your design and are ready to deploy it, select the design
and click Deploy Design.
16. To follow the progress of the deployment, click Event History. Status messages
appear in this window and you can see the current phase of the deployment
operation.
In the Event History window, while the design is running, you can do one of the
following:
n Click the blue button next to the design name to refresh the event history listing.
n Click Cancel Design Run to cancel the design in progress.
n Click Force Delete Design to cancel and delete the design in progress.
17. When the deployment completes, the left-hand pane displays Deployment
Completed. To view the deployment script, select your design and click Output.
Your database is now optimized according to the parameters you set.
Using Administration Tools to Create a Design
To use the Administration Tools interface to create an optimized design for your
database, you must be a DBADMIN user. Follow these steps:
1. Log in as the dbadmin user and start Administration Tools.
2. From the main menu, start the database for which you want to create a design. The
database must be running before you can create a design for it.
3. On the main menu, select Configuration Menu and click OK.
4. On the Configuration Menu, select Run Database Designer and click OK.
5. On the Select a database to design window, enter the name of the database for
which you are creating a design and click OK.
6. On the Enter the directory for Database Designer output window, enter the full
path to the directory to contain the design script, deployment script, backup script,
and log files, and click OK.
For information about the scripts, see Building a Design.
7. On the Database Designer window, enter a name for the design and click OK.
For more information about design names, see Design Name.
8. On the Design Type window, choose which type of design to create and click OK.
For a description of the design types, see Design Types.
9. The Select schema(s) to add to query search path window lists all the schemas
in the database that you selected. Select the schemas that contain representative
data that you want Database Designer to consider when creating the design and
click OK.
For more information about choosing schema and tables to submit to Database
Designer, see Design Tables with Sample Data.
10. On the Optimization Objectives window, select the objective you want for the
database optimization:
n Optimize with Queries
For more information, see Design Queries.
n Update statistics
For more information, see Statistics Analysis.
n Deploy design
For more information, see Deploying a Design.
For details about these objectives, see Optimization Objectives.
11. The final window summarizes the choices you have made and offers you two
choices:
n Proceed with building the design, and deploying it if you specified to deploy it
immediately. If you did not specify to deploy, you can review the design and
deployment scripts and deploy them manually, as described in Deploying
Designs Manually.
n Cancel the design and go back to change some of the parameters as needed.
12. Creating a design can take a long time. To cancel a running design from the
Administration Tools window, enter Ctrl+C.
To create a design for the VMart example database, see Using Database Designer to
Create a Comprehensive Design in Getting Started.
About Running Database Designer Programmatically
If you are granted the DBDUSER role and enable the role, you can access Database
Designer functionality programmatically. Using the DESIGNER_* command-line
functions, you can perform the following Database Designer tasks:
l Create a comprehensive or incremental design.
l Add tables and queries to the design.
l Set the optimization objective to prioritize for query performance or storage footprint.
l Weight queries.
l Set K-safety for a design.
l Analyze statistics on the design tables.
l Create the script with DDL statements that create design projections.
l Deploy the database design.
l Specify that all projections in the design be segmented.
l Populate the design.
l Cancel a running design.
l Wait for a running design to complete.
l Deploy a design automatically.
l Drop database objects from one or more completed or terminated designs.
For information about each function, see Database Designer Functions in the SQL
Reference Manual.
DBDUSER Resource Pool
When you grant the DBDUSER role, you must associate a resource pool with that user
to manage resources during Database Designer runs. Multiple users can run Database
Designer concurrently without interfering with each other or using up all cluster
resources. When a user runs Database Designer, with Administration Tools or
programmatically, execution is mostly contained by the user's resource pool, but might
spill over into some system resource pools for less-intensive tasks.
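A minimal sketch of creating and assigning such a pool, assuming a hypothetical pool
named design_pool and an illustrative user name; the MEMORYSIZE value is only an
example:
=> CREATE RESOURCE POOL design_pool MEMORYSIZE '4G';
=> ALTER USER dbd_user1 RESOURCE POOL design_pool;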
When to Run Database Designer Programmatically
Run Database Designer programmatically when you want to:
l Optimize performance on tables you own.
l Create or update a design without the involvement of the superuser.
l Add individual queries and tables, or add data to your design and then rerun
Database Designer to update the design based on this new information.
l Customize the design.
l Use recently executed queries to set up your database to run Database Designer
automatically on a regular basis.
l Assign each design query a query weight that indicates the importance of that query
in creating the design. Assign a higher weight to queries that you run frequently so
that Database Designer prioritizes those queries in creating the design.
Categories of Database Designer Functions
You can run Database Designer functions in vsql:
Setup Functions
This function directs Database Designer to create a new design:
l DESIGNER_CREATE_DESIGN
Configuration Functions
The following functions allow you to specify properties of a particular design:
l DESIGNER_DESIGN_PROJECTION_ENCODINGS
l DESIGNER_SET_DESIGN_KSAFETY
l DESIGNER_SET_OPTIMIZATION_OBJECTIVE
l DESIGNER_SET_DESIGN_TYPE
l DESIGNER_SET_PROPOSED_UNSEGMENTED_PROJECTIONS
l DESIGNER_SET_ANALYZE_CORRELATIONS_MODE
Input Functions
The following functions allow you to add tables and queries to your Database Designer
design:
l DESIGNER_ADD_DESIGN_QUERIES
l DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS
l DESIGNER_ADD_DESIGN_QUERY
l DESIGNER_ADD_DESIGN_TABLES
Invocation Functions
These functions populate the Database Designer workspace and create design and
deployment scripts. You can also analyze statistics, deploy the design automatically,
and drop the workspace after the deployment:
l DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY
l DESIGNER_WAIT_FOR_DESIGN
Output Functions
The following functions display information about projections and scripts that the
Database Designer created:
l DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS
l DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT
Cleanup Functions
The following functions cancel any running Database Designer operation or drop a
Database Designer design and all its contents:
l DESIGNER_CANCEL_POPULATE_DESIGN
l DESIGNER_DROP_DESIGN
l DESIGNER_DROP_ALL_DESIGNS
Privileges for Running Database Designer Functions
If they have been granted the DBDUSER role, non-DBADMIN users can run Database
Designer using the functions described in Categories of Database Designer Functions.
Non-DBADMIN users cannot run Database Designer using Administration Tools, even
if they have been assigned the DBDUSER role.
To grant the DBDUSER role:
1. The DBADMIN user must grant the DBDUSER role:
=> GRANT DBDUSER TO <username>;
This role persists until the DBADMIN revokes it.
IMPORTANT: When you grant the DBDUSER role, make sure to associate a
resource pool with that user to manage resources during Database Designer runs.
Multiple users can run Database Designer concurrently without interfering with each
other or using up all the cluster resources. When a user runs Database Designer,
either using the Administration Tools or programmatically, its execution is mostly
contained by the user's resource pool, but may spill over into some system resource
pools for less-intensive tasks.
2. For a user to run the Database Designer functions, one of the following must
happen first:
n The user must enable the DBDUSER role:
=> SET ROLE DBDUSER;
n The superuser must add DBDUSER as the default role:
=> ALTER USER <username> DEFAULT ROLE DBDUSER;
DBDUSER Capabilities and Limitations
The DBDUSER role has the following capabilities and limitations:
l A DBDUSER can change K-safety for their own designs, but cannot change the
system K-safety value. The DBDUSER can set a design's K-safety to a value less
than or equal to the system K-safety, limited to 0, 1, or 2 (see the sketch after this
list).
l A DBDUSER cannot explicitly change the ancient history mark (AHM), even during
deployment of their design.
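For example, a DBDUSER can set the K-safety of their own design with
DESIGNER_SET_DESIGN_KSAFETY; a minimal sketch assuming a design named
my_design:
=> SELECT DESIGNER_SET_DESIGN_KSAFETY('my_design', 1);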
DBDUSER Privileges
When you create a design, you automatically have privileges to manipulate the design.
Other tasks may require that the DBDUSER have additional privileges:
To...                                     DBDUSER must have...

Add tables to a design                    l USAGE privilege on the design table schema
                                          l OWNER privilege on the design table

Add a single design query to the design   l Privilege to execute the design query

Add a query file to the design            l Read privilege on the storage location that
                                            contains the query file
                                          l Privilege to execute all the queries in the file

Add queries from the result of a user     l Privilege to execute the user query
query to the design                       l Privilege to execute each design query
                                            retrieved from the results of the user query

Create the design and deployment          l WRITE privilege on the storage location of the
scripts                                     design script
                                          l WRITE privilege on the storage location of the
                                            deployment script
Workflow for Running Database Designer Programmatically
The following example shows the steps you take to create a design by running
Database Designer programmatically.
Note: Be sure to back up the existing design using the EXPORT_CATALOG
function before running the Database Designer functions on an existing schema.
You must explicitly back up the current design when using Database Designer to
create a new comprehensive design.
Before you run this example, you should have the DBDUSER role, and you should have
enabled that role using the SET ROLE DBDUSER command:
1. Create a table in the public schema:
=> CREATE TABLE T(
x INT,
y INT,
z INT,
u INT,
v INT,
w INT PRIMARY KEY
);
2. Add data to the table:
\! perl -e 'for ($i=0; $i<100000; ++$i) {printf("%d, %d, %d, %d, %d, %d\n", $i/10000,
$i/100, $i/10, $i/2, $i, $i);}'
| vsql -c "COPY T FROM STDIN DELIMITER ',' DIRECT;"
3. Create a second table in the public schema:
=> CREATE TABLE T2(
x INT,
y INT,
z INT,
u INT,
v INT,
w INT PRIMARY KEY
);
4. Copy the data from table T to table T2 and commit the changes:
=> INSERT /*+DIRECT*/ INTO T2 SELECT * FROM T;
=> COMMIT;
5. Create a new design:
=> SELECT DESIGNER_CREATE_DESIGN('my_design');
This command adds information to the DESIGNS system table in the V_MONITOR
schema.
6. Add tables from the public schema to the design:
=> SELECT DESIGNER_ADD_DESIGN_TABLES('my_design', 'public.t');
=> SELECT DESIGNER_ADD_DESIGN_TABLES('my_design', 'public.t2');
These commands add information to the DESIGN_TABLES system table.
7. Create a file named queries.txt in /tmp/examples, or another directory where
you have READ and WRITE privileges. Add the following two queries in that file
and save it. Database Designer uses these queries to create the design:
SELECT DISTINCT T2.u FROM T JOIN T2 ON T.z=T2.z-1 WHERE T2.u > 0;
SELECT DISTINCT w FROM T;
8. Add the queries file to the design and display the results—the numbers of accepted
queries, non-design queries, and unoptimizable queries:
=> SELECT DESIGNER_ADD_DESIGN_QUERIES
('my_design',
'/tmp/examples/queries.txt',
'true'
);
The results show that both queries were accepted:
Number of accepted queries =2
Number of queries referencing non-design tables =0
Number of unsupported queries =0
Number of illegal queries =0
The DESIGNER_ADD_DESIGN_QUERIES function populates the DESIGN_
QUERIES system table.
9. Set the design type to comprehensive. (This is the default.) A comprehensive
design creates an initial or replacement design for all the design tables:
=> SELECT DESIGNER_SET_DESIGN_TYPE('my_design', 'comprehensive');
10. Set the optimization objective to query. This setting creates a design that focuses
on faster query performance, which might recommend additional projections. These
projections could result in a larger database storage footprint:
=> SELECT DESIGNER_SET_OPTIMIZATION_OBJECTIVE('my_design', 'query');
11. Create the design and save the design and deployment scripts in /tmp/examples,
or another directory where you have READ and WRITE privileges. The following
command:
n Analyzes statistics
n Doesn't deploy the design.
n Doesn't drop the design after deployment.
n Stops if it encounters an error.
=> SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY
('my_design',
'/tmp/examples/my_design_projections.sql',
'/tmp/examples/my_design_deploy.sql',
'True',
'False',
'False',
'False'
);
This command adds information to the following system tables:
n DEPLOYMENT_PROJECTION_STATEMENTS
n DEPLOYMENT_PROJECTIONS
n OUTPUT_DEPLOYMENT_STATUS
12. Examine the status of the Database Designer run to see what projections Database
Designer recommends. In the deployment_projection_name column:
n rep indicates a replicated projection
n super indicates a superprojection
The deployment_status column is pending because the design has not yet
been deployed.
For this example, Database Designer recommends four projections:
=> \x
Expanded display is on.
=> SELECT * FROM OUTPUT_DEPLOYMENT_STATUS;
-[ RECORD 1 ]--------------+-----------------------------
deployment_id | 45035996273795970
deployment_projection_id | 1
deployment_projection_name | T_DBD_1_rep_my_design
deployment_status | pending
error_message | N/A
-[ RECORD 2 ]--------------+-----------------------------
deployment_id | 45035996273795970
deployment_projection_id | 2
deployment_projection_name | T2_DBD_2_rep_my_design
deployment_status | pending
error_message | N/A
-[ RECORD 3 ]--------------+-----------------------------
deployment_id | 45035996273795970
deployment_projection_id | 3
deployment_projection_name | T_super
deployment_status | pending
error_message | N/A
-[ RECORD 4 ]--------------+-----------------------------
deployment_id | 45035996273795970
deployment_projection_id | 4
deployment_projection_name | T2_super
deployment_status | pending
error_message | N/A
13. View the script /tmp/examples/my_design_deploy.sql to see how these
projections are created when you run the deployment script. In this example, the
script also assigns the encoding schemes RLE and COMMONDELTA_COMP to
columns where appropriate.
14. Deploy the design from the directory where you saved it:
=> \i /tmp/examples/my_design_deploy.sql
15. Now that the design is deployed, delete the design:
=> SELECT DESIGNER_DROP_DESIGN('my_design');
Creating Custom Designs
HPE strongly recommends that you use the physical schema design produced by
Database Designer, which provides K-safety, excellent query performance, and efficient
use of storage space. If you find that any of your queries are not running as efficiently as
you would like, you can use the Database Designer incremental design process to
optimize the database design for the query.
If the projections created by Database Designer still do not meet your needs, you can
write custom projections, from scratch or based on projection designs created by
Database Designer.
If you are unfamiliar with writing custom projections, start by modifying an existing
design generated by Database Designer.
Custom Design Process
To create a custom design or customize an existing one:
1. Plan the new design or modifications to an existing one. See Planning Your Design.
2. Create or modify projections. See Design Fundamentals and CREATE
PROJECTION for more detail.
3. Deploy projections to a test environment. See Writing and Deploying Custom
Projections.
4. Test and modify projections as needed.
5. After you finalize the design, deploy projections to the production environment.
Planning Your Design
The syntax for creating a design is easy for anyone who is familiar with SQL. As with
any successful project, however, a successful design requires some initial planning.
Before you create your first design:
l Become familiar with standard design requirements and plan your design to include
them. See Design Requirements.
l Determine how many projections you need to include in the design. See Determining
the Number of Projections to Use.
l Determine the type of compression and encoding to use for columns. See Data
Encoding and Compression.
l Determine whether or not you want the database to be K-safe. Vertica recommends
that all production databases have a minimum K-safety of one (K=1). Valid K-safety
values are 0, 1, and 2. See Designing for K-Safety.
Design Requirements
A physical schema design is a script that contains CREATE PROJECTION statements.
These statements determine which columns are included in projections and how they
are optimized.
If you use Database Designer as a starting point, it automatically creates designs that
meet all fundamental design requirements. If you intend to create or modify designs
manually, be aware that all designs must meet the following requirements:
l Every design must create at least one superprojection for every table in the database
that is used by the client application. These projections provide complete coverage
that enables users to perform ad-hoc queries as needed. They can contain joins and
they are usually configured to maximize performance through sort order,
compression, and encoding.
l Query-specific projections are optional. If you are satisfied with the performance
provided through superprojections, you do not need to create additional projections.
However, you can maximize performance by tuning for specific query work loads.
l HPE recommends that all production databases have a minimum K-safety of one
(K=1) to support high availability and recovery. (K-safety can be set to 0, 1, or 2.) See
High Availability With Projections in Vertica Concepts and Designing for K-Safety.
l If you have more than 20 nodes but small tables, Vertica recommends that you do
not create replicated projections. If you create replicated projections, the catalog
becomes very large and performance may degrade. Instead, consider segmenting
those projections.
Determining the Number of Projections to Use
In many cases, a design that consists of a set of superprojections (and their buddies)
provides satisfactory performance through compression and encoding. This is
especially true if the sort orders for the projections have been used to maximize
performance for one or more query predicates (WHERE clauses).
However, you might want to add additional query-specific projections to increase the
performance of queries that run slowly, are used frequently, or are run as part of
business-critical reporting. The number of additional projections (and their buddies) that
you create should be determined by:
l Your organization's needs
l The amount of disk space you have available on each node in the cluster
l The amount of time available for loading data into the database
As the number of projections that are tuned for specific queries increases, the
performance of these queries improves. However, the amount of disk space used and
the amount of time required to load data increases as well. Therefore, you should create
and test designs to determine the optimum number of projections for your database
configuration. On average, organizations that choose to implement query-specific
projections achieve optimal performance through the addition of a few query-specific
projections.
Designing for K-Safety
HPE recommends that all production databases have a minimum K-safety of one (K=1).
Valid K-safety values for production databases are 1 and 2. Non-production databases
do not have to be K-safe and can be set to 0.
A K-safe database must have at least three nodes, as shown in the following table:
K-safety level    Number of required nodes
1                 3 or more
2                 5 or more
Note: Vertica only supports K-safety levels 1 and 2.
You can set K-safety to 1 or 2 only when the physical schema design meets certain
redundancy requirements. See Requirements for a K-Safe Physical Schema Design.
Using Database Designer
To create designs that are K-safe, HPE recommends that you use the Database
Designer. When creating projections with Database Designer, projection definitions that
meet K-safe design requirements are recommended and marked with a K-safety level.
Database Designer creates a script that uses the MARK_DESIGN_KSAFE function to
set the K-safety of the physical schema to 1. For example:
=> \i VMart_Schema_design_opt_1.sql
CREATE PROJECTION
CREATE PROJECTION
mark_design_ksafe
----------------------
Marked design 1-safe
(1 row)
By default, Vertica creates K-safe superprojections when database K-safety is greater
than 0.
Monitoring K-Safety
Monitoring tables can be accessed programmatically to enable external actions, such as
alerts. You can monitor the K-safety level by polling the CURRENT_FAULT_TOLERANCE
column of the SYSTEM table and checking its value. See SYSTEM in the SQL
Reference Manual.
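For example, assuming the SYSTEM table exposes the DESIGNED_FAULT_TOLERANCE
and CURRENT_FAULT_TOLERANCE columns described in the SQL Reference Manual, a
monitoring script might poll as follows and raise an alert whenever the current value
falls below the designed value:
=> SELECT designed_fault_tolerance, current_fault_tolerance FROM system;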
Loss of K-Safety
When K nodes in your cluster fail, your database continues to run, although
performance is affected. Further node failures could potentially cause the database to
shut down if the failed node's data is not available from another functioning node in the
cluster.
See Also
l K-Safety in Vertica Concepts provides a high-level view of K-safety.
l High Availability and Recovery and High Availability With Projections in Vertica
Concepts describe how Vertica implements high availability and recovery through
replication and segmentation.
Requirements for a K-Safe Physical Schema Design
Database Designer automatically generates designs with a K-safety of 1 for clusters that
contain at least three nodes. (If your cluster has one or two nodes, it generates designs
with a K-safety of 0.) You can modify a design created for a three-node (or greater)
cluster, and the K-safe requirements are already set.
If you create custom projections, your physical schema design must meet the following
requirements to be able to successfully recover the database in the event of a failure:
l Segmented projections must be segmented across all nodes. Refer to Designing for
Segmentation and Designing Segmented Projections for K-Safety.
l Replicated projections must be replicated on all nodes. See Designing
Unsegmented Projections for K-Safety.
l Segmented projections must have K buddy projections (projections that have
identical columns and segmentation criteria, except that corresponding segments are
placed on different nodes).
You can use the MARK_DESIGN_KSAFE function to find out whether your schema
design meets requirements for K-safety.
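For example, calling the function on a compliant design returns a confirmation; on a
design that does not meet the requirements, Vertica instead returns an error:
=> SELECT MARK_DESIGN_KSAFE(1);
 mark_design_ksafe
----------------------
 Marked design 1-safe
(1 row)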
Requirements for a Physical Schema Design with No K-Safety
If you use Database Designer to generate a comprehensive design that you can
modify, and you do not want the design to be K-safe, set the K-safety level to 0 (zero).
If you want to start from scratch, do the following to establish minimal projection
requirements for a functioning database with no K-safety (K=0):
1. Define at least one superprojection for each table in the logical schema.
2. Replicate (define an exact copy of) each dimension table superprojection on each
node.
Designing Segmented Projections for K-Safety
Projections must comply with database K-safety requirements. In general, you must
create buddy projections for each segmented projection, where the number of buddy
projections is K+1. Thus, if system K-safety is set to 1, each projection segment must be
duplicated by one buddy; if K-safety is set to 2, each segment must be duplicated by two
buddies.
Automatic Creation of Buddy Projections
You can have CREATE PROJECTION automatically create the number of buddy
projections required to satisfy K-safety by including SEGMENTED BY ... ALL NODES. If
CREATE PROJECTION specifies K-safety (KSAFE=n), Vertica uses that setting; if the
statement omits KSAFE, Vertica uses system K-safety.
In the following example, CREATE PROJECTION creates segmented projection ttt_p1
for table ttt. Because system K-safety is set to 1, Vertica requires a buddy projection
for each segmented projection. The CREATE PROJECTION statement omits KSAFE, so
Vertica uses system K-safety and creates two buddy projections: ttt_p1_b0 and ttt_
p1_b1:
=> SELECT mark_design_ksafe(1);
mark_design_ksafe
----------------------
Marked design 1-safe
(1 row)
=> CREATE TABLE ttt (a int, b int);
WARNING 6978: Table "ttt" will include privileges from schema "public"
CREATE TABLE
=> CREATE PROJECTION ttt_p1 as SELECT * FROM ttt SEGMENTED BY HASH(a) ALL NODES;
CREATE PROJECTION
=> SELECT projection_name from projections WHERE anchor_table_name='ttt';
projection_name
-----------------
ttt_p1_b0
ttt_p1_b1
(2 rows)
Vertica automatically names buddy projections by appending the suffix _bn to the
projection base name—for example, ttt_p1_b0.
Manual Creation of Buddy Projections
If you create a projection on a single node, and system K-safety is greater than 0, you
must manually create the number of buddies required for K-safety. For example, you can
create projection xxx_p1 for table xxx on a single node, as follows:
=> CREATE TABLE xxx (a int, b int);
WARNING 6978: Table "xxx" will include privileges from schema "public"
CREATE TABLE
=> CREATE PROJECTION xxx_p1 AS SELECT * FROM xxx SEGMENTED BY HASH(a) NODES v_vmart_node0001;
CREATE PROJECTION
Because K-safety is set to 1, a single instance of this projection is not K-safe. Attempts
to insert data into its anchor table xxx return with an error like this:
=> INSERT INTO xxx VALUES (1, 2);
ERROR 3586: Insufficient projections to answer query
DETAIL: No projections that satisfy K-safety found for table xxx
HINT: Define buddy projections for table xxx
In order to comply with K-safety, you must create a buddy projection for projection xxx_
p1. For example:
=> CREATE PROJECTION xxx_p1_buddy AS SELECT * FROM xxx SEGMENTED BY HASH(a) NODES v_vmart_node0002;
CREATE PROJECTION
Table xxx now complies with K-safety and accepts DML statements such as INSERT:
VMart=> INSERT INTO xxx VALUES (1, 2);
OUTPUT
--------
1
(1 row)
See Also
For general information about segmented projections and buddies, see Projection
Segmentation in Vertica Concepts. For information about designing for K-safety, see
Designing for K-Safety and Designing for Segmentation.
Designing Unsegmented Projections for K-Safety
In many cases, dimension tables are relatively small, so you do not need to segment
them. Accordingly, you should design a K-safe database so projections for its dimension
tables are replicated without segmentation on all cluster nodes. You create these
projections with a CREATE PROJECTION statement that includes the keywords
UNSEGMENTED ALL NODES. These keywords specify to create identical instances of the
projection on all cluster nodes.
The following example shows how to create an unsegmented projection for the table
store.store_dimension:
VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
AS SELECT store_key, store_name, store_city, store_state
FROM store.store_dimension
UNSEGMENTED ALL NODES;
CREATE PROJECTION
Vertica uses the same name to identify all instances of the unsegmented projection—in
this example, store.store_dimension_proj. For more information about projection
name conventions, see Projection Naming.
Designing for Segmentation
You segment projections using hash segmentation. Hash segmentation allows you to
segment a projection based on a built-in hash function that provides even distribution of
data across multiple nodes, resulting in optimal query execution. In a projection, the
data to be hashed consists of one or more column values, each having a large number
of unique values and an acceptable amount of skew in the value distribution. Primary
key columns that meet the criteria could be an excellent choice for hash segmentation.
Note: For detailed information about using hash segmentation in a projection, see
CREATE PROJECTION in the SQL Reference Manual.
When segmenting projections, determine which columns to use to segment the
projection. Choose one or more columns that have a large number of unique data
values and acceptable skew in their data distribution. Primary key columns are an
excellent choice for hash segmentation. The columns must be unique across all the
tables being used in a query.
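For example, the following sketch (using a hypothetical orders table whose order_id
primary key has many unique, evenly distributed values) segments a projection by a
hash of that column across all nodes:
=> CREATE TABLE orders (order_id INT NOT NULL, customer_id INT, amount FLOAT);
=> CREATE PROJECTION orders_p AS
     SELECT * FROM orders
     ORDER BY order_id
     SEGMENTED BY HASH(order_id) ALL NODES;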
Design Fundamentals
Although you can write custom projections from scratch, Vertica recommends that you
use Database Designer to create a design to use as a starting point. This ensures that
you have projections that meet basic requirements.
Writing and Deploying Custom Projections
Before you write custom projections, be sure to review the topics in Planning Your
Design carefully. Failure to follow these considerations can result in non-functional
projections.
To manually modify or create a projection:
1. Write a script to create the projection, using the CREATE PROJECTION statement.
2. Use the \i meta-command in vsql to run the script.
Note: You must have a database loaded with a logical schema.
3. For a K-safe database, use the GET_PROJECTIONS function, for example SELECT
GET_PROJECTIONS('table_name'), to verify that the projections were properly
created. Good projections are noted as being "safe." This means that the projection
has enough buddies to be K-safe.
4. If you added the new projection to a database that already has projections that
contain data, you need to update the newly created projection to work with the
existing projections. By default, the new projection is out-of-date (not available for
query processing) until you refresh it; see the consolidated sketch following these steps.
5. Use the MAKE_AHM_NOW function to set the Ancient History Mark (AHM) to the
greatest allowable epoch (now).
6. Use DROP PROJECTION to drop any previous projections that are no longer
needed.
These projections can waste disk space and reduce load speed if they remain in the
database.
7. Run the ANALYZE_STATISTICS function on all projections in the database. This
function collects and aggregates data samples and storage information from all
nodes on which a projection is stored, and then writes statistics into the catalog. For
example:
=> SELECT ANALYZE_STATISTICS('');
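A minimal sketch of steps 4 through 7, assuming the superseded projection is named
sales_old_p1 (a hypothetical name):
=> SELECT START_REFRESH();        -- refresh the new projection in the background
=> SELECT MAKE_AHM_NOW();         -- advance the Ancient History Mark to now
=> DROP PROJECTION sales_old_p1;  -- drop the projection that is no longer needed
=> SELECT ANALYZE_STATISTICS(''); -- collect statistics for all projections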
Designing Superprojections
Superprojections have the following requirements:
l They must contain every column within the table.
l For a K-safe design, superprojections must either be replicated on all nodes within
the database cluster (for dimension tables) or paired with buddies and segmented
across all nodes (for very large tables and medium large tables). See Physical
Schema and High Availability With Projections in Vertica Concepts for an overview
of projections and how they are stored. See Designing for K-Safety for design
specifics.
To provide maximum usability, superprojections need to minimize storage requirements
while maximizing query performance. To achieve this, the sort order for columns in
superprojections is based on storage requirements and commonly used queries.
Sort Order Benefits
Column sort order is an important factor in minimizing storage requirements, and
maximizing query performance.
Minimize Storage Requirements
Minimizing storage saves on physical resources and increases performance by
reducing disk I/O. You can minimize projection storage by prioritizing low-cardinality
columns in its sort order. This reduces the number of rows Vertica stores and accesses
to retrieve query results.
After identifying projection sort columns, analyze their data and choose the most
effective encoding method. The Vertica optimizer gives preference to columns with run-
length encoding (RLE), so be sure to use it whenever appropriate. Run-length encoding
replaces sequences (runs) of identical values with a single pair that contains the value
and number of occurrences. Therefore, it is especially appropriate to use it for low-
cardinality columns whose run length is large.
Maximize Query Performance
You can facilitate query performance through column sort order as follows:
l Where possible, sort order should prioritize columns with the lowest cardinality.
l Do not sort projections on columns of type LONG VARBINARY and LONG
VARCHAR.
For more information
See Choosing Sort Order: Best Practices for examples that address storage and query
requirements.
Choosing Sort Order: Best Practices
When choosing sort orders for your projections, Vertica has several recommendations
that can help you achieve maximum query performance, as illustrated in the following
examples.
Combine RLE and Sort Order
When dealing with predicates on low-cardinality columns, use a combination of RLE
and sorting to minimize storage requirements and maximize query performance.
Suppose you have a students table containing the following values and encoding types:
Column      # of Distinct Values                          Encoded With
gender      2 (M or F)                                    RLE
pass_fail   2 (P or F)                                    RLE
class       4 (freshman, sophomore, junior, or senior)    RLE
name        10000 (too many to list)                      Auto
You might have queries similar to this one:
SELECT name FROM students WHERE gender = 'M' AND pass_fail = 'P' AND class = 'senior';
The fastest way to access the data is to work through the low-cardinality columns with
the smallest number of distinct values before the high-cardinality columns. The following
sort order minimizes storage and maximizes query performance for queries that have
equality restrictions on gender, class, pass_fail, and name. Specify the ORDER BY
clause of the projection as follows:
ORDER BY students.gender, students.pass_fail, students.class, students.name
In this example, the gender column is represented by two RLE entries, the pass_fail
column is represented by four entries, and the class column is represented by 16
entries, regardless of the cardinality of the students table. Vertica efficiently finds the
set of rows that satisfy all the predicates, resulting in a huge reduction of search effort for
RLE encoded columns that occur early in the sort order. Consequently, if you use low-
cardinality columns in local predicates, as in the previous example, put those columns
early in the projection sort order, in increasing order of distinct cardinality (that is, in
increasing order of the number of distinct values in each column).
If you sort this table with student.class first, you improve the performance of queries
that restrict only on the student.class column, and you improve the compression of
the student.class column (which contains the largest number of distinct values), but
the other columns do not compress as well. Determining which projection is better
depends on the specific queries in your workload, and their relative importance.
Storage savings with compression decrease as the cardinality of the column increases;
however, storage savings with compression increase as the number of bytes required to
store values in that column increases.
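To illustrate, a projection for the students example might combine this sort order with
explicit RLE encoding on the three low-cardinality columns. This is a sketch only; the
table and column definitions are assumed:
=> CREATE PROJECTION students_p (
     gender ENCODING RLE,
     pass_fail ENCODING RLE,
     class ENCODING RLE,
     name )
   AS SELECT gender, pass_fail, class, name FROM students
   ORDER BY gender, pass_fail, class, name
   SEGMENTED BY HASH(name) ALL NODES;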
Maximize the Advantages of RLE
To maximize the advantages of RLE encoding, use it only when the average run length
of a column is greater than 10 when sorted. For example, suppose you have a table with
the following columns, sorted in order of cardinality from low to high:
address.country, address.region, address.state, address.city, address.zipcode
The zipcode column might not have 10 sorted entries in a row with the same zip code,
so there is probably no advantage to run-length encoding that column, and it could
make compression worse. But there are likely to be more than 10 countries in a sorted
run length, so applying RLE to the country column can improve performance.
Put Lower Cardinality Column First for Functional Dependencies
In general, put columns that you use for local predicates (as in the previous example)
earlier in the sort order to make predicate evaluation more efficient. In addition, if a lower
cardinality column is uniquely determined by a higher cardinality column (like city_id
uniquely determining a state_id), it is always better to put the lower cardinality,
functionally determined column earlier in the sort order than the higher cardinality
column.
For example, in the following sort order, the Area_Code column is sorted before the
Number column in the customer_info table:
ORDER BY customer_info.Area_Code, customer_info.Number, customer_info.Address
Because Area_Code appears first in the sort order, the following query scans only the
section of the Number column where Area_Code is '978':
=> SELECT Address FROM customer_info WHERE Area_Code='978' AND Number='9780123457';
Sort for Merge Joins
When processing a join, the Vertica optimizer chooses from two algorithms:
l Merge join—If both inputs are pre-sorted on the join column, the optimizer chooses a
merge join, which is faster and uses less memory.
l Hash join—Using the hash join algorithm, Vertica uses the smaller (inner) joined
table to build an in-memory hash table on the join column. A hash join has no sort
requirement, but it consumes more memory because Vertica builds a hash table with
the values in the inner table. The optimizer chooses a hash join when projections are
not sorted on the join columns.
If both inputs are pre-sorted, merge joins do not have to do any pre-processing, making
the join perform faster. Vertica uses the term sort-merge join to refer to the case when at
least one of the inputs must be sorted prior to the merge join. Vertica sorts the inner
input side but only if the outer input side is already sorted on the join columns.
To give the Vertica query optimizer the option to use an efficient merge join for a
particular join, create projections on both sides of the join that put the join column first in
their respective projections. This is primarily important to do if both tables are so large
that neither table fits into memory. If all tables that a table will be joined to can be
expected to fit into memory simultaneously, the benefits of merge join over hash join are
sufficiently small that it probably isn't worth creating a projection for any one join column.
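For example, to give the optimizer the merge join option for two hypothetical large tables
fact and dim joined on cust_id, both projections could lead their sort order with the
join column (a sketch, assuming the tables already exist):
=> CREATE PROJECTION fact_p AS SELECT * FROM fact
   ORDER BY cust_id SEGMENTED BY HASH(cust_id) ALL NODES;
=> CREATE PROJECTION dim_p AS SELECT * FROM dim
   ORDER BY cust_id SEGMENTED BY HASH(cust_id) ALL NODES;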
Sort on Columns in Important Queries
If you have an important query, one that you run on a regular basis, you can save time
by putting the columns specified in the WHERE clause or the GROUP BY clause of that
query early in the sort order.
If that query uses a high-cardinality column such as Social Security number, you may
sacrifice storage by placing this column early in the sort order of a projection, but your
most important query will be optimized.
Sort Columns of Equal Cardinality By Size
If you have two columns of equal cardinality, put the column that is larger first in the sort
order. For example, a CHAR(20) column takes up 20 bytes, but an INTEGER column
takes up 8 bytes. By putting the CHAR(20) column ahead of the INTEGER column, your
projection compresses better.
Sort Foreign Key Columns First, From Low to High Distinct Cardinality
Suppose you have a fact table where the first four columns in the sort order make up a
foreign key to another table. For best compression, choose a sort order for the fact table
such that the foreign keys appear first, and in increasing order of distinct cardinality.
Other factors also apply to the design of projections for fact tables, such as partitioning
by a time dimension, if any.
In the following example, the table inventory stores inventory data, and product_key
and warehouse_key are foreign keys to the product_dimension and warehouse_
dimension tables:
=> CREATE TABLE inventory (
date_key INTEGER NOT NULL,
product_key INTEGER NOT NULL,
warehouse_key INTEGER NOT NULL,
...
);
=> ALTER TABLE inventory
ADD CONSTRAINT fk_inventory_warehouse FOREIGN KEY(warehouse_key)
REFERENCES warehouse_dimension(warehouse_key);
=> ALTER TABLE inventory
ADD CONSTRAINT fk_inventory_product FOREIGN KEY(product_key)
REFERENCES product_dimension(product_key);
The inventory table should be sorted by warehouse_key and then product_key, since
the cardinality of the warehouse_key column is probably lower than the cardinality of
the product_key column.
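A projection reflecting that recommendation might look like the following sketch (the
remaining columns of inventory are covered by SELECT *):
=> CREATE PROJECTION inventory_p AS
   SELECT * FROM inventory
   ORDER BY warehouse_key, product_key
   SEGMENTED BY HASH(product_key) ALL NODES;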
Projection Design for Merge Operations
The Vertica query optimizer automatically picks the best projections to use for queries,
but you can help improve the performance of MERGE operations by ensuring
projections are designed for optimal use.
Good projection design lets Vertica choose the faster merge join between the target and
source tables without having to perform additional sort and data transfer operations.
HPE recommends that you first use Database Designer to generate a comprehensive
design and then customize projections, as needed. Be sure to first review the topics in
Planning Your Design. Failure to follow those considerations could result in non-
functioning projections.
In the following MERGE statement, Vertica inserts and/or updates records from the
source table's column b into the target table's column a:
=> MERGE INTO target t USING source s ON t.a = s.b WHEN ....
Vertica can use a local merge join if tables target and source use one of the following
projection designs, where their inputs are pre-sorted through the CREATE PROJECTION
ORDER BY clause:
l Replicated projections that are sorted on:
n Column a for target
n Column b for source
l Segmented projections that are identically segmented on:
n Column a for target
n Column b for source
n Corresponding segmented columns
Tip: For best merge performance, the source table should be smaller than the target
table.
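For example, the following sketch gives target and source identically segmented
projections that are pre-sorted on the join columns, so the MERGE statement above can
use a local merge join (table definitions are assumed):
=> CREATE PROJECTION target_p AS SELECT * FROM target
   ORDER BY a SEGMENTED BY HASH(a) ALL NODES;
=> CREATE PROJECTION source_p AS SELECT * FROM source
   ORDER BY b SEGMENTED BY HASH(b) ALL NODES;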
See Also
l Optimized Versus Non-Optimized MERGE
Prioritizing Column Access Speed
If you measure and set the performance of storage locations within your cluster, Vertica
uses this information to determine where to store columns based on their rank. For more
information, see Setting Storage Performance.
How Columns are Ranked
Vertica stores columns included in the projection sort order on the fastest available
storage locations. Columns not included in the projection sort order are stored on slower
disks. Columns for each projection are ranked as follows:
l Columns in the sort order are given the highest priority (numbers > 1000).
l The last column in the sort order is given the rank number 1001.
l The next-to-last column in the sort order is given the rank number 1002, and so on
until the first column in the sort order is given 1000 + # of sort columns.
l The remaining columns are given numbers from 1000–1, starting with 1000 and
decrementing by one per column.
Vertica then stores columns on disk from the highest ranking to the lowest ranking. It
places highest-ranking columns on the fastest disks and the lowest-ranking columns on
the slowest disks.
Overriding Default Column Ranking
You can modify which columns are stored on fast disks by manually overriding the
default ranks for these columns. To accomplish this, set the ACCESSRANK keyword in the
column list. Make sure to use an integer that is not already being used for another
column. For example, if you want to give a column the fastest access rank, use a
number that is significantly higher than 1000 + the number of sort columns. This allows
you to enter more columns over time without bumping into the access rank you set.
The following example sets access rank to 1500 for the column C1_retail_sales_
fact_store_key:
CREATE PROJECTION retail_sales_fact_P1 (
C1_retail_sales_fact_store_key ENCODING RLE ACCESSRANK 1500,
C2_retail_sales_fact_pos_transaction_number,
C3_retail_sales_fact_sales_dollar_amount,
C4_retail_sales_fact_cost_dollar_amount )
AS SELECT store_key, pos_transaction_number, sales_dollar_amount, cost_dollar_amount
FROM store.store_sales_fact; -- assumes the VMart store.store_sales_fact anchor table
Managing Users and Privileges
Database users should have access to only the database resources they need to
perform their tasks. For example, most users should be able to read data but not modify
or insert new data, while other users might need more permissive access, such as the
right to create and modify schemas, tables, and views, as well as rebalance nodes on a
cluster and start or stop a database. It is also possible to allow certain users to grant
other users access to the appropriate database resources.
Database privileges control what database objects users can access and change in
the database. To prevent unauthorized access, a superuser limits access to what is
needed, granting privileges directly to users or to roles through a series of GRANT
statements. Roles can then be granted to users, as well as to other roles.
This section introduces the privilege role model in Vertica and describes how to create
and manage users.
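For example, a superuser might grant read access through a role rather than to each
user individually. This sketch assumes the VMart store schema and a hypothetical role
and user:
=> CREATE ROLE reader_role;
=> GRANT USAGE ON SCHEMA store TO reader_role;
=> GRANT SELECT ON store.store_sales_fact TO reader_role;
=> GRANT reader_role TO analyst;
The analyst user must then enable the role, for example with SET ROLE reader_role,
before the granted privileges take effect.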
See Also
l About Database Privileges
l About Database Roles
l GRANT Statements
l REVOKE Statements
About Database Users
Every Vertica database has one or more users. When users connect to a database, they
must log on with valid credentials (username and password) that a superuser defined in
the database.
Database users own the objects they create in a database, such as tables, procedures,
and storage locations.
Note: By default, users have the right to create temporary tables in a database.
See Also
l Creating a Database User
l CREATE USER
l About MC Users
Types of Database Users
In a Vertica database, there are three types of users:
l Database administrator (DBADMIN)
l Object owner
l Everyone else (PUBLIC)
Note: External to a Vertica database, an MC administrator can create users through
the Management Console and grant them database access. See About MC Users
for details.
Database Administration User
When you install a new Vertica Analytics Platform database, a database administration
user with access to the following roles is created:
l DBADMIN Role
l DBDUSER Role
l PSEUDOSUPERUSER Role
Access to these roles allows this user to perform all database operations. Assign a
name to this user during installation using the --dba-user option (use -u for upgrades).
For example:
--dba-user mydba
This example creates a database administration user called mydba. The username you
use here must already exist on your operating system. See Installing Vertica with the
install_vertica Script.
If you do not use --dba-user during installation, the database administrator user is
named DBADMIN by default.
Note: Do not confuse the DBADMIN user with the DBADMIN Role. The
DBADMIN role is a set of privileges you assign to a specific user based on the
user's position in your organization.
The Vertica Analytics Platform Database Administration user is also called a superuser
throughout the Vertica Analytics Platform documentation. Do not confuse this superuser
with the Linux superuser that manages the Linux operating system.
Create a Database Administration User in the Vertica Analytics Platform
As the Database Administration user you can create other users with the same
privileges:
1. Create a user:
=> CREATE USER DataBaseAdmin2;
CREATE USER
2. Grant the appropriate roles to the new user DataBaseAdmin2:
=> GRANT dbduser, dbadmin, pseudosuperuser to DataBaseAdmin2;
GRANT ROLE
The user DataBaseAdmin2 now has the same privileges granted to the original
Database Administration user.
3. As the DataBaseAdmin2 user, enable the roles using SET ROLE:
=> \c VMart DataBaseAdmin2
You are now connected to database "VMart" as user "DataBaseAdmin2".
=> SET ROLE dbadmin, dbduser, pseudosuperuser;
SET ROLE
4. Confirm the roles are enabled:
=> SHOW ENABLED ROLES;
name | setting
-------------------------------------------------
enabled roles | dbduser, dbadmin, pseudosuperuser
See Also
l DBADMIN Role
l PSEUDOSUPERUSER Role
l PUBLIC Role
Object Owner
An object owner is the user who creates a particular database object and can perform
any operation on that object. By default, only an owner (or a superuser) can act on a
database object. In order to allow other users to use an object, the owner or superuser
must grant privileges to those users using one of the GRANT Statements.
Note: Object owners are PUBLIC users for objects that other users own.
See About Database Privileges for more information.
PUBLIC User
Any user who is not a DBA (superuser) or the owner of an object is a PUBLIC user with respect to that object.
Note: Object owners are PUBLIC users for objects that other users own.
Newly-created users do not have access to schema PUBLIC by default. Make sure to
GRANT USAGE ON SCHEMA PUBLIC to all users you create.
See Also
l PUBLIC Role
Creating a Database User
This procedure describes how to create a new user on the database.
1. From vsql, connect to the database as a superuser.
2. Issue the CREATE USER statement with optional parameters.
3. Run a series of GRANT Statements to grant the new user privileges.
Notes
l Newly-created users do not have access to schema PUBLIC by default. Make sure to
GRANT USAGE ON SCHEMA PUBLIC to all users you create.
l By default, database users have the right to create temporary tables in the database.
l If you plan to create users on Management Console, the database user account
needs to exist before you can associate an MC user with the database.
l You can change information about a user, such as his or her password, by using the
ALTER USER statement. If you want to configure a user to not have any password
authentication, you can set the empty password '' in CREATE or ALTER USER
statements, or omit the IDENTIFIED BY parameter in CREATE USER.
Example
The following series of commands adds user Fred to a database with password
'password'. The second command grants USAGE privileges to Fred on the public
schema:
=> CREATE USER Fred IDENTIFIED BY 'password';
=> GRANT USAGE ON SCHEMA PUBLIC TO Fred;
User names created with double-quotes are case sensitive. For example:
=> CREATE USER "FrEd1";
In the above example, the logon name must be an exact match. If the user name was
created without double-quotes (for example, FRED1), then the user can log on as
FRED1, FrEd1, fred1, and so on.
ALTER USER and DROP USER syntax is not case sensitive.
See Also
l Granting and Revoking Privileges
l Granting Access to Database Roles
l Creating an MC User
Locking/Unlocking a User's Database Access
A superuser can manually lock an existing database user's account with the ALTER
USER statement. For example, the following command prevents user Fred from logging
in to the database:
=> ALTER USER Fred ACCOUNT LOCK;
=> \c - Fred
FATAL 4974: The user account "Fred" is locked
HINT: Please contact the database administrator
To grant Fred database access, use UNLOCK syntax with the ALTER USER command:
=> ALTER USER Fred ACCOUNT UNLOCK;
=> \c - Fred
You are now connected as user "Fred".
Using CREATE USER to Lock an Account
Although not as common, you can create a new user with a locked account; for
example, you might want to set up an account for a user who doesn't need immediate
database access, as in the case of an employee who will join the company at a future
date.
=> CREATE USER Bob ACCOUNT LOCK;
CREATE USER
CREATE USER also supports UNLOCK syntax; however, UNLOCK is the default, so
you don't need to specify the keyword when you create a new user to whom you want to
grant immediate database access.
Locking an Account Automatically
Instead of manually locking an account, a superuser can automate account locking by
setting a maximum number of failed login attempts through the CREATE PROFILE
statement. See Profiles.
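For example, the following sketch creates a profile that locks an account after three
consecutive failed login attempts and assigns it to user Fred (the profile name is
hypothetical):
=> CREATE PROFILE secure_profile LIMIT FAILED_LOGIN_ATTEMPTS 3;
=> ALTER USER Fred PROFILE secure_profile;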
Changing a User's Password
A superuser can change another user's database account, including resetting a
password, with the ALTER USER statement.
Making changes to a database user account with ALTER USER does not affect current sessions.
=> ALTER USER Fred IDENTIFIED BY 'newpassword';
In the above command, Fred's password is now newpassword.
Note: Non-DBA users can change their own passwords using the IDENTIFIED BY
'new-password' option along with the REPLACE 'old-password' clause. See ALTER
USER for details.
Changing a User's MC Password
On MC, users with ADMIN or IT privileges can reset a user's non-LDAP password from
the MC interface.
Non-LDAP passwords on MC are for MC access only and are not related to a user's
logon credentials on the Vertica database.
1. Sign in to Management Console and navigate to MC Settings > User
management.
2. Click to select the user to modify and click Edit.
3. Click Edit password and enter the new password twice.
4. Click OK and then click Save.
About Database Privileges
When a database object is created, such as a schema, table, or view, that object is
assigned an owner—the person who executed the CREATE statement. By default,
database administrators (superusers) or object owners are the only users who can do
anything with the object.
In order to allow other users to use an object, or remove a user's right to use an object,
the authorized user must grant another user privileges on the object.
Privileges are granted (or revoked) through a collection of GRANT/REVOKE statements
that assign the privilege—a type of permission that lets users perform an action on a
database object, such as:
l Create a schema
l Create a table (in a schema)
l Create a view
l View (select) data
l Insert, update, or delete table data
l Drop tables or schemas
l Run procedures
Before Vertica executes a statement, it determines if the requesting user has the
necessary privileges to perform the operation.
For more information about the privileges associated with these resources, see
Privileges That Can Be Granted on Objects.
Note: Vertica logs information about each grant (grantor, grantee, privilege, and so
on) in the V_CATALOG.GRANTS system table.
See Also
l GRANT Statements
l REVOKE Statements
Inherited Privileges Overview
Inherited privileges allow you to grant privileges at the schema level. This enables
privileges to be granted automatically to new tables or views in the schema. Existing
tables and views are unchanged when you alter the schema to include or exclude
inherited privileges. Using inherited privileges eliminates the need to apply the same
privileges to each individual table or view in the schema.
To assign inherited privileges, you must be the owner of the schema or a superuser.
Assign inherited privileges using the following SQL statements:
l GRANT Statements
l CREATE SCHEMA
l ALTER SCHEMA
l CREATE TABLE
l ALTER TABLE
l CREATE VIEW
l ALTER VIEW
Granting Inherited Privileges from One User to Another
The following steps describe a process for user1 to enable inherited privileges for user2.
1. The database user, user1, creates a schema (schema1), and a table (table1) in
schema1:
user1=> CREATE SCHEMA schema1;
user1=> CREATE TABLE schema1.table1 (id int);
2. User user1 grants USAGE and CREATE privileges on schema1 to user2:
user1=> GRANT USAGE ON SCHEMA schema1 to user2;
user1=> GRANT CREATE ON SCHEMA schema1 to user2;
3. The user2 user queries schema1.table1, but the query fails:
user2=> SELECT * FROM schema1.table1;
ERROR 4367: Permission denied for relation table1
4. The user user1 grants SELECT ON SCHEMA privilege on schema1 to user2:
user1=> GRANT SELECT ON SCHEMA schema1 to user2;
5. Next, user1 uses ALTER TABLE to include SCHEMA privileges to table1:
user1=> ALTER TABLE schema1.table1 INCLUDE SCHEMA PRIVILEGES;
6. The user2 query now succeeds:
user2=> SELECT * FROM schema1.table1;
id
---
(0 rows)
7. User user1 now uses ALTER SCHEMA to include privileges so that all tables created in
schema1 inherit schema privileges:
user1=> ALTER SCHEMA schema1 DEFAULT INCLUDE PRIVILEGES;
user1=> CREATE TABLE schema1.table2 (id int);
8. With Inherited Privileges enabled, user2 can query table2 without user1 having to
specifically grant privileges on table2:
user2=> SELECT * FROM schema1.table2;
id
---
(0 rows)
Enable or Disable Inherited Privileges at the Database Level
Set the disableinheritedprivileges configuration parameter to 0 to enable
inherited privileges:
=> ALTER DATABASE [database name] SET disableinheritedprivileges = 0;
Set the parameter to 1 to disable inherited privileges:
=> ALTER DATABASE [database name] SET disableinheritedprivileges = 1;
Grant Inherited Privileges
Grant inherited privileges at the schema level. When inherited privileges are enabled, all
privileges granted to the schema are automatically granted to all newly created tables or
views in the schema. Existing tables or views remain unchanged when you alter the
schema to include or exclude inherited privileges.
By default, inherited privileges are enabled at the database level and disabled at the
schema level (unless you indicate otherwise while running CREATE SCHEMA). See
Enable or Disable Inherited Privileges at the Database Level for more information. To
apply inherited privileges, you must meet one of the following conditions:
l Be the owner of the object
l Be a superuser
Inherit Privileges on a Schema
Use the CREATE SCHEMA or ALTER SCHEMA SQL statements to apply inherited
privileges to a schema. The tables and views in that schema then inherit any privileges
granted to the schema by default.
This example shows how to create a new schema with inherited privileges. The
DEFAULT parameter sets the default behavior so that all new tables and views created
in this schema automatically inherit the schema's privileges:
=> CREATE SCHEMA s1 DEFAULT INCLUDE PRIVILEGES;
This example shows how to modify an existing schema to enable inherited privileges:
=> ALTER SCHEMA s1 DEFAULT INCLUDE PRIVILEGES;
The following message appears when you specify INCLUDE PRIVILEGES while
inherited privileges are disabled at the database level:
Inherited privileges are globally disabled; schema parameter is set but has no
effect.
See Enable or Disable Inherited Privileges at the Database Level to enable Inherited
Privileges at the database level.
Inherit Privileges on a Table or Flex Table
You can specify an individual table or flex table to inherit privileges from the schema.
Use CREATE TABLE or ALTER TABLE SQL statements to enable inherited privileges
for a table. The table-level flag takes priority over the schema flag, while the
database-level setting takes priority over both.
This example shows creating a new table with inherited privileges:
=> CREATE TABLE s1.t1 ( x int) INCLUDE SCHEMA PRIVILEGES;
This example shows how to modify an existing table to enable inherited privileges:
=> ALTER TABLE s1.t1 INCLUDE SCHEMA PRIVILEGES;
If you run CREATE TABLE, CREATE TABLE LIKE, or CREATE TABLE
AS SELECT in a schema with inherited privileges set, the following informational
warning appears:
=> CREATE TABLE s1.t1 ( x int);
WARNING: Table <table_name> will include privileges from schema <schema_name>
Note that this message does not appear when you add the
INCLUDE SCHEMA PRIVILEGES clause.
Exclude Privileges on a Table
You can exclude a table in a schema with inherited privileges so that table does not
inherit the schema's privileges. Use CREATE TABLE or ALTER TABLE
SQL statements to exclude inherited privileges for a table.
This example shows creating a new table and excluding schema privileges:
=> CREATE TABLE s1.t1 ( x int) EXCLUDE SCHEMA PRIVILEGES;
This example shows how to modify an existing table to exclude inherited privileges:
=> ALTER TABLE s1.t1 EXCLUDE SCHEMA PRIVILEGES;
Include Privileges on a View
You can specify a view to inherit privileges from the schema. Use the CREATE VIEW or
ALTER VIEW SQL statements to enable inherited privileges for a view.
This example shows creating a view with inherited privileges enabled (the required AS
clause selects from the example table created earlier):
=> CREATE VIEW s1.view1 INCLUDE SCHEMA PRIVILEGES AS SELECT * FROM s1.t1;
This example shows how to modify an existing view to enable inherited privileges:
=> ALTER VIEW s1.view1 INCLUDE SCHEMA PRIVILEGES;
Exclude Privileges on a View
You can exclude a view in a schema with inherited privileges so that view does not
inherit the schema's privileges. Use CREATE VIEW or ALTER VIEW SQL statements to
exclude inherited privileges for a view.
This example shows creating a new view and excluding schema privileges (the required
AS clause is included):
=> CREATE VIEW view1 EXCLUDE SCHEMA PRIVILEGES AS SELECT * FROM s1.t1;
This example shows how to modify an existing view to exclude inherited privileges:
=> ALTER VIEW view1 EXCLUDE SCHEMA PRIVILEGES;
Default Privileges for All Users
To set the minimum level of privilege for all users, Vertica has the special PUBLIC Role,
which it grants to each user automatically. This role is automatically enabled, but the
database administrator or a superuser can also grant higher privileges to users
separately using GRANT statements.
The following topics discuss those higher privileges.
Default Privileges for MC Users
Privileges on Management Console (MC) are managed through roles, which determine
a user's access to MC and to MC-managed Vertica databases through the MC interface.
MC privileges do not alter or override Vertica privileges or roles. See About MC
Privileges and Roles for details.
Privileges Required for Common Database Operations
This topic lists the required privileges for database objects in Vertica.
Unless otherwise noted, superusers can perform all of the operations shown in the
following tables without any additional privilege requirements. Object owners have the
necessary rights to perform operations on their own objects, by default.
Schemas
The PUBLIC schema is present in any newly-created Vertica database, and newly-
created users have only USAGE privilege on PUBLIC. A database superuser must
explicitly grant new users CREATE privileges, as well as grant them individual object
privileges so the new users can create or look up objects in the PUBLIC schema.
Operation Required Privileges
CREATE SCHEMA
CREATE privilege on database
DROP SCHEMA
Schema owner
ALTER SCHEMA RENAME
CREATE privilege on database
Tables
Operation Required Privileges
CREATE TABLE
CREATE privilege on schema
Note: Referencing sequences in the CREATE TABLE
statement requires the following privileges:
l SELECT privilege on sequence object
l USAGE privilege on sequence schema
DROP TABLE
USAGE privilege on the schema that contains the
table or schema owner
TRUNCATE TABLE
USAGE privilege on the schema that contains the
table or schema owner
ALTER TABLE ADD/DROP/
RENAME/ALTER-TYPE COLUMN
USAGE privilege on the schema that contains the
table
ALTER TABLE ADD/DROP CONSTRAINT
USAGE privilege on the schema that contains the
table
ALTER TABLE PARTITION (REORGANIZE)
USAGE privilege on the schema that contains the
table
ALTER TABLE RENAME
USAGE and CREATE privilege on the schema that
contains the table
ALTER TABLE SET SCHEMA
l CREATE privilege on new schema
l USAGE privilege on the old schema
SELECT
l SELECT privilege on table
l USAGE privilege on schema that contains the table
INSERT
l INSERT privilege on table
l USAGE privilege on schema that contains the table
DELETE
l DELETE privilege on table
l USAGE privilege on schema that contains the table
l SELECT privilege on the referenced table when
executing a DELETE statement that references
table column values in a WHERE or SET clause
UPDATE
l UPDATE privilege on table
l USAGE privilege on schema that contains the table
l SELECT privilege on the table when executing an
UPDATE statement that references table column
values in a WHERE or SET clause
REFERENCES
l REFERENCES privilege on table to create foreign
key constraints that reference this table
l USAGE privileges on schema that contains the
constrained table and the source of the foreign key
ANALYZE_STATISTICS()
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
ANALYZE_HISTOGRAM()
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
DROP_STATISTICS()
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
DROP_PARTITION()
USAGE privilege on schema that contains the table
Views
Operation Required Privileges
CREATE VIEW
l CREATE privilege on the schema to contain a view
l SELECT privileges on base objects (tables/views)
l USAGE privileges on schema that contains the base objects
DROP VIEW
USAGE privilege on schema that contains the view or schema
owner
SELECT ... FROM VIEW
l SELECT privilege on view
l USAGE privilege on the schema that contains the view
Note: Privileges required on base objects for view owner must be
directly granted, not through roles:
l View owner must have SELECT ... WITH GRANT OPTION
privileges on the view's anchor tables or views if non-owner
runs a SELECT query on the view. This privilege must be
directly granted to the owner, not through a role.
l View owner must have SELECT privilege directly granted (not
through a role) on a view's base objects (table or view) if owner
runs a SELECT query on the view.
Projections
Operation Required Privileges
CREATE PROJECTION
l SELECT privilege on anchor tables
l USAGE privilege on schema that contains anchor tables or
schema owner
l CREATE privilege on schema to contain the projection
Note: If a projection is implicitly created with the table, no
additional privilege is needed other than privileges for table
creation.
AUTO/DELAYED PROJECTION
On projections created during INSERT..SELECT or COPY
operations:
l SELECT privilege on anchor tables
l USAGE privilege on schema that contains anchor tables
ALTER PROJECTION RENAME
USAGE and CREATE privilege on schema that contains the
projection
DROP PROJECTION
USAGE privilege on schema that contains the projection or
schema owner
External Procedures
Operation Required Privileges
CREATE PROCEDURE
Superuser
DROP PROCEDURE
Superuser
EXECUTE
l EXECUTE privilege on procedure
l USAGE privilege on schema that contains the
procedure
Libraries
Operation Required Privileges
CREATE LIBRARY
Superuser
DROP LIBRARY
Superuser
User-Defined Functions
The following abbreviations are used in the UDF table:
l UDF = Scalar
l UDT = Transform
l UDAnF = Analytic
l UDAF = Aggregate
Operation Required Privileges
CREATE FUNCTION (SQL)
CREATE FUNCTION (UDF)
CREATE TRANSFORM FUNCTION (UDT)
CREATE ANALYTIC FUNCTION (UDAnF)
CREATE AGGREGATE FUNCTION (UDAF)
l CREATE privilege on schema to contain the
function
l USAGE privilege on base library (if
applicable)
DROP FUNCTION
DROP TRANSFORM FUNCTION
DROP ANALYTIC FUNCTION
DROP AGGREGATE FUNCTION
l Superuser or function owner
l USAGE privilege on schema that contains
the function
ALTER FUNCTION RENAME TO
USAGE and CREATE privilege on schema that
contains the function
ALTER FUNCTION SET SCHEMA
l USAGE privilege on schema that currently
contains the function (old schema)
l CREATE privilege on the schema to which
the function will be moved (new schema)
EXECUTE (SQL/UDF/UDT/UDAF/UDAnF) function
l EXECUTE privilege on function
l USAGE privilege on schema that contains
the function
Sequences
Operation Required Privileges
CREATE SEQUENCE
CREATE privilege on schema to contain the sequence
Note: Referencing sequence in the CREATE TABLE
statement requires SELECT privilege on sequence object
and USAGE privilege on sequence schema.
CREATE TABLE with SEQUENCE
l SELECT privilege on sequence
l USAGE privilege on sequence schema
DROP SEQUENCE
USAGE privilege on schema containing the sequence or
schema owner
ALTER SEQUENCE RENAME TO
USAGE and CREATE privileges on schema
ALTER SEQUENCE SET SCHEMA
l USAGE privilege on the schema that currently contains
the sequence (old schema)
l CREATE privilege on new schema to contain the
sequence
CURRVAL()
NEXTVAL()
l SELECT privilege on sequence
l USAGE privilege on sequence schema
Resource Pools
Operation Required Privileges
CREATE RESOURCE POOL
Superuser
ALTER RESOURCE POOL
Superuser on the resource pool to alter:
l MAXMEMORYSIZE
l PRIORITY
l QUEUETIMEOUT
UPDATE privilege on the resource pool to alter:
l PLANNEDCONCURRENCY
l SINGLEINITIATOR
l MAXCONCURRENCY
SET SESSION RESOURCE_POOL
l USAGE privilege on the resource pool
l Users can only change their own resource pool setting
using ALTER USER syntax
DROP RESOURCE POOL
Superuser
Users/Profiles/Roles
Operation Required Privileges
CREATE USER
CREATE PROFILE
CREATE ROLE
Superuser
ALTER USER
ALTER PROFILE
ALTER ROLE RENAME
Superuser
DROP USER
DROP PROFILE
DROP ROLE
Superuser
Object Visibility
You can use one or a combination of vsql \d [pattern] meta-commands and SQL system
tables to view the objects on which you have privileges:
l Use \dn [pattern] to view schema names and owners
l Use \dt [pattern] to view all tables in the database, as well as the system table V_
CATALOG.TABLES
l Use \dj [pattern] to view projections showing the schema, projection name, owner,
and node, as well as the system table V_CATALOG.PROJECTIONS
Operation Required Privileges
Look up schema
At least one privilege on the schema that contains the object
Look up object in schema or in system tables
l USAGE privilege on the schema
l At least one privilege on any of the following objects: TABLE, VIEW, FUNCTION,
PROCEDURE, SEQUENCE
Look up projection
l At least one privilege on all anchor tables
l USAGE privilege on the schema of all anchor tables
Look up resource pool
SELECT privilege on the resource pool
Existence of object
USAGE privilege on the schema that contains the object
I/O Operations
Operation Required Privileges
CONNECT
DISCONNECT
None
EXPORT TO Vertica
l SELECT privileges on the source table
l USAGE privilege on source table schema
l INSERT privileges for the destination table in target
database
l USAGE privilege on destination table schema
COPY FROM Vertica
l SELECT privileges on the source table
l USAGE privilege on source table schema
l INSERT privileges for the destination table in target
database
l USAGE privilege on destination table schema
COPY FROM file
Superuser
COPY FROM STDIN
l INSERT privilege on table
l USAGE privilege on schema
COPY LOCAL
l INSERT privilege on table
l USAGE privilege on schema
Comments
Operation Required Privileges
COMMENT ON { object } Object owner or superuser
where object is one of: AGGREGATE FUNCTION, ANALYTIC FUNCTION, COLUMN,
CONSTRAINT, FUNCTION, LIBRARY, NODE, PROJECTION, SCHEMA, SEQUENCE,
TABLE, TRANSFORM FUNCTION, VIEW
Transactions
Operation Required Privileges
COMMIT
None
ROLLBACK
None
RELEASE SAVEPOINT
None
SAVEPOINT
None
Sessions
Operation Required Privileges
SET { parameter } None
where parameter is one of: DATESTYLE, ESCAPE_STRING_WARNING,
INTERVALSTYLE, LOCALE, ROLE, SEARCH_PATH, SESSION AUTOCOMMIT,
SESSION CHARACTERISTICS, SESSION MEMORYCAP, SESSION RESOURCE
POOL, SESSION RUNTIMECAP, SESSION TEMPSPACE,
STANDARD_CONFORMING_STRINGS, TIMEZONE
SHOW { name | ALL } None
Tuning Operations
Operation Required Privileges
PROFILE
Same privileges required to run the query being profiled
EXPLAIN
Same privileges required to run the query for which you use the
EXPLAIN keyword
Privileges That Can Be Granted on Objects
The following table provides an overview of privileges that can be granted on (or
revoked from) database objects in Vertica:
See Also
l GRANT Statements
l REVOKE Statements
Database Privileges
Only a database superuser can create a database. In a new database, the PUBLIC Role
is granted USAGE on the automatically-created PUBLIC schema. It is up to the
superuser to grant further privileges to users and roles.
The only privilege a superuser can grant on the database itself is CREATE, which
allows the user to create a new schema in the database. For details on granting and
revoking privileges on a database, see the GRANT (Database) and REVOKE
(Database) topics in the SQL Reference Manual.
Privilege    Grantor      Description
CREATE       Superuser    Allows a user to create a schema.
Schema Privileges
By default, only a superuser and the schema owner have privileges to create objects
within a schema. Additionally, only the schema owner or a superuser can drop or alter a
schema. See DROP SCHEMA and ALTER SCHEMA.
You must grant all new users access to the PUBLIC schema by running GRANT
USAGE ON SCHEMA PUBLIC. Then grant new users CREATE privileges and
privileges to individual objects in the schema. This enables new users to create or
locate objects in the PUBLIC schema. Without USAGE privilege, objects in the schema
cannot be used or altered, even by the object owner.
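For example, assuming a newly created user report_user (a hypothetical name), a
superuser might run:
=> GRANT USAGE ON SCHEMA PUBLIC TO report_user;
=> GRANT CREATE ON SCHEMA PUBLIC TO report_user;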
CREATE gives the schema owner or user WITH GRANT OPTION permission to create
new objects in the schema, including renaming an object in the schema or moving an
object into this schema.
Note: The schema owner is typically the user who creates the schema. However, a
superuser can create a schema and assign ownership of the schema to a different
user at creation.
All other access to the schema and its objects must be explicitly granted to users or
roles by the superuser or schema owner. This prevents unauthorized users from
accessing the schema and its objects. A user can be granted one of the following
privileges through the GRANT statement:
Privilege Description
CREATE
Allows the user to create new objects within the schema, rename existing
objects, and move objects into the schema from other schemas.
USAGE
Permission to select, access, alter, and drop objects in the schema. The
user must also be granted access to the individual objects in order to use
or alter them. For example, a user needs USAGE on the schema and
SELECT on a table to select data from that table. You receive an error
message if you attempt to query a table that you have SELECT privileges
on but lack USAGE privileges for the schema that contains the table.
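For example, to let a user query a table, grant privileges at both levels; the schema
and table names below are hypothetical:
=> GRANT USAGE ON SCHEMA online_sales TO Bob;
GRANT PRIVILEGE
=> GRANT SELECT ON online_sales.call_center TO Bob;
GRANT PRIVILEGE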
Note the following about error messages related to granting privileges on a schema or
an object:
l If you attempt to grant a privilege on a schema but do not have USAGE privilege
for the schema, you receive an error message that the schema does not exist.
l If you attempt to grant a privilege on an object within a schema, and you have
USAGE privilege on the schema but no privilege on the individual object, you
receive an error denying permission for that object.
Schema Privileges and the Search Path
The search path determines the schema to which unqualified objects in SQL statements
belong.
When a user specifies an object name in a statement without supplying the schema in
which the object exists (called an unqualified object name), Vertica has two different
behaviors, depending on whether the object is being accessed or created.

Creating an object: When a user creates an object—such as a table, view, sequence,
procedure, or function—with an unqualified name, Vertica tries to create the object in
the current schema (the first schema in the schema search path), returning an error if
the schema does not exist or if the user does not have CREATE privileges in that
schema. Use the SHOW search_path command to view the current search path:

=> SHOW search_path;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

Note: The first schema in the search path is the current schema, and the $user
setting is a placeholder that resolves to the current user's name.

Accessing or altering an object: When a user accesses or alters an object with an
unqualified name, those statements search through all schemas for a matching object,
starting with the current schema, where:
l The object name in the schema matches the object name in the statement.
l The user has USAGE privileges on the schema in order to access objects in it.
l The user has at least one privilege on the object.
See Also
l Setting Search Paths
l GRANT (Schema)
l REVOKE (Schema)
Table Privileges
By default, only a superuser and the table owner (typically the person who creates a
table) have access to a table. The ability to drop or alter a table is also reserved for a
superuser or table owner. This privilege cannot be granted to other users.
All other users or roles (including the user who owns the schema, if he or she does not
also own the table) must be explicitly granted access to the table by a superuser, the
table owner, or a user who holds the privilege WITH GRANT OPTION.
These are the table privileges a superuser or table owner can grant:
Privilege Description
SELECT
Permission to run SELECT queries on the table.
INSERT
Permission to INSERT data into the table.
DELETE
Permission to DELETE data from the table, as well as SELECT privilege
on the table when executing a DELETE statement that references table
column values in a WHERE or SET clause.
UPDATE
Permission to UPDATE and change data in the table, as well as SELECT
privilege on the table when executing an UPDATE statement that
references table column values in a WHERE or SET clause.
REFERENCES
Permission to CREATE foreign key constraints that reference this table.
To use any of the above privileges, the user must also have USAGE privileges on the
schema that contains the table. See Schema Privileges for details.
Referencing a sequence in a CREATE TABLE statement requires the following
privileges:
l SELECT privilege on the sequence
l USAGE privilege on the schema that contains the sequence
For details on granting and revoking table privileges, see GRANT (Table) and REVOKE
(Table) in the SQL Reference Manual.
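For example, assuming Bob already has USAGE on the public schema, the following
sketch grants him read and write access to the applog table used elsewhere in this
guide:
=> GRANT SELECT, INSERT ON applog TO Bob;
GRANT PRIVILEGE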
Projection Privileges
Because projections are the underlying storage construct for tables, they are atypical in
that they do not have an owner or privileges associated with them directly. Instead, the
privileges to create, access, or alter a projection are based on the anchor tables that the
projection references, as well as the schemas that contain them.
All queries in Vertica obtain data from projections directly or indirectly. In both cases, to
run a query, you must have SELECT privileges on the table or tables that the
projections reference, and USAGE privileges on all schemas that contain those tables.
You can create projections in two ways: explicitly and implicitly.
Explicit Projection Creation and Privileges
To explicitly create a projection using the CREATE PROJECTION statement, you must
be a superuser or owner of the anchor table or have the following privileges:
l CREATE privilege on the schema in which the projection is created
l SELECT privilege on all the anchor tables referenced by the projection
l USAGE privilege on all the schemas that contain the anchor tables referenced by the
projection
Only the anchor table owner can drop explicitly created projections or pre-join
projections. Explicitly created projections can be live aggregate projections, including
Top-K projections and projections with expressions.
Implicit Projection Creation and Privileges
When you insert data into a table, Vertica automatically creates a superprojection for the
table.
Superprojections do not require any additional privileges to create or drop, other than
privileges for table creation. Users who can create a table or drop a table can also
create and drop the associated superprojection.
Selecting From Projections
Vertica does not associate privileges directly with projections. Privileges can only be
granted on logical storage containers: tables and views.
The following privileges are required to select from a projection:
l SELECT privilege on each of the anchor tables referenced by the projection
l USAGE privilege on the corresponding containing schemas
View Privileges
A view is a stored query that dynamically accesses and computes data from the
database at execution time. Use \dv in vsql to display available views. By default, only
the following users have privileges to access a view's base object:
l Superuser
l View owner—typically, the view creator
To execute a query that contains a view, you must have:
l SELECT privileges assigned with GRANT (View)
l USAGE privileges on the view's schema, assigned with GRANT (Schema).
You can assign view privileges to other users and roles using GRANT (View). For
example:
l Assign GRANT ALL privileges to a user or role.
=> GRANT all privileges on view1 to role1 with grant option;
l Grant a role to another role to pass along view privileges. In the following
example, privileges that are assigned to role1 are also granted to role2:
=> CREATE ROLE role1;
=> CREATE ROLE role2;
=> GRANT role1 to role2;
See Also
GRANT (View)
REVOKE (View)
Sequence Privileges
To create a sequence, a user must have CREATE privileges on the schema that contains
the sequence. Only the owner and superusers can initially access the sequence. All
other users must be granted access to the sequence by a superuser or the owner.
Only the sequence owner (typically the person who creates the sequence) can drop
or rename a sequence, or change the schema in which the sequence resides:
l DROP SEQUENCE: Only a sequence owner or schema owner can drop a
sequence.
l ALTER SEQUENCE RENAME TO: A sequence owner must have USAGE and
CREATE privileges on the schema that contains the sequence to be renamed.
l ALTER SEQUENCE SET SCHEMA: A sequence owner must have USAGE
privilege on the schema that currently contains the sequence (old schema), as well
as CREATE privilege on the schema where the sequence will be moved (new
schema).
The following table lists the privileges that can be granted to users or roles on
sequences.
The only privilege that can be granted to a user or role is SELECT, which allows the
user to use CURRVAL() and NEXTVAL() on the sequence and to reference it in a table.
The user or role also needs USAGE privilege on the schema that contains the sequence.
Privilege Description
SELECT
Permission to use CURRVAL() and NEXTVAL() on the sequence and to reference it
in a table.
USAGE
Permissions on the schema that contains the sequence.
Note: Referencing a sequence in a CREATE TABLE statement requires SELECT
privilege on the sequence and USAGE privilege on the sequence's schema.
For details on granting and revoking sequence privileges, see GRANT (Sequence) and
REVOKE (Sequence) in the SQL Reference Manual.
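For example, assuming a sequence named my_seq exists in the public schema (an
illustrative name), the following statements let Bob use it:
=> GRANT USAGE ON SCHEMA public TO Bob;
GRANT PRIVILEGE
=> GRANT SELECT ON SEQUENCE public.my_seq TO Bob;
GRANT PRIVILEGE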
See Also
l Using Named Sequences
External Procedure Privileges
Only a superuser is allowed to create or drop an external procedure.
By default, users cannot execute external procedures. A superuser must grant users and
roles this right, using the GRANT (Procedure) EXECUTE statement. Additionally, users
must have USAGE privileges on the schema that contains the procedure in order to call
it.
Privilege Description
EXECUTE
Permission to run an external procedure.
USAGE
Permission on the schema that contains the procedure.
For details on granting and revoking external procedure privileges, see GRANT
(Procedure) and REVOKE (Procedure) in the SQL Reference Manual.
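For example, to let Bob run a hypothetical external procedure named helloplanet
that takes one VARCHAR argument:
=> GRANT EXECUTE ON PROCEDURE helloplanet(arg VARCHAR) TO Bob;
GRANT PRIVILEGE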
User-Defined Function Privileges
User-defined functions (described in CREATE FUNCTION Statements) can be created
by superusers or users with CREATE privileges on the schema that will contain the
function, as well as USAGE privileges on the base library (if applicable).
Users or roles other than the function owner can use a function only if they have been
granted EXECUTE privileges on it. They must also have USAGE privileges on the
schema that contains the function to be able to call it.
Privilege Description
EXECUTE
Permission to call a user-defined function.
USAGE
Permission on the schema that contains the function.
l DROP FUNCTION: Only a superuser or function owner can drop the function.
l ALTER FUNCTION RENAME TO: A superuser or function owner must have USAGE
and CREATE privileges on the schema that contains the function to be renamed.
l ALTER FUNCTION SET SCHEMA: A superuser or function owner must have
USAGE privilege on the schema that currently contains the function (old schema), as
well as CREATE privilege on the schema where the function will be moved (new
schema).
For details on granting and revoking user-defined function privileges, see the following
topics in the SQL Reference Manual:
l GRANT (User Defined Extension)
l REVOKE (User Defined Extension)
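For example, to let Bob call a hypothetical user-defined function named add2ints
that takes two integer arguments:
=> GRANT EXECUTE ON FUNCTION add2ints(INT, INT) TO Bob;
GRANT PRIVILEGE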
Library Privileges
Only a superuser can load an external library using the CREATE LIBRARY statement.
By default, only a superuser can create user-defined functions (UDFs) based on a
loaded library. A superuser can use the GRANT USAGE ON LIBRARY statement to
allow users to create UDFs based on classes in the library. The user must also have
CREATE privileges on the schema that will contain the UDF.
Privilege Description
USAGE
Permission to create UDFs based on classes in the library
Once created, only a superuser or the user who created a UDF can use it by default.
Either of them can grant other users or roles the ability to call the function using the
GRANT EXECUTE ON FUNCTION statement. See the GRANT (User Defined
Extension) and REVOKE (User Defined Extension) topics in the SQL Reference
Manual for more information on granting and revoking privileges on functions.
In addition to EXECUTE privilege, users/roles also require USAGE privilege on the
schema in which the function resides in order to execute the function.
For more information about libraries and UDFs, see Developing User-Defined
Extensions (UDxs) in Extending Vertica.
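For example, after loading a hypothetical library named MyFunctions, a superuser
might allow Bob to create UDFs based on its classes:
=> GRANT USAGE ON LIBRARY MyFunctions TO Bob;
GRANT PRIVILEGE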
Resource Pool Privileges
Only a superuser can create, alter, or drop a resource pool.
By default, users are granted USAGE rights to the GENERAL pool, from which their
queries and other statements allocate memory and get their priorities. A superuser must
grant users USAGE rights to any additional resource pools by using the GRANT
USAGE ON RESOURCE POOL statement. Once granted access to the resource pool,
users can use the SET SESSION RESOURCE_POOL statement and the RESOURCE POOL
clause of the ALTER USER statement to have their queries draw their resources
from the new pool.
Privilege Description
USAGE
Permission to use a resource pool.
SELECT
Permission to look up resource pool information/status in system tables.
UPDATE
Permission to adjust the tuning parameters of the pool.
For details on granting and revoking resource pool privileges, see GRANT (Resource
Pool) and REVOKE (Resource Pool) in the SQL Reference Manual.
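For example, assuming a superuser has created a resource pool named analytics_pool
(an illustrative name), the following statements grant Bob access to it and switch his
session to that pool:
=> GRANT USAGE ON RESOURCE POOL analytics_pool TO Bob;
GRANT PRIVILEGE
=> SET SESSION RESOURCE_POOL = analytics_pool;
SET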
Storage Location Privileges
Users and roles without superuser privileges can copy data to and from storage
locations as long as the following conditions are met, where a superuser:
1. Creates a special class of storage location (CREATE LOCATION) with the
USAGE argument set to 'USER', which indicates that the specified area is accessible
to non-superuser users.
2. Grants users or roles READ and/or WRITE access to the specified location using
the GRANT (Storage Location) statement.
Note: GRANT/REVOKE (Storage Location) statements are applicable only to
'USER' storage locations.
Once such storage locations exist and the appropriate privileges are granted, users and
roles granted READ privileges can copy data from files in the storage location into a
table. Those granted WRITE privileges can export data from a table to the storage
location on which they have been granted access. WRITE privileges also let users save
COPY statement exceptions and rejected data files from Vertica to the specified storage
location.
Only a superuser can add, alter, retire, drop, and restore a location, as well as set and
measure location performance. All non-dbadmin users or roles require READ and/or
WRITE permissions on the location.
Privilege Description
READ
Allows the user to copy data from files in the storage location into a table.
WRITE
Allows the user to copy data to the specific storage location. Users with
WRITE privileges can also save COPY statement exceptions and rejected
data files to the specified storage location.
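For example, a superuser might create a USER storage location and grant Bob READ
access to it; the path and node name below are illustrative:
=> CREATE LOCATION '/home/dbadmin/UserStorage' NODE 'v_vmart_node0001' USAGE 'USER';
CREATE LOCATION
=> GRANT READ ON LOCATION '/home/dbadmin/UserStorage' TO Bob;
GRANT PRIVILEGE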
See Also
l GRANT (Storage Location)
l Storage Management Functions
l CREATE LOCATION
Role, Profile, and User Privileges
Only a superuser can create, alter, or drop a:
l role
l profile
l user
By default, only a superuser can grant a role to, or revoke a role from, another user or
role. A user or role can be given the privilege to grant and revoke a role by using the
WITH ADMIN OPTION clause of the GRANT statement.
For details on granting and revoking role privileges, see GRANT (Role) and REVOKE
(Role) in the SQL Reference Manual.
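For example, the following statement grants Ted the logadmin role along with the
ability to grant it to others (the role and user names match examples used elsewhere
in this guide):
=> GRANT logadmin TO Ted WITH ADMIN OPTION;
GRANT ROLE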
See Also
l CREATE USER
l ALTER USER
l DROP USER
l CREATE PROFILE
l ALTER PROFILE
l DROP PROFILE
l CREATE ROLE
l ALTER ROLE RENAME
l DROP ROLE
Metadata Privileges
A superuser has unrestricted access to all database metadata. Other users have
significantly reduced access to metadata based on their privileges, as follows:
Type of Metadata: Catalog objects (tables, columns, constraints, sequences,
external procedures, projections, ROS containers, WOS)
User Access: Users must possess USAGE privilege on the schema and any type of
access (SELECT) or modify privilege on the object to see catalog metadata about the
object. See also Schema Privileges. For internal objects like projections and WOS
and ROS containers that don't have access privileges directly associated with them,
the user must possess the requisite privileges on the associated schema and table
objects instead. For example, to see whether a table has any data in the WOS, you
need to have USAGE on the table schema and at least SELECT on the table itself.
See also Table Privileges and Projection Privileges.

Type of Metadata: User sessions and functions, and system tables related to these
sessions
User Access: Users can only access information about their own, current sessions.
The following functions provide restricted functionality to users:
l CURRENT_DATABASE
l CURRENT_SCHEMA
l CURRENT_USER
l HAS_TABLE_PRIVILEGE
l SESSION_USER (same as CURRENT_USER)
The SESSIONS system table also provides restricted functionality to users.

Type of Metadata: Storage locations
User Access: Users require READ permissions to copy data from storage locations.
Only a superuser can add or retire storage locations.
I/O Privileges
Users need no special permissions to connect to and disconnect from a Vertica
database.
To EXPORT TO and COPY FROM Vertica, the user must have:
l SELECT privileges on the source table
l USAGE privilege on source table schema
l INSERT privileges for the destination table in target database
l USAGE privilege on destination table schema
To COPY FROM STDIN or use COPY LOCAL, a user must have INSERT privileges on
the table and USAGE privilege on its schema.
Note: Only a superuser can COPY from a file.
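For example, a user with INSERT on applog and USAGE on its schema could load a
client-side file; the file path and delimiter here are illustrative:
=> COPY applog FROM LOCAL '/tmp/applog.dat' DELIMITER '|';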
Comment Privileges
A comment lets you add, revise, or remove a textual message on a database object. You
must be an object owner or superuser to COMMENT ON one of the following
objects:
l COLUMN
l CONSTRAINT
l FUNCTION (including AGGREGATE and ANALYTIC)
l LIBRARY
l NODE
l PROJECTION
l SCHEMA
l SEQUENCE
l TABLE
l TRANSFORM FUNCTION
l VIEW
Other users must have VIEW privileges on an object to view its comments.
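For example, the owner of the applog table could attach a comment to it:
=> COMMENT ON TABLE applog IS 'Application log events';
COMMENT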
Transaction Privileges
No special permissions are required for the following database operations:
l COMMIT
l ROLLBACK
l RELEASE SAVEPOINT
l SAVEPOINT
Session Privileges
No special permissions are required for users to use the SHOW statement or any of the
SET statements.
Tuning Privileges
To PROFILE a single SQL statement, or to return a query plan's execution strategy
to standard output using the EXPLAIN command, users must have the same privileges
that are required for them to run the same query without the PROFILE or EXPLAIN
keyword.
Granting and Revoking Privileges
To grant or revoke a privilege using one of the SQL GRANT or REVOKE statements,
the user must have the following permissions for the GRANT/REVOKE statement to
succeed:
l Superuser or privilege WITH GRANT OPTION
l USAGE privilege on the schema
l Appropriate privileges on the object
The syntax for granting and revoking privileges is different for each database object,
such as schema, database, table, view, sequence, procedure, function, resource pool,
and so on.
Normally, a superuser first creates a user and then uses GRANT syntax to define the
user's privileges or roles or both. For example, the following series of statements
creates user Carol, grants Carol access to the PUBLIC schema and the apps database,
and lets Carol grant SELECT privileges on the applog table to other users:
=> CREATE USER Carol;
=> GRANT USAGE ON SCHEMA PUBLIC to Carol;
=> GRANT ALL ON DATABASE apps TO Carol;
=> GRANT SELECT ON applog TO Carol WITH GRANT OPTION;
See GRANT Statements and REVOKE Statements in the SQL Reference Manual.
About Superuser Privileges
A superuser (DBADMIN) is the automatically created database user who has the same
name as the Linux database administrator account and who can bypass all
GRANT/REVOKE authorization, as can any user that has been granted the
PSEUDOSUPERUSER role.
Note: A database superuser is not the same as a Linux superuser with root
privilege, and does not have Linux superuser privileges.
A superuser can grant privileges on all database object types to other users, as well as
grant privileges to roles. Users who have been granted the role will then gain the
privilege as soon as they enable it.
Superusers may grant or revoke any object privilege on behalf of the object owner,
which means a superuser can grant or revoke the object privilege if the object owner
could have granted or revoked the same object privilege. A superuser may revoke the
privilege that an object owner granted, as well as the reverse.
Since a superuser is acting on behalf of the object owner, the GRANTOR column of the
V_CATALOG.GRANTS table displays the object owner rather than the superuser who
issued the GRANT statement.
A superuser can also alter ownership of table and sequence objects.
See Also
DBADMIN Role
About Schema Owner Privileges
By default, the schema owner has privileges to create objects within a schema.
Additionally, the schema owner can drop any object in the schema, requiring no
additional privilege on the object.
The schema owner is typically the user who creates the schema.
Schema ownership alone does not give the owner access to objects in the schema.
Access to objects requires the appropriate privilege at the object level.
All other access to the schema and its objects must be explicitly granted to users or
roles by a superuser or schema owner to prevent unauthorized users from accessing the
schema and its objects.
See Schema Privileges
About Object Owner Privileges
The database, along with every object in it, has an owner. The object owner is usually
the person who created the object, although a superuser can alter ownership of objects,
such as table and sequence.
With the appropriate schema privileges, object owners can access, alter, rename, move,
or drop any object they own without needing additional object-level privileges.
An object owner can also:
l Grant privileges on their own object to other users
The WITH GRANT OPTION clause specifies that a user can grant the permission to
other users. For example, if user Bob creates a table, Bob can grant privileges on that
table to users Ted, Alice, and so on.
l Grant privileges to roles
Users who are granted the role gain the privilege.
How to Grant Privileges
As described in Granting and Revoking Privileges, specific users grant privileges using
the GRANT statement with or without the optional WITH GRANT OPTION, which allows
the user to grant the same privileges to other users.
l A superuser can grant privileges on all object types to other users.
l A superuser or object owner can grant privileges to roles. Users who have been
granted the role then gain the privilege.
l An object owner can grant privileges on the object to other users using the optional
WITH GRANT OPTION clause.
l The user needs USAGE privilege on the schema and appropriate privileges on
the object.
When a user grants an explicit list of privileges, such as GRANT INSERT, DELETE,
REFERENCES ON applog TO Bob:
l The GRANT statement succeeds only if all the privileges are granted successfully. If
any grant operation fails, the entire statement rolls back.
l Vertica will return ERROR if the user does not have grant options for the privileges
listed.
When a user grants ALL privileges, such as GRANT ALL ON applog TO Bob, the
statement always succeeds. Vertica grants all the privileges on which the grantor has
the WITH GRANT OPTION and skips those privileges without the optional WITH
GRANT OPTION.
For example, if the grantor holds only DELETE privileges WITH GRANT OPTION on the
applog table, a GRANT ALL grants only the DELETE privilege to Bob, and the statement
succeeds:
=> GRANT DELETE ON applog TO Bob WITH GRANT OPTION;
GRANT PRIVILEGE
For details, see the GRANT Statements in the SQL Reference Manual.
How to Revoke Privileges
In general, ONLY the user who originally granted a privilege can revoke it using a
REVOKE statement. That user must have superuser privilege or have the optional
WITH GRANT OPTION on the privilege. The user also must have USAGE privilege on
the schema and appropriate privileges on the object for the REVOKE statement to
succeed.
A privilege can be revoked only if this grantor previously granted that privilege to the
specified grantee. If so, the REVOKE statement removes the privilege (and the WITH
GRANT OPTION privilege, if supplied) from the grantee. Otherwise, Vertica prints a
NOTICE that the operation failed, as in the following example:
=> REVOKE SELECT ON applog FROM Bob;
NOTICE 0: Cannot revoke "SELECT" privilege(s) for relation "applog" that you did not grant to
"Bob"
REVOKE PRIVILEGE
In order to revoke grant option for a privilege, the grantor must have previously granted
the grant option for the privilege to the specified grantee. Otherwise, Vertica prints a
NOTICE.
The following REVOKE statement removes the GRANT option only but leaves the
privilege intact:
=> GRANT INSERT on applog TO Bob WITH GRANT OPTION;
GRANT PRIVILEGE
=> REVOKE GRANT OPTION FOR INSERT ON applog FROM Bob;
REVOKE PRIVILEGE
When a user revokes an explicit list of privileges, such as REVOKE INSERT, DELETE,
REFERENCES ON applog FROM Bob:
l The REVOKE statement succeeds only if all the privileges are revoked successfully.
If any revoke operation fails, the entire statement rolls back.
l Vertica returns ERROR if the user does not have grant options for the privileges
listed.
l Vertica returns NOTICE when revoking privileges that this user had not been
previously granted.
When a user revokes ALL privileges, such as REVOKE ALL ON applog FROM Bob, the
statement always succeeds. Vertica revokes all the privileges on which the grantor has
the optional WITH GRANT OPTION and skips those privileges without the WITH
GRANT OPTION.
For example, if the user Bob has DELETE privileges with the optional grant option on the
applog table, only the grant option is revoked from Bob, and the statement succeeds
without NOTICE:
=> REVOKE GRANT OPTION FOR DELETE ON applog FROM Bob;
For details, see the REVOKE Statements in the SQL Reference Manual.
Privilege Ownership Chains
The ability to revoke privileges on objects can cascade throughout an organization. If
the grant option was revoked from a user, the privilege that this user granted to other
users will also be revoked.
If a privilege was granted to a user or role by multiple grantors, then to completely
revoke the privilege from the grantee, each original grantor must revoke it. The only
exception is that a superuser may revoke privileges granted by an object owner, and
the reverse is true as well.
In the following example, the SELECT privilege on table t1 is granted through a chain of
users, from a superuser through User3.
l A superuser grants User1 CREATE privileges on the schema s1:
=> \c - dbadmin
You are now connected as user "dbadmin".
=> CREATE USER User1;
CREATE USER
=> CREATE USER User2;
CREATE USER
=> CREATE USER User3;
CREATE USER
=> CREATE SCHEMA s1;
CREATE SCHEMA
=> GRANT USAGE on SCHEMA s1 TO User1, User2, User3;
GRANT PRIVILEGE
=> CREATE ROLE reviewer;
CREATE ROLE
=> GRANT CREATE ON SCHEMA s1 TO User1;
GRANT PRIVILEGE
l User1 creates new table t1 within schema s1 and then grants SELECT WITH
GRANT OPTION privilege on s1.t1 to User2:
=> \c - User1
You are now connected as user "User1".
=> CREATE TABLE s1.t1(id int, sourceID VARCHAR(8));
CREATE TABLE
=> GRANT SELECT on s1.t1 to User2 WITH GRANT OPTION;
GRANT PRIVILEGE
l User2 grants SELECT WITH GRANT OPTION privilege on s1.t1 to User3:
=> \c - User2
You are now connected as user "User2".
=> GRANT SELECT on s1.t1 to User3 WITH GRANT OPTION;
GRANT PRIVILEGE
l User3 grants SELECT privilege on s1.t1 to the reviewer role:
=> \c - User3
You are now connected as user "User3".
=> GRANT SELECT on s1.t1 to reviewer;
GRANT PRIVILEGE
Users cannot revoke privileges upstream in the chain. For example, User2 did not grant
the CREATE privilege to User1, so when User2 runs the following REVOKE command,
Vertica rolls back the command:
=> \c - User2
You are now connected as user "User2".
=> REVOKE CREATE ON SCHEMA s1 FROM User1;
ROLLBACK 0: "CREATE" privilege(s) for schema "s1" could not be revoked from "User1"
Users can revoke privileges indirectly from users who received privileges through a
cascading chain, like the one shown in the example above. Here, users can use the
CASCADE option to revoke privileges from all users "downstream" in the chain. A
superuser or User1 can use the CASCADE option to revoke the SELECT privilege on
table s1.t1 from all users. For example, a superuser or User1 can execute the following
statement to revoke the SELECT privilege from all users and roles within the chain:
=> \c - User1
You are now connected as user "User1".
=> REVOKE SELECT ON s1.t1 FROM User2 CASCADE;
REVOKE PRIVILEGE
When a superuser or User1 executes the above statement, the SELECT privilege on
table s1.t1 is revoked from User2, User3, and the reviewer role. The GRANT privilege is
also revoked from User2 and User3, which a superuser can verify by querying the V_
CATALOG.GRANTS system table.
=> SELECT * FROM grants WHERE object_name = 's1' AND grantee ILIKE 'User%';
grantor | privileges_description | object_schema | object_name | grantee
---------+------------------------+---------------+-------------+---------
dbadmin | USAGE | | s1 | User1
dbadmin | USAGE | | s1 | User2
dbadmin | USAGE | | s1 | User3
(3 rows)
Modifying Privileges
A superuser or object owner can use one of the ALTER statements to modify a privilege,
such as changing a sequence owner or table owner. Reassignment to the new owner
does not transfer grants from the original owner to the new owner; grants made by the
original owner are dropped.
Changing Table Ownership
The ability to change table ownership is useful when moving a table from one schema
to another. Ownership reassignment is also useful when a table owner leaves the
company or changes job responsibilities. Because you can change the table owner,
tables do not have to be completely rewritten, and you avoid a loss in productivity.
The syntax is:
ALTER TABLE [[db-name.]schema.]table-name OWNER TO new-owner-name
In order to alter table ownership, you must be either the table owner or a superuser.
A change in table ownership transfers just the owner and not privileges; grants made by
the original owner are dropped and all existing privileges on the table are revoked from
the previous owner. However, altering the table owner transfers ownership of dependent
sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not
transfer ownership of other referenced sequences. See ALTER SEQUENCE for details
on transferring sequence ownership.
Notes
l Table privileges are separate from schema privileges; therefore, a table privilege
change or table owner change does not result in any schema privilege change.
l Because projections define the physical representation of the table, Vertica does not
require separate projection owners. The ability to create or drop projections is based
on the table privileges on which the projection is anchored.
l During the alter operation Vertica updates projections anchored on the table owned
by the old owner to reflect the new owner. For pre-join projection operations, Vertica
checks for privileges on the referenced table.
Example
In this example, user Bob connects to the database, looks up the tables, and transfers
ownership of table t33 from himself to user Alice.
=> \c - Bob
You are now connected as user "Bob".
=> \d
Schema | Name | Kind | Owner | Comment
--------+--------+-------+---------+---------
public | applog | table | dbadmin |
public | t33 | table | Bob |
(2 rows)
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
Notice that when Bob looks up database tables again, he no longer sees table t33.
=> \d
List of tables
Schema | Name | Kind | Owner | Comment
--------+--------+-------+---------+---------
public | applog | table | dbadmin |
(1 row)
When user Alice connects to the database and looks up tables, she sees she is the
owner of table t33.
=> \c - Alice
You are now connected as user "Alice".
=> \d
List of tables
 Schema | Name | Kind  | Owner | Comment
--------+------+-------+-------+---------
 public | t33  | table | Alice |
(1 row)
Either Alice or a superuser can transfer table ownership back to Bob. In the following
case a superuser performs the transfer.
=> \c - dbadmin
You are now connected as user "dbadmin".
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> \d
List of tables
Schema | Name | Kind | Owner | Comment
--------+----------+-------+---------+---------
public | applog | table | dbadmin |
public | comments | table | dbadmin |
public | t33 | table | Bob |
s1 | t1 | table | User1 |
(4 rows)
You can also query the V_CATALOG.TABLES system table to view table and owner
information. Note that a change in ownership does not change the table ID.
In the below series of commands, the superuser changes table ownership back to Alice
and queries the TABLES system table.
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273724576 | Alice
(5 rows)
Now the superuser changes table ownership back to Bob and queries the TABLES
table again. Nothing changes but the owner_name row, from Alice to Bob.
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273714428 | Bob
(5 rows)
Table Reassignment with Sequences
Altering the table owner transfers ownership of only the associated IDENTITY/AUTO-
INCREMENT sequences, not other referenced sequences. For example, in the following
series of commands, ownership of sequence s1 does not change:
=> CREATE USER u1;
CREATE USER
=> CREATE USER u2;
CREATE USER
=> CREATE SEQUENCE s1 MINVALUE 10 INCREMENT BY 2;
CREATE SEQUENCE
=> CREATE TABLE t1 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> CREATE TABLE t2 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
=> ALTER TABLE t1 OWNER TO u1;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
=> ALTER TABLE t2 OWNER TO u2;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
See Also
l Changing Sequence Ownership
Changing Sequence Ownership
The ALTER SEQUENCE command lets you change the attributes of an existing sequence.
All changes take effect immediately, within the same session. Any parameters not set
during an ALTER SEQUENCE statement retain their prior settings.
If you need to change sequence ownership, such as if an employee who owns a
sequence leaves the company, you can do so with the following ALTER SEQUENCE
syntax:
=> ALTER SEQUENCE sequence-name OWNER TO new-owner-name;
This operation immediately reassigns the sequence from the current owner to the
specified new owner.
Only the sequence owner or a superuser can change ownership, and reassignment
does not transfer grants from the original owner to the new owner; grants made by the
original owner are dropped.
Note: Changing a table's owner transfers ownership of dependent sequence objects
(associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer
ownership of other referenced sequences. See Changing Table Ownership.
Example
The following example reassigns sequence ownership from the current owner to user
Bob:
=> ALTER SEQUENCE sequential OWNER TO Bob;
See ALTER SEQUENCE in the SQL Reference Manual for details.
Viewing Privileges Granted on Objects
Vertica logs information about privileges granted on various objects, including the
grantor and grantee, in the V_CATALOG.GRANTS system table. The order of columns
in the table corresponds to the order in which they appear in the GRANT command. An
asterisk in the output means the privilege was granted WITH GRANT OPTION.
The following command queries the GRANTS system table:
=> SELECT * FROM grants ORDER BY grantor, grantee;
 grantor | privileges_description                          | object_schema | object_name | grantee
---------+-------------------------------------------------+---------------+-------------+-----------
 Bob     |                                                 |               | commentor   | Alice
 dbadmin | CREATE                                          |               | schema2     | Bob
 dbadmin |                                                 |               | commentor   | Bob
 dbadmin |                                                 |               | commentor   | Bob
 dbadmin |                                                 |               | logadmin    | Bob
 dbadmin | USAGE                                           |               | general     | Bob
 dbadmin | INSERT, UPDATE, DELETE, REFERENCES              | public        | applog      | Bob
 dbadmin |                                                 |               | logadmin    | Ted
 dbadmin | USAGE                                           |               | general     | Ted
 dbadmin | USAGE                                           |               | general     | Sue
 dbadmin | CREATE, CREATE TEMP                             |               | vmart       | Sue
 dbadmin | USAGE                                           |               | public      | Sue
 dbadmin | SELECT*                                         | public        | applog      | Sue
 dbadmin | USAGE                                           |               | general     | Alice
 dbadmin | INSERT, SELECT                                  | public        | comments    | commentor
 dbadmin | INSERT, SELECT                                  | public        | applog      | commentor
 dbadmin |                                                 |               | logwriter   | logadmin
 dbadmin |                                                 |               | logreader   | logadmin
 dbadmin | DELETE                                          | public        | applog      | logadmin
 dbadmin | SELECT                                          | public        | applog      | logreader
 dbadmin | INSERT                                          | public        | applog      | logwriter
 dbadmin | USAGE                                           |               | v_internal  | public
 dbadmin | CREATE TEMP                                     |               | vmart       | public
 dbadmin | USAGE                                           |               | public      | public
 dbadmin | USAGE                                           |               | v_catalog   | public
 dbadmin | USAGE                                           |               | v_monitor   | public
 dbadmin | CREATE*, CREATE TEMP*                           |               | vmart       | dbadmin
 dbadmin | USAGE*, CREATE*                                 |               | schema2     | dbadmin
 dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public        | comments    | dbadmin
 dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public        | applog      | dbadmin
(30 rows)
To quickly find all of the privileges that have been granted to all users on the schema
named myschema, run the following statement:
=> SELECT grantee, privileges_description FROM GRANTS WHERE object_name='myschema';
grantee | privileges_description
---------+------------------------
Bob | USAGE, CREATE
Alice | CREATE
(2 rows)
Note that the vsql commands \dp and \z both return information similar to GRANTS:
=> \dp
                 Access privileges for database "apps"
 Grantee   | Grantor | Privileges                                      | Schema | Name
-----------+---------+-------------------------------------------------+--------+------------
public | dbadmin | USAGE | | v_internal
public | dbadmin | USAGE | | v_catalog
public | dbadmin | USAGE | | v_monitor
logadmin | dbadmin | | | logreader
logadmin | dbadmin | | | logwriter
Fred | dbadmin | USAGE | | general
Fred | dbadmin | | | logadmin
Bob | dbadmin | USAGE | | general
dbadmin | dbadmin | USAGE*, CREATE* | | schema2
Bob | dbadmin | CREATE | | schema2
Sue | dbadmin | USAGE | | general
public | dbadmin | USAGE | | public
Sue | dbadmin | USAGE | | public
public | dbadmin | CREATE TEMP | | appdat
dbadmin | dbadmin | CREATE*, CREATE TEMP* | | appdat
Sue | dbadmin | CREATE, CREATE TEMP | | appdat
dbadmin | dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public | applog
logreader | dbadmin | SELECT | public | applog
logwriter | dbadmin | INSERT | public | applog
logadmin | dbadmin | DELETE | public | applog
Sue | dbadmin | SELECT* | public | applog
(22 rows)
See GRANT Statements in the SQL Reference Manual.
Access Policies
You can create the following access policy types to restrict access to sensitive
information to only those users authorized to view it:
l Column Access Policy
l Row Access Policy
Important: If you have a table with both a row level access policy and a column
level access policy, Vertica filters the row level access policy first. Then Vertica uses
the column level access policy to filter the columns.
Use Cases
Column Access Policy Use Case
Base a column access policy on a user's role and the privileges granted to that role.
For example, in a healthcare organization, customer support representatives and
account managers have access to the same customer table. The table contains the
column SSN, which stores customer Social Security numbers. Customer support
representatives have only partial access: they can view just the last four digits.
Account managers, however, must be able to view entire Social Security numbers, so
the manager role has privileges to view all nine digits.
When creating a column access policy, use expressions to specify exactly what different
users or roles can access within the column.
In this case, a manager can access the entire SSN column, while customer support
representatives can only access the last four digits:
=> CREATE ACCESS POLICY ON schema.customers_table
FOR COLUMN SSN
CASE
WHEN ENABLED_ROLE('manager') THEN SSN
else substr(SSN, 8, 4)
END
ENABLE;
Row Access Policy Use Case
You can also create a row access policy on the same table. For example, you can
modify access to a customer table so a manager can view data in all rows. However, a
broker can see a row only if the customer is associated with that broker:
=> select * from customers_table;
custID | password | ssn
-------+----------+---------
1 | secret | 12345678901
2 | secret | 12345678902
3 | secret | 12345678903
(3 rows)
Each customer in the customers_table has an assigned broker:
=> select * from broker_info;
broker | custID
--------+---------
u1 | 1
u2 | 2
u3     | 3
(3 rows)
Create the access policy to allow a manager to see all data in all rows. Limit a broker's
view to only those customers to which the broker is assigned:
=> CREATE ACCESS POLICY ON schema.customers_table
FOR rows
WHERE
ENABLED_ROLE('manager')
or
(ENABLED_ROLE('broker') AND customers_table.custID in (SELECT broker_info.custID
 FROM broker_info WHERE broker = CURRENT_USER()))
ENABLE;
Access Policy Creation Workflow
You can create access policies for any table type: columnar, external, or flex. You can
also create access policies on any column type, including joins.
If no users or roles are already created, you must create them before creating an access
policy:
l Create a User
l Create a Role
l GRANT (Schema)
l GRANT (Table)
l Grant a user access to the role
l The user enables the role with the SET ROLE statement (unless the administration
user assigned a default role to the user)
l Create the access policy with the CREATE ACCESS POLICY statement.
Working With Access Policies
This section describes areas that may affect how you use access policies.
Performing Operations
Having row and column access policies enabled on a table may affect the behavior
when you attempt to perform the following DML operations:
l Insert
l Update
l Delete
l Merge
l Copy
Row Level Access Behavior
On tables where a row access policy is enabled, you can only perform DML operations
when the condition in the Row access policy evaluates to TRUE. For example:
Table1 appears as follows:
A | B
---+----
1 | 1
2 | 2
3 | 3
Create the following row access policy on Table1:
=> CREATE ACCESS POLICY on table1 for ROWS
WHERE enabled_role('manager')
OR
A<2
ENABLE;
With this policy enabled, the following behavior exists for users who want to perform
DML operations:
l A user with the manager role can perform DML on all rows in the table, because the
WHERE clause in the policy evaluates to TRUE.
l Users with non-manager roles can only perform a SELECT that returns rows in which
the value of column A is less than two. If the access policy has to read the data in the
table to confirm a condition, it does not allow DML operations.
Column Level Access Behavior
On tables where a column access policy is enabled, you can perform DML operations if
you can view the entire column. For example:
Table1 appears as follows:
A | B
---+----
1 | 1
2 | 2
3 | 3
Create the following column access policy on Table1:
=> CREATE ACCESS POLICY on Table1 FOR column A NULL::int enable;
In this case, users cannot perform DML operations on column A.
Important: Users who can access all the rows and columns in a table with an
access policy enabled can perform DML operations. Therefore, when you create an
access policy, construct it so that all row and column data is accessible to at least
one user. This allows at least one user to perform any DML that may be required.
Otherwise, you can temporarily disable the access policy to perform DML.
Schema and Table Privileges
Only dbadmin users can create access policies. If you want a user to be able to use
access policies, you must first assign that user the appropriate privileges.
l Grant schema or table privileges to a table non-owner to allow that user to use the
access policy.
l Revoke schema or table privileges to prohibit the user from using the access policy.
This example shows how you can create an access policy for a table in the public
schema without the user being granted privileges on the public schema:
=> CREATE ACCESS POLICY ON public.customers_table
FOR COLUMN SSN
CASE WHEN ENABLED_ROLE('operator') THEN SUBSTR(SSN, 8, 4) END
ENABLE;
Enable and Disable Access Policy Creation
Access policies are enabled by default for all tables in the database. To disable and
enable the creation of new access policies at the database level, use the ALTER
DATABASE statement.
Disable Creation of New Access Policies
=> ALTER DATABASE dbname SET EnableAccessPolicy=0;
Enable Creation of New Access Policies
=> ALTER DATABASE dbname SET EnableAccessPolicy=1;
Limitations on Creating Access Policies with Projections
You can create access policies on columns in tables that are part of a projection.
However, you cannot create an access policy on an input table for the following
projections:
l Top-K projections
l Aggregate projections
l Projections with expressions
l Pre-join projections
Sometimes, a table already has an access policy and is part of a projection. In such
cases, if the Vertica optimizer cannot fold (or compress) the query, Vertica blocks
the query.
Query Optimization Considerations
When using access policies be aware of the following potential behaviors, and design
tables optimally.
Design Tables That All Authorized Users Can Access
When Database Designer creates projections for a given table, it takes into account the
access policies that apply to the current user. The set of projections that Database
Designer produces for the table are optimized for that user's access privileges, and
other users with similar access privileges. However, these projections might be less
than optimal for users with different access privileges. These differences might have
some effect on how efficiently Vertica processes queries from those users. Therefore,
when you evaluate projection designs for that table using Database Designer, design a
given table so that all authorized users have optimal access.
Avoid Performance Issues Caused by Dynamic Rewrite
To enforce row-level access policies, the system dynamically rewrites user queries.
Therefore, query performance may be affected by how row-level access policies are
written.
For example, referring to the preceding access policy use cases, enable both the row
and column access policies on the customers_table and run the following query:
=> SELECT * from customers_table;
Vertica rewrites this query plan to:
=> SELECT * FROM (SELECT custID, password,
          CASE WHEN enabled_role('manager') THEN SSN
               ELSE substr(SSN, 8, 4) END AS SSN
   FROM customers_table
   WHERE enabled_role('broker')
     AND customers_table.custID IN
         (SELECT broker_info.custID FROM broker_info WHERE broker = current_user())
   ) customers_table;
Column Access Policy
Use the CREATE ACCESS POLICY statement to create a column access policy for a
specific column or columns in a table. The behavior of an access policy depends on the
expressions specified when creating the policy, and also on the following:
l Viewing a User's Role
l Granting Privileges to Roles
Example
Run the following SQL command:
=> SELECT * FROM Table1;
Table1 appears as follows:
A | B
--+----------
1 | one
2 | two
3 | three
4 | four
Create the following column access policy:
=> CREATE ACCESS POLICY on Table1 FOR column A NULL::int enable;
Re-run the SQL command:
=> SELECT * FROM Table1;
The following is returned:
A | B
--+----------
| one
| two
| three
| four
Note that no values appear in column A because the access policy prevents the return
of this data (NULL::int).
Creating Column Access Policies
Creating a column access policy allows different users to run the same query and
receive different results. For example, you can create an access policy authorizing
access to a column of bank account numbers. You can specify that a user with the role
employee cannot access this information. However, you do give access to a user with a
manager role.
Conditions specified in the access policy determine whether the user can see data
restricted by the policy. This example shows how you can specify that the manager role
can view the entire Social Security number while the operator role can only view the
last four digits. The first five digits are masked for the operator role (THEN
SUBSTR(SSN, 8, 4)). The 8 indicates that the operator sees data starting at the eighth
character of an SSN formatted such as 123-45-6789.
=> CREATE ACCESS POLICY ON customers_table
FOR COLUMN SSN
CASE
WHEN ENABLED_ROLE('manager') THEN SSN
WHEN ENABLED_ROLE('operator') THEN SUBSTR(SSN, 8, 4)
ELSE NULL
END
ENABLE;
Access Policy Limitations
When you use column access policies, be aware of the following limitations:
l When using an access policy, you cannot use any of the following in an expression:
n Aggregate functions
n Subqueries
n Analytic functions
n UDTs
l If the query cannot be folded by the Vertica optimizer, all operations other than
SELECT are blocked. The following error message appears:
ERROR 0: Unable to INSERT: "Access denied due to active access policy on table <tablename> for
column <columnname>
Note: Folding a query refers to the act of replacing deterministic expressions
involving only constants, with their computed values.
l You cannot create a column access policy on temporary tables.
l Avoid using a column access policy on a flex table. If you create a column access
policy on a flex table, the following warning appears:
WARNING 0: Column Access Policies on flex tables may not be completely secure
Examples
The following examples show how to create a column access policy for various
situations.
Create Access Policy in Public Schema for Column in Customer Table
=> CREATE ACCESS POLICY on public.customer FOR COLUMN cid length('xxxxx') enable;
Use Expression to Further Specify Data Access and Restrictions
In this example, a user with a supervisor role can see data from the deal_size column in
the vendor_dimension table. However, a user assigned an employee role cannot.
=> CREATE ACCESS POLICY ON vendor_dimension FOR COLUMN deal_size
CASE
WHEN ENABLED_ROLE('supervisor') THEN deal_size
WHEN ENABLED_ROLE('employee') THEN NULL
END
ENABLE;
Substitute Specific Data for Actual Data in Column
In this example, the value 1000 appears rather than the actual column data:
=> CREATE ACCESS POLICY on public.customer FOR COLUMN cid 1000 enable;
=> SELECT * FROM customer;
 cid  | dist_code
------+-----------
1000 | 2
1000 | 10
(2 rows)
See Also
l CREATE ACCESS POLICY
l ALTER ACCESS POLICY
l DROP ACCESS POLICY
Enable or Disable Column Access Policy
If you have dbadmin privileges, you can enable and disable an individual access policy
in a table, as the following examples show.
Enable Column Access Policy
=> ALTER ACCESS POLICY on customer FOR column customer_key enable;
Disable Column Access Policy
=> ALTER ACCESS POLICY on customer FOR column customer_key disable;
Row Access Policy
Use the CREATE ACCESS POLICY statement to create a row access policy for the
rows of a table. You must use a WHERE clause to set the access policy's
condition.
Example
Run the following SQL statement:
=> SELECT * FROM customers_table;
The customers_table appears as follows:
custID | password | SSN
--------+------------+--------------
1 | secret | 123456789
2 | secret | 123456780
3 | secret | 123456781
(3 rows)
Run the following SQL statement:
=> SELECT * FROM broker_info;
The broker_info table shows that each customer has an assigned broker:
broker | custID
--------+---------
user1 | 1
user2 | 2
user3 | 3
(3 rows)
Create the following access policy that only allows brokers to see customers to which
they are associated:
=> CREATE ACCESS POLICY on customers_table for rows
WHERE
ENABLED_ROLE('manager')
or
(ENABLED_ROLE('broker') AND customers_table.custID in (SELECT broker_info.custID
 FROM broker_info WHERE broker = CURRENT_USER()))
ENABLE;
As user1, run the following SQL command:
user1=> SELECT * FROM customers_table;
The following is returned because user1 is associated with custID 1:
custID | password | SSN
--------+------------+--------------
1 | secret | 123456789
(1 row)
Creating Row Access Policies
Creating a row access policy determines what rows a user can access during a query.
Row access policies include a WHERE clause that prompts the query to return only
those rows where the condition is true. For example, a user with a BROKER role should
only be able to access customer information for which the user is a broker. You can
write a predicate for this situation as follows:
WHERE ENABLED_ROLE('broker') AND customers_table.custID in
(SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER())
You can use a row access policy to enforce this restriction. The following example
shows how you can create a row access policy. This policy limits a user with a broker
role to access information for customers whose custID in the customers_table matches
the custID in the broker_info table.
=> CREATE ACCESS POLICY on customers_table
for rows
WHERE
ENABLED_ROLE('broker')
AND
customers_table.custID in (SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER())
enable;
Row Access Policy Limitations
Be aware of the following limitations when using row access policies:
l You can only have one row access policy per table. If you need to enforce additional
conditions later, combine them in a single WHERE predicate and use ALTER
ACCESS POLICY to enable the new condition, as shown in the sketch after this list.
l You cannot use row access policies on:
n Tables with pre-join projections
n Tables with aggregate projections
n Temporary tables
n System tables
If you attempt to create a row access policy on a system table, the following
message appears:
ROLLBACK 0: Access policy cannot be created on system table <system table name>
n Views
l When a row access policy exists on a table, you cannot create directed queries on
that table.
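For example, to enforce both the manager and broker conditions from the earlier example
in the single permitted policy, you might replace the predicate in one ALTER ACCESS
POLICY statement. This is a sketch only; it reuses the customers_table and broker_info
example tables, and the syntax mirrors the CREATE ACCESS POLICY examples above:
=> ALTER ACCESS POLICY on customers_table FOR ROWS
WHERE
ENABLED_ROLE('manager')
or
(ENABLED_ROLE('broker') AND customers_table.custID in (SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER()))
ENABLE;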
Examples
The following example shows how you can create a row access policy:
Create Access Policy for Specific Rows in the Customer Table
=> CREATE ACCESS POLICY on customer FOR ROWS where cust_id > 3 enable;
See Also
l CREATE ACCESS POLICY
l ALTER ACCESS POLICY
l DROP ACCESS POLICY
Enable or Disable Row Access Policy
If you have dbadmin privileges, you can enable and disable individual row access
policies in a table, as the following examples show:
Enable Row Access Policy
=> ALTER ACCESS POLICY on customer FOR rows enable;
Disable Row Access Policy
=> ALTER ACCESS POLICY on customer FOR rows disable;
About Database Roles
To make managing permissions easier, use roles. A role is a collection of privileges that
a superuser can grant to (or revoke from) one or more users or other roles. Using roles
avoids having to manually grant sets of privileges user by user. For example, several
users might be assigned to the administrator role. You can grant or revoke privileges to
or from the administrator role, and all users with access to that role are affected by the
change.
Note: Users must first enable a role before they gain all of the privileges that have
been granted to it. See Enabling Roles.
Role Hierarchies
You can also use roles to build hierarchies of roles; for example, you can create an
administrator role that has all the privileges granted to non-administrator roles, as well
as the privileges granted directly to the administrator role. See also Role Hierarchy.
Roles do not supersede manually granted privileges, so privileges directly assigned to a
user are not altered by roles. Roles just give additional privileges to the user.
Creating and Using a Role
Using a role follows this general flow:
1. A superuser creates a role using the CREATE ROLE statement.
2. A superuser or object owner grants privileges to the role using one of the GRANT
statements.
3. A superuser or users with administrator access to the role grant users and other
roles access to the role.
4. Users granted access to the role use the SET ROLE command to enable that role
and gain the role's privileges.
You can do steps 2 and 3 in any order. However, granting access to a role means little
until the role has privileges granted to it.
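For example, the complete flow might look like the following sketch, which assumes a
hypothetical table sales and user Alice:
=> CREATE ROLE sales_reader;              -- step 1: create the role
CREATE ROLE
=> GRANT SELECT ON sales TO sales_reader; -- step 2: grant privileges to the role
GRANT PRIVILEGE
=> GRANT sales_reader TO Alice;           -- step 3: grant the role to a user
GRANT ROLE
=> \c - Alice
You are now connected as user "Alice".
=> SET ROLE sales_reader;                 -- step 4: enable the role
SET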
Tip: You can query the V_CATALOG system tables ROLES, GRANTS, and USERS
to see any directly-assigned roles; however, these tables do not indicate whether a
role is available to a user when roles could be available through other roles
(indirectly). See the HAS_ROLE() function for additional information.
Roles on Management Console
When users sign in to the Management Console (MC), what they can view or do is
governed by MC roles. For details, see About MC Users and About MC Privileges and
Roles.
Types of Database Roles
Vertica has the following pre-defined roles:
l PUBLIC
l PSEUDOSUPERUSER
l DBADMIN
l DBDUSER
l SYSMONITOR
Predefined roles cannot be dropped or renamed. You cannot grant other roles to (or
revoke them from) predefined roles, except for PUBLIC; however, you can grant
predefined roles to other roles, to users, or to both.
Individual privileges may be granted to/revoked from predefined roles. See the SQL
Reference Manual for all of the GRANT and REVOKE statements.
DBADMIN Role
Every database has the special DBADMIN role. A superuser (or someone with the
PSEUDOSUPERUSER Role) can grant this role to or revoke this role from any user or
role.
Users who enable the DBADMIN role gain these privileges:
l Create or drop users
l Create or drop schemas
l Create or drop roles
l Grant roles to other users
l View all system tables
l View and terminate user sessions
l Access to all data created by any user
The DBADMIN role does NOT allow users to:
l Start and stop a database
l Change DBADMIN privileges
l Set configuration parameters
Note: A user with the DBADMIN role must have the ADMIN OPTION to grant the
DBADMIN role to another user. A DBADMIN user cannot grant the
PSEUDOSUPERUSER role to anyone. For more information, see GRANT (Role).
You can assign additional privileges to the DBADMIN role, but you cannot assign any
additional roles; for example, the following is not allowed:
=> CREATE ROLE appviewer;
CREATE ROLE
=> GRANT appviewer TO dbadmin;
ROLLBACK 2347: Cannot alter predefined role "dbadmin"
You can, however, grant the DBADMIN role to other roles to augment a set of privileges.
See Role Hierarchy for more information.
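For example, the following sketch augments a hypothetical appadmin role with DBADMIN
privileges:
=> CREATE ROLE appadmin;
CREATE ROLE
=> GRANT dbadmin TO appadmin;
GRANT ROLE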
View a List of Database Superusers
To see who is a superuser, run the vsql \du meta-command. In this example, only
dbadmin is a superuser.
=> \du
List of users
User name | Is Superuser
-----------+--------------
dbadmin | t
Fred | f
Bob | f
Sue | f
Alice | f
User1 | f
User2 | f
User3 | f
u1 | f
u2 | f
(10 rows)
See Also
Database Administration User
DBDUSER Role
The special DBDUSER role must be explicitly granted by a superuser and is a
predefined role. The DBDUSER role allows non-DBADMIN users to access Database
Designer using command-line functions. Users with the DBDUSER role cannot access
Database Designer using the Administration Tools. Only DBADMIN users can run
Administration Tools.
You cannot assign any additional privileges to the DBDUSER role, but you can grant
the DBDUSER role to other roles to augment a set of privileges.
Once you have been granted the DBDUSER role, you must enable it before you can run
Database Designer using command-line functions. For more information, see About
Running Database Designer Programmatically.
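For example, a minimal sketch, where dbd_user is a hypothetical user name:
=> GRANT DBDUSER TO dbd_user;
GRANT ROLE
=> \c - dbd_user
You are now connected as user "dbd_user".
=> SET ROLE DBDUSER;
SET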
Important: When you create a DBADMIN user or grant the DBDUSER role, make
sure to associate a resource pool with that user to manage resources during
Database Designer runs. Multiple users can run Database Designer concurrently
without interfering with each other or using up all the cluster resources. When a user
runs Database Designer, either using the Administration Tools or programmatically,
its execution is mostly contained by the user's resource pool, but may spill over into
some system resource pools for less-intensive tasks.
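The following sketch shows one way to make such an association; the pool name
design_pool and user dbd_user are hypothetical, and CREATE RESOURCE POOL
accepts tuning parameters not shown here:
=> CREATE RESOURCE POOL design_pool;
CREATE RESOURCE POOL
=> ALTER USER dbd_user RESOURCE POOL design_pool;
ALTER USER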
PSEUDOSUPERUSER Role
The special PSEUDOSUPERUSER role is automatically created in each database. A
superuser (or someone with the PSEUDOSUPERUSER role) can perform grant and
revoke on this role. The PSEUDOSUPERUSER cannot revoke or change any
superuser privileges.
Users with the PSEUDOSUPERUSER role are entitled to complete administrative
privileges, including the ability to:
l Create schemas
l Create and grant privileges to roles
l Bypass all GRANT/REVOKE authorization
l Set user account's passwords
l Lock and unlock user accounts
l Create or drop a UDF library
l Create or drop a UDF function
l Create or drop an external procedure
l Add or edit comments on nodes
l Create or drop password profiles
You cannot revoke any of these privileges from a PSEUDOSUPERUSER.
You can assign additional privileges to the PSEUDOSUPERUSER role, but you cannot
assign any additional roles; for example, the following is not allowed:
=> CREATE ROLE appviewer;
CREATE ROLE
=> GRANT appviewer TO pseudosuperuser;
ROLLBACK 2347: Cannot alter predefined role "pseudosuperuser"
PUBLIC Role
By default, every database has the special PUBLIC role. Vertica grants this role to each
user automatically, and it is automatically enabled. You grant privileges to this role that
every user should have by default. You can also grant access to roles to PUBLIC, which
allows any user to access the role using the SET ROLE statement.
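For example, the following sketch grants a hypothetical appviewer role to PUBLIC so
that any user can enable it:
=> GRANT appviewer TO PUBLIC;
GRANT ROLE
=> SET ROLE appviewer;
SET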
Note: The PUBLIC role can never be dropped, nor can it be revoked from users or
roles.
You cannot grant privileges to the PUBLIC role using WITH GRANT OPTION:
=> CREATE TABLE t1(a int);
CREATE TABLE
=> GRANT SELECT on t1 to PUBLIC with grant option;
ROLLBACK 3484: Grant option for a privilege cannot be granted to
"public"
For more information see How to Grant Privileges.
Example
In the following example, if the superuser had not granted INSERT privileges on the table
publicdata to the PUBLIC role, the INSERT statement executed by user bob would
fail:
=> CREATE TABLE publicdata (a INT, b VARCHAR);
CREATE TABLE
=> GRANT INSERT, SELECT ON publicdata TO PUBLIC;
GRANT PRIVILEGE
=> CREATE PROJECTION publicdataproj AS (SELECT * FROM publicdata);
CREATE PROJECTION
dbadmin=> \c - bob
You are now connected as user "bob".
=> INSERT INTO publicdata VALUES (10, 'Hello World');
OUTPUT
--------
1
(1 row)
See Also
PUBLIC User
SYSMONITOR Role
An organization's database administrator may have many responsibilities outside of
maintaining Vertica as a DBADMIN user. In this case, as the DBADMIN you may want
to delegate some Vertica administrative tasks to another Vertica user.
The DBADMIN can assign a delegate the SYSMONITOR role to grant access to
specific monitoring utilities without granting full DBADMIN access. Granting this role
allows the DBADMIN user to delegate administrative tasks without compromising
security or exposing sensitive information.
Grant a SYSMONITOR Role
To grant a user or role the SYSMONITOR role, you must be one of the following:
l a DBADMIN user
l a user assigned the SYSMONITOR role with the ADMIN OPTION
Use the GRANT (Role) SQL statement to assign a user the SYSMONITOR role. This
example shows how to grant the SYSMONITOR role to user1 and includes
administration privileges by using the WITH ADMIN OPTION parameter. The
ADMIN OPTION grants the SYSMONITOR role administrative privileges.
=> GRANT SYSMONITOR TO user1 WITH ADMIN OPTION;
This example shows how to revoke the ADMIN OPTION from the SYSMONITOR role
for user1:
=> REVOKE ADMIN OPTION for SYSMONITOR FROM user1;
Use CASCADE to revoke ADMIN OPTION privileges for all users assigned the
SYSMONITOR role:
=> REVOKE ADMIN OPTION for SYSMONITOR FROM PUBLIC CASCADE;
Example
This example shows how to:
l Create a user
l Create a role
l Grant SYSMONITOR privileges to the new role
l Grant the role to the user
=> CREATE USER user1;
=> CREATE ROLE monitor;
=> GRANT SYSMONITOR to monitor;
=> GRANT monitor to user1;
Assign SYSMONITOR Privileges
This example uses the user and role created in the Grant SYSMONITOR Role example
and shows how to:
l Create a table called personal_data
l Log in as user1
l Enable the monitor role as user1. (You already granted user1 the monitor role, and
granted the monitor role SYSMONITOR privileges, in the Grant a SYSMONITOR Role example.)
l Run a SELECT statement as user1
The results of the operations are based on the privilege already granted to user1.
=> CREATE TABLE personal_data (SSN varchar (256));
=> \c - user1
user1=> SET ROLE monitor;
user1=> SELECT COUNT(*) FROM TABLES;
COUNT
-------
1
(1 row)
Because you assigned the SYSMONITOR role, user1 can see the number of rows in the
Tables system table. In this simple example, there is only one table (personal_data) in
the database so the SELECT COUNT returns one row. In actual conditions, the
SYSMONITOR role would see all the tables in the database.
Check if a Table is Accessible by SYSMONITOR
Use the following command to check if a system table can be accessed by a user
assigned the SYSMONITOR role:
=> select table_name, is_monitorable from system_tables where table_name='<table_name>';
Example
This example checks whether the current_session system table is accessible by the
SYSMONITOR:
=> select table_name, is_monitorable from system_tables where table_name='current_session';
table_name | is_monitorable
-----------------+----------------
current_session | t
The t in the is_monitorable column indicates the current_session system table is
accessible by the SYSMONITOR.
Default Roles for Database Users
By default, no roles (other than the default PUBLIC Role) are enabled at the start of a
user session.
=> SHOW ENABLED_ROLES;
name | setting
---------------+---------
enabled roles |
(1 row)
A superuser can set one or more default roles for a user, which are automatically
enabled at the start of the user's session. Setting a default role is a good idea if users
normally rely on the privileges granted by one or more roles to carry out the majority of
their tasks. To set a default role, use the DEFAULT ROLE parameter of the ALTER
USER statement as superuser:
=> \c vmart apps
You are now connected to database "apps" as user "dbadmin".
=> ALTER USER Bob DEFAULT ROLE logadmin;
ALTER USER
=> \c - Bob
You are now connected as user "Bob"
=> SHOW ENABLED_ROLES;
name | setting
---------------+----------
enabled roles | logadmin
(1 row)
Notes
l Only roles that the user already has access to can be made default.
l Unlike granting a role, setting a default role or roles overwrites any previously-set
defaults.
l To clear any default roles for a user, use the keyword NONE as the role name in the
DEFAULT ROLE argument, as shown in the sketch after this list.
l Default roles only take effect at the start of a user session. They do not affect the roles
enabled in the user's current session.
l Avoid giving users default roles that have administrative or destructive privileges (the
PSEUDOSUPERUSER role or DROP privileges, for example). By forcing users to
explicitly enable these privileges, you can help prevent accidental data loss.
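For example, a minimal sketch that clears the default roles for the user Bob shown above:
=> ALTER USER Bob DEFAULT ROLE NONE;
ALTER USER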
Using Database Roles
There are several steps to using roles:
1. A superuser creates a role using the CREATE ROLE statement.
2. A superuser or object owner grants privileges to the role.
3. A superuser or users with administrator access to the role grant users and other
roles access to the role.
4. Users granted access to the role run the SET ROLE command to make that role
active and gain the role's privileges.
You can do steps 2 and 3 in any order. However, granting access to a role means little
until the role has privileges granted to it.
Tip: Query system tables ROLES, GRANTS, and USERS to see any directly-
assigned roles. Because these tables do not indicate whether a role is available to a
user when roles could be available through other roles (indirectly), see the HAS_
ROLE() function for additional information.
See Also
l About MC Privileges and Roles
Role Hierarchy
In addition to granting roles to users, you can also grant roles to other roles. This lets
you build hierarchies of roles, with more privileged roles (an administrator, for example)
being assigned all of the privileges of lesser-privileged roles (a user of a particular
application), in addition to the privileges you assign to it directly. By organizing your
roles this way, any privilege you add to the application role (reading or writing to a new
table, for example) is automatically made available to the more-privileged administrator
role.
Example
The following example creates two roles, assigns them privileges, then assigns them to
a new administrative role.
1. Create new table applog:
=> CREATE TABLE applog (id int, sourceID VARCHAR(32), data TIMESTAMP, event VARCHAR(256));
2. Create a new role called logreader:
=> CREATE ROLE logreader;
3. Grant the logreader role read-only access on the applog table:
=> GRANT SELECT ON applog TO logreader;
4. Create a new role called logwriter:
=> CREATE ROLE logwriter;
5. Grant the logwriter write access on the applog table:
=> GRANT INSERT ON applog to logwriter;
6. Create a new role called logadmin, which will include the privileges of the other two roles:
=> CREATE ROLE logadmin;
7. Grant the logadmin role privileges to delete data:
=> GRANT DELETE ON applog to logadmin;
8. Grant the logadmin role privileges to have the same privileges as the logreader and
logwriter roles:
=> GRANT logreader, logwriter TO logadmin;
9. Create new user Bob:
=> CREATE USER Bob;
10. Give Bob logadmin privileges:
=> GRANT logadmin TO Bob;
The user Bob can now enable the logadmin role, which also includes the logreader and
logwriter roles. Note that Bob cannot enable either the logreader or logwriter role
directly. A user can only enable explicitly-granted roles.
Hierarchical roles also work with administrative access to a role:
=> GRANT logreader, logwriter TO logadmin WITH ADMIN OPTION;
GRANT ROLE
=> GRANT logadmin TO Bob;
=> \c - bob; -- connect as Bob
You are now connected as user "Bob".
=> SET ROLE logadmin; -- Enable logadmin role
SET
=> GRANT logreader TO Alice;
GRANT ROLE
Note that the user Bob only has administrative access to the logreader and logwriter
roles through the logadmin role. He doesn't have administrative access to the logadmin
role, since it wasn't granted to him with the optional WITH ADMIN OPTION argument:
=> GRANT logadmin TO Alice;
WARNING: Some roles were not granted
GRANT ROLE
For Bob to be able to grant the logadmin role, a superuser would have had to explicitly
grant him administrative access.
See Also
l About MC Privileges and Roles
Creating Database Roles
A superuser creates a new role using the CREATE ROLE statement. Only a superuser
can create or drop roles.
=> CREATE ROLE administrator;
CREATE ROLE
The newly-created role has no privileges assigned to it, and no users or other roles are
initially granted access to it. A superuser must grant privileges and access to the role.
Deleting Database Roles
A superuser can delete a role with the DROP ROLE statement.
Note that if any user or other role has been assigned the role you are trying to delete, the
DROP ROLE statement fails with a dependency message.
=> DROP ROLE administrator;
NOTICE: User Bob depends on Role administrator
ROLLBACK: DROP ROLE failed due to dependencies
DETAIL: Cannot drop Role administrator because other objects depend on it
HINT: Use DROP ROLE ... CASCADE to remove granted roles from the dependent users/roles
Supply the optional CASCADE parameter to drop the role and its dependencies.
=> DROP ROLE administrator CASCADE;
DROP ROLE
Granting Privileges to Roles
A superuser or owner of a schema, table, or other database object can assign privileges
to a role, just as they would assign privileges to an individual user by using the GRANT
statements described in the SQL Reference Manual. See About Database Privileges
for information about which privileges can be granted.
Granting a privilege to a role immediately affects active user sessions. When you grant a
new privilege, it becomes immediately available to every user with the role active.
Example
The following example creates two roles and assigns them different privileges on a
single table called applog.
1. Create a table called applog:
=> CREATE TABLE applog (id int, sourceID VARCHAR(32), data TIMESTAMP, event VARCHAR(256));
2. Create a new role called logreader:
=> CREATE ROLE logreader;
3. Assign read-only privileges to the logreader role on table applog:
=> GRANT SELECT ON applog TO logreader;
4. Create a role called logwriter:
=> CREATE ROLE logwriter;
5. Assign write privileges to the logwriter role on table applog:
=> GRANT INSERT ON applog TO logwriter;
See the SQL Reference Manual for the different GRANT statements.
Revoking Privileges From Roles
Use one of the REVOKE statements to revoke a privilege from a role.
=> REVOKE INSERT ON applog FROM logwriter;
REVOKE PRIVILEGE
Revoking a privilege immediately affects any user sessions that have the role active.
When you revoke a privilege, it is immediately removed from users that rely on the role
for the privilege.
See the SQL Reference Manual for the different REVOKE statements.
Granting Access to Database Roles
A pseudosuperuser or dbadmin user can assign any role to a user or to another role
using the GRANT command. The simplest form of this command is:
GRANT role [, ...] TO { user | role } [, ...]
Vertica returns a NOTICE if you grant a role to a user who has already been granted that
role. For example:
=> GRANT commenter to Bob;
NOTICE 4622: Role "commenter" was already granted to user "Bob"
See GRANT (Role) in the SQL Reference Manual for details.
Example
The following example shows how to create a role called commenter and grant that role
to user Bob:
1. Create a table called comments.
=> CREATE TABLE comments (id INT, comment VARCHAR);
2. Create a role called commenter.
=> CREATE ROLE commenter;
3. Grant privileges to the commenter role on the comments table.
=> GRANT INSERT, SELECT ON comments TO commenter;
4. Grant the commenter role to user Bob.
=> GRANT commenter TO Bob;
Before he can access the role and its associated privileges, Bob must enable the newly
granted role.
1. Connect to the database as user Bob.
=> \c - Bob
2. Enable the role.
=> SET ROLE commenter;
3. Insert some values into the comments table.
=> INSERT INTO comments VALUES (1, 'Hello World');
Based on the privileges granted to Bob by the commenter role, Bob can insert and
query the comments table.
4. Query the comments table.
=> SELECT * FROM comments;
id | comment
----+-------------
1 | Hello World
(1 row)
5. Commit the transaction.
=> COMMIT;
Note that Bob does not have proper permissions to drop the table:
=> DROP TABLE comments;
ROLLBACK 4000: Must be owner of relation comments
See Also
l Granting Database Access to MC Users
Revoking Access From Database Roles
A superuser can revoke any role from a user or from another role using the REVOKE
command. The simplest form of this command is:
REVOKE role [, ...] FROM { user | role | PUBLIC } [, ...]
See REVOKE (Role) in the SQL Reference Manual for details.
Example
To revoke access from a role, use the REVOKE (Role) statement:
1. Connect to the database as a superuser:
=> \c - dbadmin
2. Revoke the commenter role from user Bob:
=> REVOKE commenter FROM bob;
Granting Administrative Access to a Role
A superuser can assign a user or role administrative access to a role by supplying the
optional WITH ADMIN OPTION argument to the GRANT statement. Administrative
access allows the user to grant and revoke access to the role for other users (including
granting them administrative access). Giving users the ability to grant roles lets a
superuser delegate role administration to other users.
Example
The following example demonstrates granting the user bob administrative access to the
commenter role, then connecting as bob and granting a role to another user.
1. Connect to the database as a superuser (or a user with administrative access):
=> \c - dbadmin
2. Grant administrative access on the commenter role to Bob:
=> GRANT commenter TO Bob WITH ADMIN OPTION;
3. Connect to the database as user Bob:
=> \c - Bob
4. As user Bob, grant the commenter role to Alice:
=> GRANT commenter TO Alice;
Users with administrative access to a role can also grant other users administrative
access:
=> GRANT commenter TO alice WITH ADMIN OPTION;
GRANT ROLE
As with all user privilege models, database superusers should be cautious when
granting any user a role with administrative privileges. For example, if the database
superuser grants two users a role with administrative privileges, both users can revoke
the role of the other user. This example shows granting the appadmin role (with
administrative privileges) to users bob and alice. After each user has been granted
the appadmin role, either user can connect with full privileges and revoke the role from the other:
=> GRANT appadmin TO bob, alice WITH ADMIN OPTION;
GRANT ROLE
=> \connect - bob
You are now connected as user "bob".
=> REVOKE appadmin FROM alice;
REVOKE ROLE
Revoking Administrative Access From a Role
A superuser can revoke administrative access from a role using the ADMIN OPTION
parameter with the REVOKE statement. Giving users the ability to revoke roles lets a
superuser delegate role administration to other users.
Example
The following example demonstrates revoking administrative access from Alice for the
commenter role.
1. Connect to the database as a superuser (or a user with administrative access)
=> \c - dbadmin
2. Issue the REVOKE command with the ADMIN OPTION parameter:
=> REVOKE ADMIN OPTION FOR commenter FROM alice;
Enabling Roles
By default, roles aren't enabled automatically for a user account. (See Default Roles for
Database Users for a way to make roles enabled automatically.) Users must explicitly
enable a role using the SET ROLE statement. When users enable a role in their
session, they gain all of the privileges assigned to that role. Enabling a role does not
affect any other roles that the users have active in their sessions. They can have
multiple roles enabled simultaneously, gaining the combined privileges of all the roles
they have enabled, plus any of the privileges that have been granted to them directly.
=> SELECT * FROM applog;
ERROR: permission denied for relation applog
=> SET ROLE logreader;
SET
=> SELECT * FROM applog;
id | sourceID | data | event
----+----------+----------------------------+----------------------------------------------
1 | Loader | 2011-03-31 11:00:38.494226 | Error: Failed to open source file
2 | Reporter | 2011-03-31 11:00:38.494226 | Warning: Low disk space on volume /scratch-a
(2 rows)
You can enable all of the roles available to your user account using the SET ROLE ALL
statement.
=> SET ROLE ALL;
SET
=> SHOW ENABLED_ROLES;
name | setting
---------------+------------------------------
enabled roles | logreader, logwriter
(1 row)
See Also
l Viewing a User's Role
Disabling Roles
To disable all roles, use the SET ROLE NONE statement:
=> SET ROLE NONE;
SET
=> SHOW ENABLED_ROLES;
name | setting
---------------+---------
enabled roles |
(1 row)
Viewing Enabled and Available Roles
You can list the roles you have enabled in your session using the SHOW ENABLED_
ROLES statement:
=> SHOW ENABLED_ROLES;
name | setting
---------------+----------
enabled roles | logreader
(1 row)
You can find the roles available to your account using the SHOW AVAILABLE_ROLES
statement:
Bob=> SHOW AVAILABLE_ROLES;
name | setting
-----------------+-----------------------------
available roles | logreader, logwriter
(1 row)
Viewing Named Roles
To view the names of all roles users can access, along with any roles that have been
assigned to those roles, query the V_CATALOG.ROLES system table.
=> SELECT * FROM roles;
role_id | name | assigned_roles
-------------------+-----------------+----------------
45035996273704964 | public |
45035996273704966 | dbduser |
45035996273704968 | dbadmin | dbduser*
45035996273704972 | pseudosuperuser | dbadmin*
45035996273704974 | logreader |
45035996273704976 | logwriter |
45035996273704978 | logadmin | logreader, logwriter
(7 rows)
Note: An asterisk (*) in the output means that role was granted WITH ADMIN
OPTION.
Viewing a User's Role
The HAS_ROLE() function lets you see if a role has been granted to a user.
Non-superusers can check their own role membership using HAS_ROLE('role_name'),
but only a superuser can look up other users' memberships using the user_name
parameter. Omitting the user_name parameter returns role results for the user who
calls the function.
How to View a User's Role
In this example, user Bob wants to see if he's been assigned the logwriter role.
The output returns Boolean value t for true, denoting that Bob is assigned the specified
logwriter role:
Bob=> SELECT HAS_ROLE('logwriter');
HAS_ROLE
----------
t
(1 row)
In this example, a superuser wants to verify that the logadmin role has been granted to
user Ted:
dbadmin=> SELECT HAS_ROLE('Ted', 'logadmin');
The output returns Boolean value t for true, denoting that Ted is assigned the specified
logadmin role:
HAS_ROLE
----------
t
(1 row)
Note that if a superuser omits the user_name argument, the function looks up that
superuser's role. The following output indicates that this superuser is not assigned the
logadmin role:
dbadmin=> SELECT HAS_ROLE('logadmin');
HAS_ROLE
----------
f
(1 row)
Output of the function call with user Alice indicates that she is not granted the logadmin
role:
dbadmin=> SELECT HAS_ROLE('Alice', 'logadmin');
HAS_ROLE
----------
f
(1 row)
To view additional information about users, roles and grants, you can also query the
following system tables in the V_CATALOG schema to show directly-assigned roles:
l ROLES
l GRANTS
l USERS
Note that the system tables do not indicate whether a role is available to a user when
roles could be available through other roles (indirectly). You need to call the HAS_
ROLE() function for that information.
Users
This command returns all columns from the USERS system table:
=> SELECT * FROM users;
-[ RECORD 1 ]-----+---------------------------
user_id | 45035996273704962
user_name | dbadmin
is_super_user | t
profile_name | default
is_locked | f
lock_time |
resource_pool | general
memory_cap_kb | unlimited
temp_space_cap_kb | unlimited
run_time_cap | unlimited
all_roles         | dbadmin*, pseudosuperuser*
default_roles     | dbadmin*, pseudosuperuser*
Note: An asterisk (*) in table output for all_roles and default_roles columns indicates
a role granted WITH ADMIN OPTION.
Roles
The following command returns all columns from the ROLES system table:
=> SELECT * FROM roles;
role_id | name | assigned_roles
-------------------+-----------------+-------------------
45035996273704964 | public |
45035996273704966 | dbduser |
45035996273704968 | dbadmin | dbduser*
45035996273704972 | pseudosuperuser | dbadmin*
Grants
The following command returns all columns from the GRANTS system table:
=> SELECT * FROM grants;
grantor | privileges_description | object_schema | object_name | grantee
---------+------------------------+---------------+-------------+---------
dbadmin | USAGE | | public | public
dbadmin | USAGE | | v_internal | public
dbadmin | USAGE | | v_catalog | public
dbadmin | USAGE | | v_monitor | public
(4 rows)
Viewing User Roles on Management Console
You can see an MC user's roles and database resources through the MC Settings >
User management page on the Management Console interface. For more information,
see About MC Privileges and Roles.
Using the Administration Tools
The Vertica Administration Tools allow you to perform most Vertica database
administration tasks quickly and easily.
Run Administration Tools using the Database Administrator account on the
Administration host, if possible. Make sure that no other Administration Tools processes
are running.
If the Administration host is unresponsive, run Administration Tools on a different node
in the cluster. That node permanently takes over the role of Administration host.
Any user can view the man page available for admintools. Enter the following:
man admintools
Running Administration Tools
As the dbadmin user, you can run the Administration Tools. The syntax follows:
/opt/vertica/bin/admintools [
{ -h | --help }
| { -a | --help_all }
| [ --debug ]
| { -t | --tool } name_of_tool [options]
]
Options
-h
--help
Outputs abbreviated help.
-a
--help_all
Outputs verbose help, which lists all command-line sub-commands and options as
shown in the Tools section below.
--debug
If you include the debug option, Vertica logs debug information.
Note: You can specify the debug option with or without naming a specific tool.
If you specify debug with a specific tool, Vertica logs debug information during
tool execution. If you do not specify a tool, Vertica logs debug information
when you run tools through the admintools user interface.
{ -t | --tool } name_of_tool [options]
Specifies the tool to run, where name_of_tool is one of the tools described in the
help output, and options are one or more comma-delimited tool arguments.
Note: Enter admintools -h to see the list of tools available. Enter
admintools -t name_of_tool --help to review a specific tool's options.
An unqualified admintools command displays the Main Menu dialog box.
If you are unfamiliar with this type of interface, read Using the Administration Tools
Interface.
First Login as Database Administrator
The first time you log in as the Database Administrator and run the Administration Tools,
the user interface displays.
1. In the end-user license agreement (EULA) window, type accept to proceed.
A window displays, requesting the location of the license key file you downloaded
from the HPE Web site. The default path is /tmp/vlicense.dat.
2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and
click OK.
Between Dialogs
While the Administration Tools are working, you see the command line processing in a
separate window. Do not interrupt the processing.
Using the Administration Tools Interface
The Vertica Administration Tools are implemented using Dialog, a graphical user
interface that works in terminal (character-cell) windows. The interface responds to
mouse clicks in some terminal windows, particularly local Linux windows, but you might
find that it responds only to keystrokes. This section therefore describes how to use the
Administration Tools using only keystrokes.
Note: This section does not describe every possible combination of keystrokes you
can use to accomplish a particular task. Feel free to experiment and to use whatever
keystrokes you prefer.
Enter [Return]
In all dialogs, when you are ready to run a command, select a file, or cancel the dialog,
press the Enter key. The command descriptions in this section do not explicitly instruct
you to press Enter.
OK - Cancel - Help
The OK, Cancel, and Help buttons are present on virtually all dialogs. Use the tab,
space bar, or right and left arrow keys to select an option and then press Enter. The
same keystrokes apply to dialogs that present a choice of Yes or No.
Menu Dialogs
Some dialogs require that you choose one command from a menu. Type the
alphanumeric character shown or use the up and down arrow keys to select a command
and then press Enter.
List Dialogs
In a list dialog, use the up and down arrow keys to highlight items, then use the space
bar to select the items (which marks them with an X). Some list dialogs allow you to
select multiple items. When you have finished selecting items, press Enter.
Form Dialogs
In a form dialog (also referred to as a dialog box), use the tab key to cycle between OK,
Cancel, Help, and the form field area. Once the cursor is in the form field area, use the
up and down arrow keys to select an individual field (highlighted) and enter information.
When you have finished entering information in all fields, press Enter.
Help Buttons
Online help is provided in the form of text dialogs. If you have trouble viewing the help,
see Notes for Remote Terminal Users in this document.
K-Safety Support in Administration Tools
The Administration Tools allow certain operations on a K-Safe database, even if some
nodes are unresponsive.
The database must have been marked as K-Safe using the MARK_DESIGN_KSAFE
function.
The following management functions within the Administration Tools are operational
when some nodes are unresponsive.
Note: Vertica users can perform much of the functionality listed below using the
Management Console interface. See Management Console and Administration
Tools for details.
l View database cluster state
l Connect to database
l Start database (including manual recovery)
l Stop database
l Replace node (assuming node that is down is the one being replaced)
l View database parameters
l Upgrade license key
The following operations work with unresponsive nodes; however, you might have to
repeat the operation on the failed nodes after they are back in operation:
l Distribute config files
l Install external procedure
l Set database parameters
The following management functions within the Administration Tools require that all
nodes be UP in order to be operational:
l Create database
l Run the Database Designer
l Drop database
l Set restart policy
l Roll back database to Last Good Epoch
Notes for Remote Terminal Users
The appearance of the graphical interface depends on the color and font settings used
by your terminal window. The screen captures in this document were made using the
default color and font settings in a PuTTY terminal application running on a Windows
platform.
Note: If you are using a remote terminal application, such as PuTTY or a Cygwin
bash shell, make sure your window is at least 81 characters wide and 23 characters
high.
If you are using PuTTY, you can make the Administration Tools look like the screen
captures in this document:
1. In a PuTTY window, right click the title area and select Change Settings.
2. Create or load a saved session.
3. In the Category dialog, click Window > Appearance.
4. In the Font settings, click the Change... button.
5. Select Font: Courier New: Regular Size: 10
6. Click Apply.
Repeat these steps for each existing session that you use to run the Administration
Tools.
You can also change the translation to support UTF-8:
1. In a PuTTY window, right click the title area and select Change Settings.
2. Create or load a saved session.
3. In the Category dialog, click Window > Translation.
4. In the "Received data assumed to be in which character set" drop-down menu,
select UTF-8.
5. Click Apply.
Using Administration Tools Help
The Help on Using the Administration Tools command displays a help screen about
using the Administration Tools.
Most of the online help in the Administration Tools is context-sensitive. For example, if
you use up/down arrows to select a command, press tab to move to the Help button, and
press return, you get help on the selected command.
In a Menu Dialog
1. Use the up and down arrow keys to choose the command for which you want help.
2. Use the Tab key to move the cursor to the Help button.
3. Press Enter (Return).
In a Dialog Box
1. Use the up and down arrow keys to choose the field on which you want help.
2. Use the Tab key to move the cursor to the Help button.
3. Press Enter (Return).
Scrolling
Some help files are too long for a single screen. Use the up and down arrow keys to
scroll through the text.
Password Authentication
When you create a new user with the CREATE USER command, you can configure the
password or leave it empty. You cannot bypass the password if the user was created
with a password configured. You can change a user's password using the ALTER
USER command.
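For example, a minimal sketch; the user name and passwords are hypothetical:
=> CREATE USER Sam IDENTIFIED BY 'secret-pw';
CREATE USER
=> ALTER USER Sam IDENTIFIED BY 'new-pw';
ALTER USER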
See Implementing Security for more information about controlling database
authorization through passwords.
Tip: Unless the database is used solely for evaluation purposes, HPE recommends
that all database users have encrypted passwords.
Distributing Changes Made to the Administration Tools
Metadata
Administration Tools-specific metadata for a failed node will fall out of synchronization
with other cluster nodes if you make the following changes:
l Modify the restart policy
l Add one or more nodes
l Drop one or more nodes.
When you restore the node to the database cluster, you can use the Administration
Tools to update the node with the latest Administration Tools metadata:
1. Log on to a host that contains the metadata you want to transfer and start the
Administration Tools. (See Using the Administration Tools.)
2. On the Main Menu in the Administration Tools, select Configuration Menu and
click OK.
3. On the Configuration Menu, select Distribute Config Files and click OK.
4. Select AdminTools Meta-Data.
The Administration Tools metadata is distributed to every host in the cluster.
5. Restart the database.
Administration Tools and Management Console
You can perform most database administration tasks using the Administration Tools, but
you have the additional option of using the more visual and dynamic Management
Console.
The following table compares the functionality available in both interfaces. Continue to
use Administration Tools and the command line to perform actions not yet supported by
Management Console.
Vertica Functionality                                              | Management Console | Administration Tools
-------------------------------------------------------------------+--------------------+---------------------
Use a Web interface for the administration of Vertica              | Yes                | No
Manage/monitor one or more databases and clusters through a UI     | Yes                | No
Manage multiple databases on different clusters                    | Yes                | Yes
View database cluster state                                        | Yes                | Yes
View multiple cluster states                                       | Yes                | No
Connect to the database                                            | Yes                | Yes
Start/stop an existing database                                    | Yes                | Yes
Stop/restart Vertica on host                                       | Yes                | Yes
Kill a Vertica process on host                                     | No                 | Yes
Create one or more databases                                       | Yes                | Yes
View databases                                                     | Yes                | Yes
Remove a database from view                                        | Yes                | No
Drop a database                                                    | Yes                | Yes
Create a physical schema design (Database Designer)                | Yes                | Yes
Modify a physical schema design (Database Designer)                | Yes                | Yes
Set the restart policy                                             | No                 | Yes
Roll back database to the Last Good Epoch                          | No                 | Yes
Manage clusters (add, replace, remove hosts)                       | Yes                | Yes
Rebalance data across nodes in the database                        | Yes                | Yes
Configure database parameters dynamically                          | Yes                | No
View database activity in relation to physical resource usage      | Yes                | No
View alerts and messages dynamically                               | Yes                | No
View current database size usage statistics                        | Yes                | No
View database size usage statistics over time                      | Yes                | No
Upload/upgrade a license file                                      | Yes                | Yes
Warn users about license violation on login                        | Yes                | Yes
Create, edit, manage, and delete users/user information            | Yes                | No
Use LDAP to authenticate users with company credentials            | Yes                | Yes
Manage user access to MC through roles                             | Yes                | No
Map Management Console users to a Vertica database                 | Yes                | No
Enable and disable user access to MC and/or the database           | Yes                | No
Audit user activity on database                                    | Yes                | No
Hide features unavailable to a user through roles                  | Yes                | No
Generate new user (non-LDAP) passwords                             | Yes                | No
Management Console provides some, but not all, of the functionality that the
Administration Tools provides. MC also provides functionality not available in the
Administration Tools.
See Also
l Monitoring Vertica Using Management Console
Administration Tools Reference
The Administration tools allow you to:
l View the Database Cluster State
l Connect to the database
l Stop the database
l Configure Menu items
l Use Advanced Menu options
l Write Administration Tools scripts
Viewing Database Cluster State
This tool shows the current state of the nodes in the database.
1. On the Main Menu, select View Database Cluster State, and click OK.
The normal state of a running database is ALL UP. The normal state of a stopped
database is ALL DOWN.
2. If some hosts are UP and some DOWN, restart the specific host that is down using
Restart Vertica on Host from the Administration Tools, or you can start the
database as described in Starting and Stopping the Database (unless you have a
known node failure and want to continue in that state).
Nodes shown as INITIALIZING or RECOVERING indicate that Failure Recovery is
in progress.
Nodes in other states (such as NEEDS_CATCHUP) are transitional and can be ignored
unless they persist.
See Also
l Advanced Menu Options
Connecting to the Database
This tool connects to a running database with vsql. You can use the Administration
Tools to connect to a database from any node within the database while logged in to
any user account with access privileges. You cannot use the Administration Tools to
connect from a host that is not a database node. To connect from other hosts, run vsql
as described in Connecting from the Command Line.
1. On the Main Menu, click Connect to Database, and then click OK.
2. Supply the database password if asked:
Password:
When you create a new user with the CREATE USER command, you can configure
the password or leave it empty. You cannot bypass the password if the user was
created with a password configured. You can change a user's password using the
ALTER USER command.
The Administration Tools connect to the database and transfer control to vsql.
Welcome to vsql, the Vertica Analytic Database interactive terminal.
Type: \h or \? for help with vsql commands
\g or terminate with semicolon to execute query
\q to quit
=>
See Using vsql for more information.
Note: After entering your password, you may be prompted to change your password
if it has expired. See Implementing Client Authentication for details of password
security.
See Also
l CREATE USER
l ALTER USER
Start the Database
Starting a K-safe database is supported when up to K nodes are down or unavailable.
See Failure Recovery for a discussion on various scenarios encountered during
database shutdown, startup and recovery.
You can start a database using any of these methods:
l The Management Console
l The Administration Tools interface
l The command line
Start the Database Using MC
On MC's Databases and Clusters page, click a database to select it, and click Start
within the dialog box that displays.
Start the Database Using the Administration Tools
1. Open the Administration Tools and select View Database Cluster State to make
sure that all nodes are down and that no other database is running.
2. Open the Administration Tools. See Using the Administration Tools for information
about accessing the Administration Tools.
3. On the Main Menu, select Start Database, and then select OK.
4. Select the database to start, and then click OK.
Caution: HPE strongly recommends that you start only one database at a time.
If you start more than one database at any time, the results are unpredictable.
Users could encounter resource conflicts or perform operations in the wrong
database.
5. Enter the database password, and then click OK.
6. When prompted that the database started successfully, click OK.
7. Check the log files to make sure that no startup problems occurred.
Start the Database At the Command Line
If you use the admintools command line option, start_db, to start a database, the -p
password argument is only required during database creation, when you install a new
license.
As long as the license is valid, the -p argument is not required to start the database and
is silently ignored, even if you introduce a typo or prematurely press the enter key. This
is by design, as the database can only be started by the user who (as part of the
verticadba UNIX user group) initially created the database or who has root or su
privileges.
If the license were to become invalid, Vertica would use the -p password argument to
attempt to upgrade the license with the license file stored in
/opt/vertica/config/share/license.key.
Following is an example of using start_db on a standalone node:
$ /opt/vertica/bin/admintools -t start_db -d VMart
Info: no password specified, using none
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (UP)
Database VMart started successfully
Stopping a Database
To stop a running database, take these steps:
1. Use View Database Cluster State to make sure that all nodes are up. If all nodes are
not up, see Restarting Vertica on Host.
2. On the Main Menu, select Stop Database, and click OK.
3. Select the database you want to stop, and click OK.
4. Enter the password if asked, and click OK.
5. A message confirms that the database has been successfully stopped. Click OK.
Error
If users are connected during shutdown operations, you cannot stop a database. The
Administration Tools displays a message similar to the following:
Unable to shutdown database VMart.
Error: NOTICE 2519: Cannot shut down while users are connected
This may be because other users still have active sessions
or the Management Console is still active. You can force
the sessions to terminate and shutdown the database, but
any work done in the other sessions may be lost.
Do you want to try a forced shutdown?
Description
The message indicates that there are active user connections (sessions). For example,
Database Designer may be building or deploying a design. See Managing Sessions in
the Administrator's Guide for more information.
Resolution
The following examples were taken from a different database.
1. To see which users are connected, connect to the database and query the
SESSIONS system table described in the SQL Reference Manual. For example:
=> \pset expanded
Expanded display is on.
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name | site01
user_name | dbadmin
client_hostname | 127.0.0.1:57141
login_timestamp | 2015-06-07 14:41:26
session_id | rhel6-1-30361:0xd7e3e:994462853
transaction_start | 2015-06-07 14:48:54
transaction_id | 45035996273741092
transaction_description | user dbadmin (select * from session;)
statement_start | 2015-06-07 14:53:31
statement_id | 0
last_statement_duration | 1
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]
node_name | site01
user_name | dbadmin
client_hostname | 127.0.0.1:57142
login_timestamp | 2015-06-07 14:52:55
session_id | rhel6-1-30361:0xd83ac:1017578618
transaction_start | 2015-06-07 14:53:26
transaction_id | 45035996273741096
transaction_description | user dbadmin (COPY ClickStream_Fact FROM
'/data/clickstream/1g/ClickStream_Fact.tbl'
DELIMITER '|' NULL '\n' DIRECT;)
statement_start | 2015-06-07 14:53:26
statement_id | 17179869528
last_statement_duration | 0
current_statement | COPY ClickStream_Fact FROM
'/data/clickstream/1g/ClickStream_Fact.tbl'
DELIMITER '|' NULL '\n' DIRECT;
ssl_state | None
authentication_method | Trust
The current_statement column of Record 1 shows that this session is the one you are
using to query the system table. Record 2 shows the session that must end before the
database can be shut down.
2. If a statement is running in a session, that session must be closed. Use the function
CLOSE_SESSION or CLOSE_ALL_SESSIONS described in the SQL Reference
Manual.
Note: CLOSE_ALL_SESSIONS is the more common command because it
forcefully disconnects all user sessions.
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name | site01
user_name | dbadmin
client_hostname | 127.0.0.1:57141
client_pid | 17838
login_timestamp | 2015-06-07 14:41:26
session_id | rhel6-1-30361:0xd7e3e:994462853
client_label |
transaction_start | 2015-06-07 14:48:54
transaction_id | 45035996273741092
transaction_description | user dbadmin (select * from sessions;)
statement_start | 2015-06-07 14:53:31
statement_id | 0
last_statement_duration_us | 1
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
-[ RECORD 2 ]
node_name | site01
user_name | dbadmin
client_hostname | 127.0.0.1:57142
client_pid | 17839
login_timestamp | 2015-06-07 14:52:55
session_id | rhel6-1-30361:0xd83ac:1017578618
client_label |
transaction_start | 2015-06-07 14:53:26
transaction_id | 45035996273741096
transaction_description | user dbadmin (COPY ClickStream_Fact FROM
'/data/clickstream/1g/ClickStream_Fact.tbl'
DELIMITER '|' NULL '\n' DIRECT;)
statement_start | 2015-06-07 14:53:26
statement_id | 17179869528
last_statement_duration_us | 0
current_statement | COPY ClickStream_Fact FROM
'/data/clickstream/1g/ClickStream_Fact.tbl'
DELIMITER '|' NULL '\n' DIRECT;
ssl_state | None
authentication_method | Trust
=> SELECT CLOSE_SESSION('rhel6-1-30361:0xd83ac:1017578618');
-[ RECORD 1 ]
close_session | Session close command sent. Check sessions for progress.
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name | site01
user_name | dbadmin
client_hostname | 127.0.0.1:57141
client_pid | 17838
login_timestamp | 2015-06-07 14:41:26
session_id | rhel6-1-30361:0xd7e3e:994462853
client_label |
transaction_start | 2015-06-07 14:48:54
transaction_id | 45035996273741092
transaction_description | user dbadmin (select * from sessions;)
statement_start | 2015-06-07 14:54:11
statement_id | 0
last_statement_duration_us | 98
current_statement | select * from sessions;
ssl_state | None
authentication_method | Trust
3. Query the SESSIONS table again. For example, two columns have changed:
n statement_id is now 0, indicating that no statement is in progress.
n last_statement_duration_us now indicates how long the previous statement ran, in
microseconds, before being interrupted.
The SELECT statements that call these functions return when the interrupt or close
message has been delivered to all nodes, not after the interrupt or close has
completed.
4. Query the SESSIONS table again. When the session no longer appears in the
SESSIONS table, disconnect and run the Stop Database command.
Controlling Sessions
The database administrator must be able to disallow new incoming connections in order
to shut down the database. On a busy system, database shutdown is prevented if new
sessions connect after the CLOSE_SESSION or CLOSE_ALL_SESSIONS() command
is invoked—and before the database actually shuts down.
One option is for the administrator to issue the SHUTDOWN('true') command, which
forces the database to shut down and disallow new connections. See SHUTDOWN in
the SQL Reference Manual.
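For example, a minimal sketch:
=> SELECT SHUTDOWN('true');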
Another option is to modify the MaxClientSessions parameter from its original value to
0, in order to prevent new non-dbadmin users from connecting to the database.
1. Determine the original value for the MaxClientSessions parameter by querying
the V_MONITOR.CONFIGURATION_PARAMETERS system table:
=> SELECT CURRENT_VALUE FROM CONFIGURATION_PARAMETERS WHERE parameter_name='MaxClientSessions';
CURRENT_VALUE
---------------
50
(1 row)
2. Set the MaxClientSessions parameter to 0 to prevent new non-dbadmin
connections:
=> ALTER DATABASE mydb SET MaxClientSessions = 0;
Note: The previous command allows up to five administrators to log in.
3. Issue the CLOSE_ALL_SESSIONS() command to remove existing sessions:
=> SELECT CLOSE_ALL_SESSIONS();
4. Query the SESSIONS table:
=> SELECT * FROM SESSIONS;
When the session no longer appears in the SESSIONS table, disconnect and run the
Stop Database command.
5. Restart the database.
6. Restore the MaxClientSessions parameter to its original value:
=> ALTER DATABASE mydb SET MaxClientSessions = 50;
Notes
You cannot stop databases if your password has expired. The Administration Tools
displays an error message if you attempt to do so. You need to change your expired
password using vsql before you can shut down a database.
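For example, you can reset an expired password from vsql before shutting down (a
minimal sketch; the user name and new password are illustrative):
=> ALTER USER dbadmin IDENTIFIED BY 'new_password';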
Restarting Vertica on Host
This tool restarts the Vertica process on one or more nodes in a running database. Use
this tool when a cluster host reboots while the database is running. The spread daemon
starts automatically, but the Vertica process does not, so the node does not
automatically rejoin the cluster.
1. On the Main Menu, select View Database Cluster State, and click OK.
2. If one or more nodes are down, select Restart Vertica on Host, and click OK.
3. Select the database that contains the host that you want to restart, and click OK.
4. Select the Host that you want to restart, and click OK.
5. Select View Database Cluster State again to make sure that all nodes are up.
Configuration Menu Item
The Configuration Menu allows you to:
l Create, drop, and view databases.
l Use the Database Designer to create or modify a physical schema design.
l Set a restart policy.
l Distribute config files.
l Install external procedures.
Creating a Database
1. On the Configuration Menu, click Create Database and then click OK.
2. Enter the name of the database and an optional comment. Click OK.
3. Enter a password. See Creating a Database Name and Password for rules.
If you do not enter a password, you are prompted to indicate whether you want to
enter a password. Click Yes to enter a password or No to create a database without
a superuser password.
Caution: If you do not enter a password at this point, the superuser password is
set to empty. Unless the database is for evaluation or academic purposes, HPE
strongly recommends that you enter a superuser password.
4. If you entered a password, enter the password again.
5. Select the hosts to include in the database. The hosts in this list are the ones that
were specified at installation time (install_vertica -s).
6. Specify the directories in which to store the catalog and data files.
Note: Catalog and data paths must contain only alphanumeric characters and
cannot have leading space characters. Failure to comply with these restrictions
could result in database creation failure.
Note: Do not use a shared directory for more than one node. Data and catalog
directories must be distinct for each node. Multiple nodes must not be allowed to
write to the same data or catalog directory.
7. Check the current database definition for correctness, and click Yes to proceed.
8. A message indicates that you have successfully created a database. Click OK.
Dropping a Database
This tool drops an existing database. Only the Database Administrator is allowed to
drop a database.
1. Stop the database as described in Stopping a Database.
2. On the Configuration Menu, click Drop Database and then click OK.
3. Select the database to drop and click OK.
4. Click Yes to confirm that you want to drop the database.
5. Type yes and click OK to reconfirm that you really want to drop the database.
6. A message indicates that you have successfully dropped the database. Click OK.
Notes
In addition to dropping the database, Vertica automatically drops the node definitions
that refer to the database unless:
l Another database uses a node definition. If another database refers to any of these
node definitions, none of the node definitions are dropped.
l A node definition is the only node defined for the host. (Vertica uses node definitions
to locate hosts that are available for database creation, so removing the only node
defined for a host would make the host unavailable for new databases.)
Viewing a Database
This tool displays the characteristics of an existing database.
1. On the Configuration Menu, select View Database and click OK.
2. Select the database to view.
3. Vertica displays the following information about the database:
n The name of the database.
n The name and location of the log file for the database.
n The hosts within the database cluster.
n The value of the restart policy setting.
Note: This setting determines whether nodes within a K-Safe database are
restarted when they are rebooted. See Setting the Restart Policy.
n The database port.
n The name and location of the catalog directory.
Setting the Restart Policy
The Restart Policy enables you to determine whether or not nodes in a K-Safe database
are automatically restarted when they are rebooted. Since this feature does not
automatically restart nodes if the entire database is DOWN, it is not useful for databases
that are not K-Safe.
To set the Restart Policy for a database:
1. Open the Administration Tools.
2. On the Main Menu, select Configuration Menu, and click OK.
3. In the Configuration Menu, select Set Restart Policy, and click OK.
4. Select the database for which you want to set the Restart Policy, and click OK.
5. Select one of the following policies for the database:
n Never — Nodes are never restarted automatically.
n K-Safe — Nodes are automatically restarted if the database cluster is still UP.
This is the default setting.
n Always — The node on a single-node database is restarted automatically.
Note: Always does not work if a single-node database was not shut down
cleanly or crashed.
6. Click OK.
Best Practice for Restoring Failed Hardware
Following this procedure prevents Vertica from misdiagnosing missing disks or bad
mounts as data corruption, which would otherwise result in a time-consuming, full-node
recovery.
If a server fails due to hardware issues, for example a bad disk or a failed controller,
upon repairing the hardware:
1. Reboot the machine into runlevel 1, which is a root and console-only mode.
Runlevel 1 prevents network connectivity and keeps Vertica from attempting to
reconnect to the cluster.
2. In runlevel 1, validate that the hardware has been repaired, the controllers are
online, and any RAID recovery is able to proceed.
Note: You do not need to initiate RAID recovery in runlevel 1; simply validate
that it can recover.
3. Once the hardware is confirmed consistent, only then reboot to runlevel 3 or higher.
At this point, the network activates, and Vertica rejoins the cluster and automatically
recovers any missing data. Note that, on a single-node database, if any files that were
associated with a projection have been deleted or corrupted, Vertica will delete all files
associated with that projection, which could result in data loss.
Installing External Procedure Executable Files
1. Run the Administration Tools.
$ /opt/vertica/bin/adminTools
2. On the AdminTools Main Menu, click Configuration Menu, and then click OK.
3. On the Configuration Menu, click Install External Procedure and then click OK.
4. Select the database on which you want to install the external procedure.
5. Either select the file to install or manually type the complete file path, and then click
OK.
6. If you are not the superuser, you are prompted to enter your password and click OK.
The Administration Tools automatically creates the
<database_catalog_path>/procedures directory on each node in the database and
installs the external procedure in these directories for you.
7. Click OK in the dialog that indicates that the installation was successful.
Advanced Menu Options
The Advanced Menu allows you to:
l Roll back the database to the last good epoch.
l Stop Vertica on host.
l Kill Vertica process on host.
l Set or reset database parameters.
l Upgrade a license key.
l Manage a cluster.
Rolling Back the Database to the Last Good Epoch
Vertica provides the ability to roll the entire database back to a specific epoch primarily
to assist in the correction of human errors during data loads or other accidental
corruptions. For example, suppose that you have been performing a bulk load and the
cluster went down during a particular COPY command. You might want to discard all
epochs back to the point at which the previous COPY command committed and run the
one that did not finish again. You can determine that point by examining the log files
(see Monitoring the Log Files).
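Before you roll back, you can check the relevant epochs from vsql (a minimal sketch
using standard epoch-management functions):
=> SELECT GET_LAST_GOOD_EPOCH();
=> SELECT GET_AHM_EPOCH();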
1. On the Advanced Menu, select Roll Back Database to Last Good Epoch.
2. Select the database to roll back. The database must be stopped.
3. Accept the suggested restart epoch or specify a different one.
4. Confirm that you want to discard the changes after the specified epoch.
The database restarts successfully.
Important: The default value of HistoryRetentionTime is 0, which means that
Vertica only keeps historical data when nodes are down. This setting prevents the
use of the Administration Tools 'Roll Back Database to Last Good Epoch' option
because the AHM remains close to the current epoch. Vertica cannot roll back to an
epoch that precedes the AHM.
If you rely on the Roll Back option to remove recently loaded data, consider setting a
day-wide window for removing loaded data. For example:
=> ALTER DATABASE mydb SET HistoryRetentionTime = 86400;
Stopping Vertica on Host
This command attempts to gracefully shut down the Vertica process on a single node.
Caution: Do not use this command if you are intending to shut down the entire
cluster. Use Stop Database instead, which performs a clean shutdown to minimize
data loss.
1. On the Advanced Menu, select Stop Vertica on Host and click OK.
2. Select the hosts to stop.
3. Confirm that you want to stop the hosts.
If the command succeeds, View Database Cluster State shows that the selected
hosts are DOWN.
If the command fails to stop any selected nodes, proceed to Killing Vertica Process
on Host.
Killing the Vertica Process on Host
This command sends a kill signal to the Vertica process on a node.
Caution: Do not use this command unless you have already tried Stop Database
and Stop Vertica on Node and both were unsuccessful.
1. On the Advanced menu, select Kill Vertica Process on Host and click OK.
2. Select the hosts on which to kill the Vertica process.
3. Confirm that you want to stop the processes.
4. If the command succeeds, View Database Cluster State shows that the selected
hosts are DOWN.
Upgrading a Vertica License Key
The following steps are for licensed Vertica users. Completing the steps copies a
license key file into the database. See Managing Licenses for more information.
1. On the Advanced menu, select Upgrade License Key, and then click OK.
2. Select the database for which to upgrade the license key.
3. Enter the absolute pathname of your downloaded license key file (for example,
/tmp/vlicense.dat). Click OK.
4. Click OK when you see a message indicating that the upgrade succeeded.
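If you prefer to work in vsql, you can load the same license file with the
INSTALL_LICENSE function (the path shown matches the example above):
=> SELECT INSTALL_LICENSE('/tmp/vlicense.dat');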
Note: If you are using Vertica Community Edition, follow the instructions in Vertica
License Renewals or Upgrades for instructions to upgrade to a Vertica Premium
Edition license key.
Managing Clusters
Cluster Management lets you add, replace, or remove hosts from a database cluster.
These processes are usually part of a larger process of adding, removing, or replacing a
database node.
Note: View the database state to verify that it is running. See View Database Cluster
State. If the database isn't running, restart it. See Start the Database.
Using Cluster Management
To use Cluster Management:
1. From the Main Menu, select Advanced Menu, and then click OK.
2. In the Advanced Menu, select Cluster Management, and then click OK.
3. Select one of the following, and then click OK.
n Add Hosts to Database: See Adding Hosts to a Database.
n Re-balance Data: See Rebalancing Data.
n Replace Host: See Replacing Hosts.
n Remove Host from Database: See Removing Hosts from a Database.
Using Administration Tools
The Help Using the Administration Tools command displays a help screen about
using the Administration Tools.
Most of the online help in the Administration Tools is context-sensitive. For example, if
you use the up/down arrows to select a command, press Tab to move to the Help
button, and press Return, you get help on the selected command.
Administration Tools Metadata
The Administration Tools configuration data (metadata) contains information that
databases need to start, such as the hostname/IP address of each participating host in
the database cluster.
To facilitate hostname resolution within the Administration Tools, at the command line,
and inside the installation utility, Vertica converts all hostnames you provide to IP
addresses:
l During installation
Vertica immediately converts any hostname you provide through the command-line
options --hosts, --add-hosts, or --remove-hosts to its IP address equivalent.
n If you provide a hostname during installation that resolves to multiple IP addresses
(such as in multi-homed systems), the installer prompts you to choose one IP
address.
n Vertica retains the name you give for messages and prompts only; internally it
stores these hostnames as IP addresses.
l Within the Administration Tools
All hosts are in IP form to allow for direct comparisons (for example db = database =
database.example.com).
l At the command line
Vertica converts any hostname value to an IP address that it uses to look up the host
in the configuration metadata. If a host has multiple IP addresses that are resolved,
Vertica tests each IP address to see if it resides in the metadata, choosing the first
match. No match indicates that the host is not part of the database cluster.
Metadata is more portable because Vertica does not require the names of the hosts in
the cluster to be exactly the same when you install or upgrade your database.
Writing Administration Tools Scripts
You can invoke most Administration Tools from the command line or a shell script.
Syntax
/opt/vertica/bin/admintools {
{ -h | --help }
| { -a | --help_all}
| { [--debug ] { -t | --tool } toolname [ tool-args ] }
}
Note: For convenience, add /opt/vertica/bin to your search path.
Parameters
-h
--help
Outputs abbreviated help.

-a
--help_all
Outputs verbose help, which lists all command-line sub-commands and options.

[--debug] { -t | --tool } toolname [ args ]
Specifies the tool to run, where toolname is one of the tools listed in the help
output described below, and args is one or more comma-delimited toolname
arguments. If you include the --debug option, Vertica logs debug information
during tool execution.
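For example, a shell script might invoke a tool directly (a minimal sketch; the
database name is illustrative, and stop_db appears in the tool list from admintools -h):
$ /opt/vertica/bin/admintools --debug -t stop_db -d VMart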
Tools
To return a list of all available tools, enter admintools -h at a command prompt.
Note: To create a database or password, see Creating a Database Name and
Password for naming rules.
To list all available tools and their commands and options in individual help text, enter
admintools -a.
To display help for a specific tool and its options or commands, qualify the specified tool
name with --help or -h, as shown in the example below:
$ admintools -t connect_db --help
Usage: connect_db [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to connect
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
Operating the Database
This topic explains how to start and stop your Vertica database, and how to use the
database index tool:
l Starting the Database
l Stopping the Database
l CRC and Sort Order Check
Start the Database
Starting a K-safe database is supported when up to K nodes are down or unavailable.
See Failure Recovery for a discussion on various scenarios encountered during
database shutdown, startup and recovery.
You can start a database using any of these methods:
l The Management Console
l The Administration Tools interface
l The command line
Start the Database Using MC
On MC's Databases and Clusters page, click a database to select it, and click Start
within the dialog box that displays.
Start the Database Using the Administration Tools
1. Open the Administration Tools and select View Database Cluster State to make
sure that all nodes are down and that no other database is running.
2. Open the Administration Tools. See Using the Administration Tools for information
about accessing the Administration Tools.
3. On the Main Menu, select Start Database, and then click OK.
4. Select the database to start, and then click OK.
Caution: HPE strongly recommends that you start only one database at a time.
If you start more than one database at any time, the results are unpredictable.
Users could encounter resource conflicts or perform operations in the wrong
database.
5. Enter the database password, and then click OK.
6. When prompted that the database started successfully, click OK.
7. Check the log files to make sure that no startup problems occurred.
Start the Database At the Command Line
If you use the admintools command line option start_db to start a database, the -p
password argument is required only during database creation or when you install a new
license.
As long as the license is valid, the -p argument is not required to start the database and
is silently ignored, even if you introduce a typo or prematurely press the enter key. This
is by design, as the database can only be started by the user who (as part of the
verticadba UNIX user group) initially created the database or who has root or su
privileges.
If the license were to become invalid, Vertica would use the -p password argument to
attempt to upgrade the license with the license file stored in
/opt/vertica/config/share/license.key.
Following is an example of using start_db on a standalone node:
$ /opt/vertica/bin/admintools -t start_db -d VMart
Info: no password specified, using none
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (UP)
Database VMart started successfully
Stopping the Database
Stopping a K-safe database is supported when up to K nodes are down or unavailable.
See Failure Recovery for a discussion on various scenarios encountered during
database shutdown, startup and recovery.
You can stop a running database using either of these methods:
l The Management Console
l The Administration Tools interface
Note: You cannot stop a running database if any users are connected or Database
Designer is building or deploying a database design.
Stopping a Running Database Using MC
1. Log in to MC as an MC administrator and navigate to the Manage page to make
sure all nodes are up. If a node is down, click that node and select Start node in the
Node list dialog box.
2. Inform all users that have open connections that the database is going to shut down
and instruct them to close their sessions.
Tip: To check for open sessions, query the V_MONITOR.SESSIONS table. The
client_label column returns a value of MC for users who are connected to MC.
3. Still on the Manage page, click Stop in the toolbar.
Stopping a Running Database Using the Administration Tools
1. Use View Database Cluster State to make sure that all nodes are up. If all nodes are
not up, see Restarting a Node.
2. Inform all users that have open connections that the database is going to shut down
and instruct them to close their sessions.
Tip: A simple way to prevent new client sessions from being opened while you are
shutting down the database is to set the MaxClientSessions configuration parameter
to 0. Be sure to restore the parameter to its original setting once you've restarted the
database.
=> ALTER DATABASE mydb SET MaxClientSessions = 0;
3. Close any remaining user sessions. (Use the CLOSE_SESSION and CLOSE_
ALL_SESSIONS functions.)
4. Open the Administration Tools. See Using the Administration Tools for information
about accessing the Administration Tools.
5. On the Main Menu, select Stop Database, and then click OK.
6. Select the database you want to stop, and click OK.
7. Enter the password if asked, and click OK.
8. When prompted that the database has been successfully stopped, click OK.
Stopping a Running Database At the Command Line
To stop a database, use the admintools command line option stop_db, as follows:
$ /opt/vertica/bin/admintools -t stop_db -d VMart
CRC and Sort Order Check
As a superuser, you can run the Index tool on a Vertica database to perform two tasks:
l Run a per-block cyclic redundancy check (CRC) on data storage to verify data
integrity.
l Check that the sort order in ROS containers is correct.
If the database is down, invoke the Index tool from the Linux command line. If the
database is up, invoke it as an SQL statement from vsql:
Run CRC:
l Database down: /opt/vertica/bin/vertica -D catalog-path -v
l Database up: select run_index_tool ('checkcrc'); or
select run_index_tool ('checkcrc', 'true');
Check sort order:
l Database down: /opt/vertica/bin/vertica -D catalog-path -I
l Database up: select run_index_tool ('checksort'); or
select run_index_tool ('checksort', 'true');
If you run the Index tool in vsql as an SQL statement, you can specify that it analyze all
cluster nodes by setting the optional Boolean parameter to true (1). If this parameter is
omitted, the Index tool runs only on the current node.
If invoked from the command line, the Index tool runs only on the current node.
However, the Index tool can run on multiple nodes simultaneously. Invoke the Index tool
binary from the /opt/vertica/bin directory.
Viewing Results
The Index tool writes summary information about its operation to standard output;
detailed information on results is logged in one of two locations, depending on the
environment where you invoke the tool:
l Invoked from the command-line: Results written to indextool.log file in the
database catalog directory.
l Invoked from vsql: Results written to vertica.log on the current node.
Privileges
Restricted to superusers.
Running a Cyclic Redundancy Check
The Index tool can run a cyclic redundancy check (CRC) on each block of existing data
storage to check the data integrity of ROS data blocks.
Running the Tool
You can invoke the Index tool from the command line or from vsql, depending on
whether the database is up or down:
l If the database is down:
Invoke the Index tool from the Linux command line. For example:
[dbadmin@localhost bin]$ /opt/vertica/bin/vertica -D /home/dbadmin/VMart/v_vmart_node0001_catalog -v
The Index tool writes summary information about its operation to standard output, and
logs detailed information in indextool.log in the database catalog directory.
l If the database is up:
Invoke the Index tool as an SQL statement from vsql with the argument checkcrc.
To run the index tool on all nodes, also set the tool's optional Boolean parameter to
true. If this parameter is omitted, the Index tool runs only on the current node. For
example, the following SQL statement runs a CRC on all cluster nodes:
select run_index_tool ('checkcrc', 'true');
The Index tool writes summary information about its operation to standard output, and
logs detailed information in vertica.log on the current node.
Handling CRC Errors
Vertica evaluates the CRC values in each ROS data block each time it fetches data from disk
to process a query. If CRC errors occur while fetching data, the following information is
written to the vertica.log file:
CRC Check Failure Details:
File Name:
File Offset:
Compressed size in file:
Memory Address of Read Buffer:
Pointer to Compressed Data:
Memory Contents:
The Event Manager is also notified of CRC errors, so you can use an SNMP trap to
capture CRC errors:
"CRC mismatch detected on file <file_path>. File may be corrupted. Please check hardware and
drivers."
If you run a query from vsql, ODBC, or JDBC, the query returns a FileColumnReader
ERROR. This message indicates that a specific block's CRC does not match a given
record as follows:
hint: Data file may be corrupt. Ensure that all hardware (disk and memory) is working properly.
Possible solutions are to delete the file <pathname> while the node is down, and then allow the
node to recover, or truncate the table data.
code: ERRCODE_DATA_CORRUPTED
Checking Sort Order
If ROS data is not sorted correctly in the projection's order, query results that rely on
sorted data will be incorrect. You can use the Index tool to check the ROS sort order if
you suspect or detect incorrect query results. The Index tool evaluates each ROS row to
determine whether it is sorted correctly. If the check locates a row that is not in order, it
writes an error message to the log file with the row number and contents of the unsorted
row.
Running the Tool
You can invoke the Index tool from the command line or from vsql, depending on
whether the database is up or down:
l If the database is down:
Invoke the Index tool from the Linux command line. For example:
$ /opt/vertica/bin/vertica -D /home/dbadmin/VMart/v_vmart_node0001_catalog -I
The Index tool writes summary information about its operation to standard output, and
logs detailed information in indextool.log in the database catalog directory.
l If the database is up:
Invoke the Index tool from vsql as an SQL statement with the argument checksort. To
run the index tool on all nodes, also set the tool's optional Boolean parameter to
true. If this parameter is omitted, the Index tool runs only on the current node.
For example, the following SQL statement runs a sort order check on all cluster nodes:
select run_index_tool ('checksort', 'true');
The Index tool writes summary information about its operation to standard output, and
logs detailed information in vertica.log on the current node.
Reviewing Errors
1. Open the indextool.log file. For example:
$ cd VMart/v_check_node0001_catalog
2. Look for error messages that include an OID number and the string Sort Order
Violation. For example:
<INFO> ...on oid 45035996273723545: Sort Order Violation:
3. Find detailed information about the sort order violation string by running grep on
indextool.log. For example, the following command returns the line before each
string (-B1), and the four lines that follow (-A4):
[15:07:55][vertica-s1]: grep -B1 -A4 'Sort Order Violation:' /my_host/databases/check/v_check_
node0001_catalog/indextool.log
2012-06-14 14:07:13.686 unknown:0x7fe1da7a1950 [EE] <INFO> An error occurred when running
index tool thread on oid 45035996273723537:
Sort Order Violation:
Row Position: 624
Column Index: 0
Last Row: 2576000
This Row: 2575000
--
2012-06-14 14:07:13.687 unknown:0x7fe1dafa2950 [EE] <INFO> An error occurred when running
index tool thread on oid 45035996273723545:
Sort Order Violation:
Row Position: 3
Column Index: 0
Last Row: 4
This Row: 2
--
4. Find the projection where a sort order violation occurred by querying the storage_
containers system table. Use a storage_oid equal to the OID value listed in
indextool.log. For example:
=> select * from storage_containers where storage_oid = 45035996273723545;
Managing Tables
You can create two types of tables in Vertica, columnar and flexible. Additionally, you
can create either type as persistent or temporary. You can also create views to capture a
specific set of table columns that you query frequently.
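For example, a view that captures frequently queried columns might look like the
following (a minimal sketch; it uses the vendor_dimension table defined in the next
section):
=> CREATE VIEW vendor_summary AS
SELECT vendor_name, vendor_state, deal_size FROM vendor_dimension;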
Creating Base Tables
The CREATE TABLE statement creates a table in the Vertica logical schema. The
example database described in the Getting Started includes sample SQL scripts that
demonstrate this procedure. For example:
CREATE TABLE vendor_dimension (
vendor_key INTEGER NOT NULL PRIMARY KEY,
vendor_name VARCHAR(64),
vendor_address VARCHAR(64),
vendor_city VARCHAR(64),
vendor_state CHAR(2),
vendor_region VARCHAR(32),
deal_size INTEGER,
last_deal_update DATE
);
Note: Each table can have a maximum of 1600 columns.
Creating Tables Using /*+direct*/ Hint
You can use the /*+direct*/ hint to create a table or temporary table. This specifies to
bypass memory (WOS) and save the table directly to disk (ROS). For example, you can
create a table from the table states:
=> select * from states;
State | Bird | Tree | Tax
-------+----------+-------+-----
MA | Robin | Maple | 5.7
NH | Thrush | Elm | 0
NY | Cardinal | Oak | 7.2
(3 rows)
The following CREATE statement includes the /*+direct*/ hint, which must
immediately follow the AS directive:
=> CREATE TABLE StateBird AS /*+direct*/ SELECT State, Bird FROM states;
CREATE TABLE
=> select * from StateBird;
State | Bird
-------+----------
MA | Robin
NH | Thrush
NY | Cardinal
(3 rows)
The following example creates a temporary table with the /*+direct*/ clause. The ON
COMMIT PRESERVE ROWS directive specifies to include all row data in the temporary
table:
=> CREATE TEMP TABLE StateTax ON COMMIT PRESERVE ROWS AS /*+direct*/ SELECT State, Tax FROM states;
CREATE TABLE
=> select * from StateTax;
State | Tax
-------+-----
MA | 5.7
NH | 0
NY | 7.2
(3 rows)
Automatic Projection Creation
To get your database up and running quickly, Vertica automatically creates a default
projection for each table created through the CREATE TABLE and CREATE
TEMPORARY TABLE statements. Each projection created automatically (or manually)
includes a base projection name prefix. You must use the projection prefix when altering
or dropping a projection (ALTER PROJECTION RENAME, DROP PROJECTION).
How you use CREATE TABLE determines when the projection is created:
l If you create a table without providing projection-related clauses, Vertica
automatically creates a superprojection for the table when you load data into the
table for the first time with INSERT or COPY. The projection is created in the same
schema as the table. After Vertica creates the projection, it loads the data.
l If you use CREATE TABLE to create a table from the results of a query
(CREATE TABLE AS SELECT), the projection is created immediately after the table,
and uses some of the properties of the underlying SELECT query.
l (Advanced users only) If CREATE TABLE includes any of the following parameters,
the default projection is created immediately on table creation using the specified
properties:
n column-definition (ENCODING encoding-type and ACCESSRANK integer)
n ORDER BY table-column
n hash-segmentation-clause
n UNSEGMENTED { NODE node | ALL NODES }
n KSAFE
Note: Before you define a superprojection as described above, see Creating
Custom Designs in the Administrator's Guide.
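For example, you can look up the auto-projection's name before you alter or drop it (a
minimal sketch; the table name is illustrative):
=> CREATE TABLE t1 (a INT, b VARCHAR(10));
=> INSERT INTO t1 VALUES (1, 'x'); -- the first load creates the superprojection
=> SELECT projection_name FROM v_catalog.projections
WHERE anchor_table_name = 't1';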
Characteristics of Default Automatic Projections
A default auto-projection has the following characteristics:
l It is a superprojection.
l It uses the default Encoding-Type AUTO.
l If created as a result of a CREATE TABLE AS SELECT statement, uses the
encoding specified in the query table.
l Auto-projections use hash segmentation.
l The number of table columns used in the segmentation expression can be
configured, using the MaxAutoSegColumns configuration parameter. See General
Parameters in the Administrator's Guide. Columns are segmented in this order:
n Short (<8 bytes) data type columns first
n Larger (> 8 byte) data type columns
n Up to 32 columns (default for MaxAutoSegColumns configuration parameter)
n If segmenting more than 32 columns, Vertica uses a nested hash function
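For example, to limit auto-segmentation to the first eight columns (the value shown is
illustrative), you might set:
=> ALTER DATABASE mydb SET MaxAutoSegColumns = 8;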
Auto-projections are defined by the table properties and creation methods, as follows:
l Created from input stream (COPY or INSERT INTO). Sort order: same as input
stream, if sorted. Segmented on: PK column (if any), all FK columns (if any), and
the first 31 configurable columns of the table.
l Created from CREATE TABLE AS SELECT query. Sort order: same as input stream, if
sorted. Segmented on: the same segmentation columns if query output is segmented;
same as load if query output is unsegmented or unknown.
l FK and PK constraints. Sort order: FK columns first, then PK columns. Segmented
on: PK columns.
l FK constraints only. Sort order: FK columns first, then remaining columns.
Segmented on: small data type (< 8 byte) columns first, then large data type
columns.
l PK constraints only. Sort order: PK columns. Segmented on: PK columns.
l No FK or PK constraints. Sort order: on all columns. Segmented on: small data
type (< 8 byte) columns first, then large data type columns.
Default automatic projections and segmentation get your database up and running
quickly. HPE recommends that you start with these projections and then use the
Database Designer to optimize your database further. The Database Designer creates
projections that optimize your database based on the characteristics of the data and,
optionally, the queries you use.
See Also
l Creating External Tables
l CREATE TABLE
Creating a Table Like Another
You can create a table from an existing one using the CREATE TABLE statement with
the LIKE clause. Creating a table with the LIKE option replicates the source table
definition and any storage policy associated with it. CREATE TABLE LIKE copies all
table constraints except foreign key constraints. It does not copy table data or
expressions on columns and constraints.
You can qualify the LIKE clause with the INCLUDING PROJECTIONS clause. This
creates a table and replicates all projections from the source table, excluding pre-join
projections. Vertica follows the same naming conventions as auto projections, while
avoiding name conflicts with existing objects.
Restrictions
The following restrictions apply to the source table:
l It cannot have out-of-date projections.
l It cannot be a temporary table.
Storage Location and Policies for New Tables
When you use the CREATE TABLE...LIKE statement, any storage policy objects
associated with the table are also copied. Data added to the new table will use the
same labeled storage location as the source table, unless you change the storage
policy. For more information, see Working With Storage Locations.
Example
Create the table states:
VMART=> create table states (
state char(2) not null, bird varchar(20), tree char(20), tax float, stateDate date)
partition by state;
CREATE TABLE
Populate the table with some data:
insert into states values ('MA', 'chickadee', 'american_elm', 5.675, '07-04-1620');
insert into states values ('VT', 'Hermit_Thrasher', 'Sugar_Maple', 6.0, '07-04-1610');
.
.
.
Select the table to see its contents:
VMART=> select * from states;
State | bird | tree | tax | stateDate
-------+---------------------+----------------------+-------+------------
MA | chickadee | american_elm | 5.675 | 1620-07-04
NH | Purple_Finch | White_Birch | 0 | 1615-07-04
VT | Hermit_Thrasher | Sugar_maple | 6 | 1618-07-04
ME | Black_Cap_Chickadee | Pine_Tree | 5 | 1615-07-04
CT | American_Robin | White_Oak | 6.35 | 1618-07-04
RI | Rhode_Island_Red | Red_Maple | 5 | 1619-07-04
(6 rows)
View the projections for this table:
VMART=> \dj
List of projections
Schema | Name | Owner | Node | Comment
--------+-------------------+---------+------------------+---------
.
.
.
public | states_b0 | dbadmin | |
public | states_b1 | dbadmin | |
public | states_p | dbadmin | v_vmart_node0001 |
public | states_p | dbadmin | v_vmart_node0002 |
public | states_p | dbadmin | v_vmart_node0003 |
Now, create a table like the states table, including projections:
VMART=> create table newstates like states including projections;
CREATE TABLE
VMART=> select * from newstates;
State | bird | tree | tax | stateDate
-------+------+------+-----+-----------
(0 rows)
See Also
l Creating Base Tables
l Creating Temporary Tables
l Creating External Tables
l Archiving Partitions
l CREATE TABLE
Creating Temporary Tables
You create temporary tables with CREATE TEMPORARY TABLE, specifying the table as
either local or global. You cannot create temporary external tables.
Temporary tables can be used to divide complex query processing into multiple steps.
Typically, a reporting tool holds intermediate results while reports are generated—for
example, the tool first gets a result set, then queries the result set, and so on. You can
also write Subqueries.
Note: By default, all temporary table data is discarded when a COMMIT statement
ends the current transaction. If CREATE TEMPORARY TABLE includes the parameter
ON COMMIT PRESERVE ROWS, table data is retained until the current session ends.
Global Temporary Tables
Vertica creates global temporary tables in the public schema, with the data contents
private to the transaction or session through which data is inserted.
Global temporary table definitions are accessible to all users and sessions, so that two
(or more) users can access the same global table concurrently. However, whenever a
user commits or rolls back a transaction, or ends the session, Vertica removes the
global temporary table data automatically, so users see only data specific to their own
transactions or session.
Global temporary table definitions persist in the database catalogs until they are
removed explicitly through a DROP TABLE statement.
Local Temporary Tables
Local temporary tables are created in the V_TEMP_SCHEMA namespace and inserted into
the user's search path transparently. Each local temporary table is visible only to the
user who creates it, and only for the duration of the session in which the table is created.
When the session ends, Vertica automatically drops the table definition from the
database catalogs. You cannot preserve non-empty, session-scoped temporary tables
using the ON COMMIT PRESERVE ROWS clause.
Creating local temporary tables is significantly faster than creating regular tables, so you
should make use of them whenever possible.
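For example, a minimal local temporary table (table and column names are illustrative):
=> CREATE LOCAL TEMP TABLE session_results (id INT, score FLOAT)
ON COMMIT DELETE ROWS;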
Note: You cannot add projections to non-empty, session-scoped temporary tables if
you specify ON COMMIT PRESERVE ROWS. Be sure that projections exist before
you load data, as described in the section Automatic Projection Creation in
CREATE TABLE. Also, while you can add projections for tables created with the ON
COMMIT DELETE ROWS option, be aware that you could save the projection but
still lose all the data.
Creating Temporary Tables Using /*+direct*/ Hints
You can use the /*+direct*/ hint to create a table or temporary table. This specifies to
bypass memory (WOS) and save the table directly to disk (ROS). For example, you can
create a table from the table states:
=> select * from states;
State | Bird | Tree | Tax
-------+----------+-------+-----
MA | Robin | Maple | 5.7
NH | Thrush | Elm | 0
NY | Cardinal | Oak | 7.2
(3 rows)
The following CREATE statement includes the /*+direct*/ hint, which must
immediately follow the AS directive:
=> CREATE TABLE StateBird AS /*+direct*/ SELECT State, Bird FROM states;
CREATE TABLE
=> select * from StateBird;
State | Bird
-------+----------
MA | Robin
NH | Thrush
NY | Cardinal
(3 rows)
The following example creates a temporary table with the /*+direct*/ clause. The ON
COMMIT PRESERVE ROWS directive specifies to include all row data in the temporary
table:
=> CREATE TEMP TABLE StateTax ON COMMIT PRESERVE ROWS AS /*+direct*/ SELECT State, Tax FROM states;
CREATE TABLE
=> select * from StateTax;
State | Tax
-------+-----
MA | 5.7
NH | 0
NY | 7.2
(3 rows)
Characteristics of Default Automatic Projections
Vertica creates auto-projections for temporary tables when you load or insert data. The
default auto-projection for a temporary table has the following characteristics:
l It is a superprojection.
l It uses the default encoding-type AUTO.
l It is automatically segmented on the table's first several columns.
l Unless the table specifies otherwise, the projection's KSAFE value is set at the current
system K-safety level.
Auto-projections are defined by the table properties and creation methods, as follows:
l Created from input stream (COPY or INSERT INTO). Sort order: same as input
stream, if sorted. Segmented on: PK column (if any), all FK columns (if any), and
the first 31 configurable columns of the table.
l Created from CREATE TABLE AS SELECT query. Sort order: same as input stream, if
sorted. Segmented on: the same segmentation columns if query output is segmented;
same as load if query output is unsegmented or unknown.
l FK and PK constraints. Sort order: FK columns first, then PK columns. Segmented
on: PK columns.
l FK constraints only. Sort order: FK columns first, then remaining columns.
Segmented on: small data type (< 8 byte) columns first, then large data type
columns.
l PK constraints only. Sort order: PK columns. Segmented on: PK columns.
l No FK or PK constraints. Sort order: on all columns. Segmented on: small data
type (< 8 byte) columns first, then large data type columns.
As an advanced user, you can modify the default projection created through the CREATE
TEMPORARY TABLE statement by setting one or more of the following parameters:
l column-definition (temp table) (ENCODING encoding-type and ACCESSRANK integer)
l ORDER BY table-column
l Hash-Segmentation-Clause
l UNSEGMENTED { NODE node | ALL NODES }
l NO PROJECTION
Note: Before you define the superprojection in this manner, read Creating Custom
Designs in the Administrator's Guide.
Preserving GLOBAL Temporary Table Data for a Transaction or
Session
You can preserve session-scoped rows in a GLOBAL temporary table for the entire
session or for the current transaction only.
To preserve a temporary table for the transaction, use the ON COMMIT DELETE ROWS
clause:
=> CREATE GLOBAL TEMP TABLE temp_table1 (x NUMERIC, y NUMERIC )
ON COMMIT DELETE ROWS;
To preserve temporary table data until the end of the session, use the ON COMMIT
PRESERVE ROWS clause:
=> CREATE GLOBAL TEMP TABLE temp_table2 (x NUMERIC, y NUMERIC )
ON COMMIT PRESERVE ROWS;
Specifying Column Encoding
You can specify the encoding type to use per column.
The following example specifies that the superprojection created for the temp table use
RLE encoding for the y column:
=> CREATE LOCAL TEMP TABLE temp_table1 (x NUMERIC, y NUMERIC ENCODING RLE )
ON COMMIT DELETE ROWS;
The following example specifies that the superprojection created for the temp table use
the sort order specified by the ORDER BY clause, rather than the order of columns in the
column list.
=> CREATE GLOBAL TEMP TABLE temp_table1 (
x NUMERIC,
y NUMERIC ENCODING RLE,
b VARCHAR(8),
z VARCHAR(8) )
ORDER BY z, x;
See Also
l CREATE TEMPORARY TABLE
l CREATE TABLE
l TRUNCATE TABLE
l DELETE
l ANALYZE_STATISTICS
Creating External Tables
You create an external table using the CREATE EXTERNAL TABLE AS COPY statement.
You cannot create temporary external tables. For the syntax details to create an external
table, see the CREATE EXTERNAL TABLE statement in the SQL Reference Manual.
Note: Each table can have a maximum of 1600 columns.
Required Permissions for External Tables
You must be a database superuser to create external tables.
Permission requirements to use (SELECT from) external tables differ from those of other
tables. By default, once external tables exist, you must also be a database superuser to
access them through a SELECT statement.
To allow users without superuser access to query external tables, an administrator must
create a 'user' storage location and grant those users read access to the location. See
CREATE LOCATION, and GRANT (Storage Location). This location must be a parent
of the path used in the COPY statement when creating the external table.
COPY Statement Definition
When you create an external table, table data is not added to the database, and no
projections are created. Instead, Vertica performs a syntactic check of the CREATE
EXTERNAL TABLE... statement, and stores the table name and COPY statement
definition in the catalog. When a SELECT query references an external table, Vertica
parses and executes the stored COPY statement to obtain the referenced data.
Successfully returning data from an external table requires that the COPY definition be
correct, and that other dependencies, such as files, nodes, and other resources are
accessible and available at query-time.
If the maximum length of a column is smaller than the actual data, such as a
VARCHAR that is too short, Vertica truncates the data and logs the event.
When using the COPY ON ANY NODE parameter, confirm that the source file definition is
identical on all nodes. Specifying different external files can produce inconsistent
results.
For more information about checking the validity of the external table COPY definition,
see Validating External Tables.
NOT NULL Constraints
Do not specify a NOT NULL column constraint, unless you are certain that the external
data does not contain NULL values. Otherwise, you may see unexpected query
results. For example, a SELECT statement for an external table with a
NOT NULL constraint rejects rows in which the constrained column value is NULL.
Canceling the Create Query
Canceling a CREATE EXTERNAL TABLE AS COPY statement can cause
unpredictable results. If you enter a query to create an external table, and it is incorrect
(for example, you inadvertently specify the wrong external location), wait for the query to
complete. When the external table exists, use DROP TABLE to remove its definition.
Developing User-Defined Load (UDL) Functions for External
Tables
You can create external tables with your own load functions. For more information about
developing user-defined load functions, see User Defined Load (UDL) and the extended
COPY syntax in the SQL Reference Manual.
Examples
Examples of external table definitions:
CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat' DELIMITER ',';
CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat.bz2' BZIP DELIMITER ',';
CREATE EXTERNAL TABLE ext1 (x integer, y integer) AS COPY (x as '5', y) FROM '/tmp/ext1.dat.bz2'
BZIP DELIMITER ',';
To allow users without superuser access to use these tables, create a location for 'user'
usage and grant access to it. This example shows granting access to a user named Bob
to any external table whose data is located under /tmp (including in subdirectories to
any depth):
CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
GRANT ALL ON LOCATION '/tmp' to Bob;
See Also
l COPY
l CREATE EXTERNAL TABLE AS COPY
Validating External Tables
When you create an external table, Vertica validates the syntax of the CREATE
EXTERNAL TABLE AS COPY FROM statement. For example, if you omit a required
keyword in the statement (such as FROM), creating the external table fails:
VMart=> create external table ext (ts timestamp,d varchar) as copy '/home/dbadmin/designer.log';
ERROR 2778: COPY requires a data source; either a FROM clause or a WITH SOURCE for a user-defined
source
Checking other components of the COPY definition (such as path statements and node
availability) does not occur until a SELECT query references the external table.
To validate an external table definition, run a SELECT query that references the
external table. Check that the returned query data is what you expect. If the query does
not return data correctly, check the COPY exception and rejected data log files.
Since the COPY definition determines what occurs when you query an external table,
COPY statement errors can reveal underlying problems. For more information about
COPY exceptions and rejections, see Capturing Load Rejections and Exceptions.
Setting Maximum Exceptions
Querying external table data with an incorrect COPY FROM statement definition can
potentially result in many rejected rows. To limit the number of rejections, Vertica sets
the maximum number of retained rejections with the
ExternalTablesExceptionsLimit configuration parameter. The default value is 100.
Setting the ExternalTablesExceptionsLimit to –1 removes the limit, but is not
recommended.
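For example, to raise the limit for a database (the value shown is illustrative):
=> ALTER DATABASE mydb SET ExternalTablesExceptionsLimit = 500;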
If COPY errors reach the maximum number of rejections, the external table query
continues, but COPY generates a warning in the vertica.log, and does not report
subsequent rejected rows.
Note: Using the ExternalTablesExceptionsLimit configuration parameter
differs from the COPY statement REJECTMAX parameter. The REJECTMAX value
controls how many rejected rows to permit before causing the load to fail. If COPY
encounters a number of rejected rows equal to or greater than REJECTMAX, COPY
aborts execution. A vertica.log warning is not generated if COPY exceeds
REJECTMAX.
Working with External Tables
After creating external tables, you access them as any other table. However, you cannot
perform UPDATE, INSERT, or DELETE operations on external tables.
Managing Resources for External Tables
External tables require minimal additional resources. When you use a select query for
an external table, Vertica uses a small amount of memory when reading external table
data, since the table contents are not part of your database and are parsed each time
the external table is used.
Vertica Does Not Back Up External Tables
Since the data in external tables is managed outside of Vertica, only the external table
definitions, not the data files, are included in database backups. Arrange for a separate
backup process for your external table data.
Using Sequences and Identity Columns in External Tables
The COPY statement definition for external tables can include identity columns and
sequences. Whenever a select statement queries the external table, sequences and
identity columns are re-evaluated. This results in changing the external table column
values, even if the underlying external table data remains the same.
Viewing External Table Definitions
When you create an external table, Vertica stores the COPY definition statement in the
table_definition column of the v_catalog.tables system table.
1. To list all external tables (tables with a stored COPY definition), use a query as shown:
select * from v_catalog.tables where table_definition <> '';
2. Use a query such as the following to list the external table definitions (table_
definition):
select table_name, table_definition from v_catalog.tables;
table_name | table_definition
------------+----------------------------------------------------------------------
t1 | COPY FROM 'TMPDIR/external_table.dat' DELIMITER ','
t1_copy | COPY FROM 'TMPDIR/external_table.dat' DELIMITER ','
t2 | COPY FROM 'TMPDIR/external_table2.dat' DELIMITER ','
(3 rows)
External Table DML Support
Following are examples of supported queries, and others that are not:
Supported:
l SELECT * FROM external_table;
l SELECT * FROM external_table where col1=4;
l DELETE FROM internal_table WHERE id IN (SELECT x FROM external_table);
l INSERT INTO internal_table SELECT * FROM external_table;
Unsupported:
l DELETE FROM external_table WHERE x = 5;
l INSERT INTO external_table SELECT * FROM ext;
l SELECT * FROM external_table FOR UPDATE;
Using External Table Values
Following is a basic example of how you could use the values of an external table.
1. Create and display the contents of a file with some integer values:
[dbadmin@localhost ~]$ more ext.dat
1
2
3
4
5
6
7
8
10
11
12
2. Create an external table pointing at ext.dat:
VMart=> create external table ext (x integer) as copy from '/home/dbadmin/ext.dat';
CREATE TABLE
3. Select the table contents:
VMart=> select * from ext;
x
----
1
2
3
4
5
6
7
8
10
11
12
(11 rows)
4. Perform evaluation on some external table contents:
VMart=> select ext.x, ext.x + ext.x as double_x from ext where x > 5;
x | double_x
----+----------
6 | 12
7 | 14
8 | 16
10 | 20
11 | 22
12 | 24
(6 rows)
5. Create a second table (second), also with integer values:
VMart=> create table second (y integer);
CREATE TABLE
6. Populate the table with some values:
VMart=> copy second from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1
>> 1
>> 3
>> 4
>> 5
>> .
7. Join the external table (ext) with the table created in Vertica, called second:
VMart=> select * from ext join second on x=y;
x | y
---+---
1 | 1
1 | 1
3 | 3
4 | 4
5 | 5
(5 rows)
Using External Tables
External tables let you query data stored in files accessible to the Vertica database, but
not managed by it. Creating external tables supplies read-only access through SELECT
queries. You cannot modify external tables through DML commands, such as INSERT,
UPDATE, DELETE, and MERGE.
Using CREATE EXTERNAL TABLE AS COPY Statement
You create external tables with the CREATE EXTERNAL TABLE AS COPY... statement,
shown in this basic example:
CREATE EXTERNAL TABLE tbl(i INT) AS COPY (i) FROM 'path1' ON node1, 'path2' ON node2;
For more details on the supported options to create an external table, see the CREATE
EXTERNAL TABLE statement in the SQL Reference Manual.
The data you specify in the FROM clause of a CREATE EXTERNAL TABLE AS COPY
statement can reside in one or more files or directories, and on one or more nodes. After
successfully creating an external table, Vertica stores the table name and its COPY
definition. Each time a SELECT query references the external table, Vertica parses the
COPY statement definition again to access the data. Here is a sample SELECT statement:
SELECT * FROM tbl WHERE i > 10;
Storing Vertica Data in External Tables
There are many reasons to use external table data; one is to store infrequently
accessed Vertica data on low-cost external media. To do so, export the older data to a
text file, compress the file with bzip or gzip, and save the compressed file on an NFS
disk. You can then create an external table to access the data whenever it is required.
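A minimal sketch of this workflow follows (the file names and paths are hypothetical); the COPY definition can read the compressed file directly with the GZIP option:
=> CREATE EXTERNAL TABLE archived_sales (order_id INT, amount NUMERIC(10,2))
   AS COPY FROM '/nfs_name/archive/sales_2010.dat.gz' GZIP DELIMITER ',';
=> SELECT COUNT(*) FROM archived_sales;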
Calculating Exact Row Count for External Tables
To calculate the exact number of rows in an external table, use ANALYZE_
EXTERNAL_ROW_COUNT. The Optimizer uses this count to optimize for queries that
access external tables.
In particular, if an external table participates in a join, the Optimizer can identify the
smaller table to use as the inner input to the join, resulting in better query
performance.
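For example, assuming an external table named ext, you can compute the row count for that table, or pass an empty string to compute counts for all external tables:
=> SELECT ANALYZE_EXTERNAL_ROW_COUNT('ext');
=> SELECT ANALYZE_EXTERNAL_ROW_COUNT('');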
Using External Tables with User-Defined Load (UDL) Functions
You can also use external tables in conjunction with the UDL functions that you create.
For more information about using UDLs, see User Defined Load (UDL) in Extending
Vertica.
Organizing External Table Data
If the data you store in external tables changes regularly (for instance, each month in the
case of storing recent historical data), your COPY definition statement can use
wildcards to make parsing the stored COPY statement definition more dynamic. For
instance, if you store monthly data on an NFS mount, you could organize monthly files
within a top-level directory for a calendar year, such as:
/2012/monthly_archived_data/
In this case, the external table COPY statement will include a wildcard definition such
as the following:
CREATE EXTERNAL TABLE archive_data (...) AS COPY FROM '/nfs_name/2012/monthly_archived_data/*'
Whenever a query references the external table and Vertica parses its COPY
statement, all data files in the top-level monthly_archived_data
directory are accessible to the query.
Managing Table Columns
After you define a table, you can use ALTER TABLE to modify existing table columns.
You can perform the following operations on a column:
l Rename it.
l Change its data type.
l Set its default value.
l Add and remove constraints.
Renaming Columns
You rename a column with ALTER TABLE as follows:
ALTER TABLE [[db-name.]schema.]table-name RENAME [ COLUMN ] column-name TO new-column-name
The following example renames a column in the Retail.Product_Dimension table
from Product_description to Item_description:
=> ALTER TABLE Retail.Product_Dimension
RENAME COLUMN Product_description TO Item_description;
If you rename a column that is referenced by a view, the column does not appear in the
result set of the view even if the view uses the wild card (*) to represent all columns in
the table. Recreate the view to incorporate the column's new name.
Changing Column Data Type
You can change a table column's data type for any type whose conversion does not
require storage reorganization. For external tables this is any column, because data
from external tables is not stored in Vertica. For other tables there are additional
restrictions.
Changing Column Types in Tables
You can change a column's data type if doing so does not require storage
reorganization. After you modify a column's data type, data that you load conforms to the
new definition.
Vertica supports the following conversions:
l Binary types—expansion and contraction, but you cannot convert between BINARY and
VARBINARY types.
l Character types—all conversions are allowed, even between CHAR and VARCHAR.
l Exact numeric types—INTEGER, INT, BIGINT, TINYINT, INT8, SMALLINT, and all
NUMERIC values of precision <= 18 and scale 0 are interchangeable. For NUMERIC
data types, you cannot alter the scale, but you can change the precision within the
ranges (0-18), (19-37), and so on.
Vertica does not allow data type conversion on types that require storage
reorganization:
l Boolean type conversion to other types
l DATE/TIME type conversion
l Approximate numeric type conversions
l Between BINARY and VARBINARY types
You can expand (and shrink) columns within the same class of data type, which is
useful if you want to store longer strings in a column. Vertica validates the data before it
performs the conversion.
For example, if you try to convert a column from varchar(25) to varchar(10) and that
column holds a string with 20 characters, the conversion fails. Vertica allows the
conversion only if the column contains no strings longer than 10 characters.
Changing Column Types in External Tables
Because data from external tables is not stored in Vertica, you can change the type of a
column to any other type. No attempt is made to validate the change at the time it is
made. Definitions of external tables are consulted only when the data is read. If Vertica
cannot read the data as the declared type it reports the error as usual.
If you convert a column to a size that is too small for the data, Vertica truncates the data
during the read. For example, if you convert a column from varchar(25) to varchar(10)
and that column holds a string with 20 characters, Vertica reads the first ten and logs a
truncation event.
Examples
The following example expands an existing CHAR column from 5 to 10:
=> CREATE TABLE t (x CHAR, y CHAR(5));
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE CHAR(10);
=> DROP TABLE t;
This example illustrates the behavior of a changed column's type. First you set column
y's type to VARCHAR(5), and then insert one string of exactly 5 characters and another
that exceeds 5:
=> CREATE TABLE t (x VARCHAR, y VARCHAR);
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE VARCHAR(5);
=> INSERT INTO t VALUES ('1232344455','hello');
OUTPUT
--------
1
(1 row)
=> INSERT INTO t VALUES ('1232344455','hello1');
ERROR 4797: String of 6 octets is too long for type Varchar(5)
=> DROP TABLE t;
You can also contract the data type's size, as long as that altered column contains no
strings greater than 5:
=> CREATE TABLE t (x CHAR, y CHAR(10));
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE CHAR(5);
=> DROP TABLE t;
You cannot convert types between binary and varbinary. For example, the table
definition below contains two binary columns, so when you try to convert column y to a
varbinary type, Vertica returns a ROLLBACK message:
=> CREATE TABLE t (x BINARY, y BINARY);
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE VARBINARY;
ROLLBACK 2377: Cannot convert column "y" from "binary(1)" to type "varbinary(80)"
=> DROP TABLE t;
For external tables you can change a column to any type, including converting from
binary to varbinary:
=> CREATE EXTERNAL TABLE t (a char(10), b binary(20)) AS COPY FROM '...';
=> ALTER TABLE t ALTER COLUMN a SET DATA TYPE LONG VARCHAR(1000000);
=> ALTER TABLE t ALTER COLUMN b SET DATA TYPE LONG VARBINARY(1000000);
Changing the Data Type of a Column in a SEGMENTED BY Clause
If you create a table and do not create a superprojection for it, Vertica automatically
creates a superprojection when you first load data into the table. By default,
superprojections are segmented by all columns to make all data available for queries.
You cannot alter a column used in the superprojection's segmentation clause.
To resize a segmented column, either create a new superprojection that omits the
column from its segmentation clause, or create a new table (with the new column size)
and new projections, as shown in the sketch below.
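The following is a minimal sketch of the first approach, assuming a table t whose superprojection t_super is segmented by columns a and b, and you want to resize column b (all object names here are hypothetical):
=> CREATE PROJECTION t_new AS SELECT * FROM t SEGMENTED BY HASH(a) ALL NODES;
=> SELECT START_REFRESH();   -- refresh the new projection so the old one is safe to drop
=> SELECT MAKE_AHM_NOW();    -- advance the AHM past the old projection's history
=> DROP PROJECTION t_super;  -- the superprojection segmented by (a,b)
=> ALTER TABLE t ALTER COLUMN b SET DATA TYPE VARCHAR(100);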
Illegitimate Column Conversions
The SQL standard disallows some column conversions, but you can work
around this restriction if you need to convert data from a non-SQL database. The
following example takes you through the process step by step, where you manage
your own epochs.
Given a sales table with columns id (INT) and price (VARCHAR), assume you want
to convert the VARCHAR column to a NUMERIC field. You do this by adding a
temporary column whose default value is derived from the existing price column,
dropping the original column, and then renaming the temporary column.
1. Create the sample table with INTEGER and VARCHAR columns and insert two
rows.
=> CREATE TABLE sales(id INT, price VARCHAR) UNSEGMENTED ALL NODES;
=> INSERT INTO sales VALUES (1, '$50.00');
=> INSERT INTO sales VALUES (2, '$100.00');
2. Commit the transaction:
=> COMMIT;
3. Query the sales table:
=> SELECT * FROM SALES;
id | price
----+---------
1 | $50.00
2 | $100.00
(2 rows)
4. Add column temp_price. This is your temporary column.
=> ALTER TABLE sales ADD COLUMN temp_price NUMERIC DEFAULT SUBSTR(sales.price, 2)::NUMERIC;
ALTER TABLE
5. Query the sales table, and you'll see the new temp_price column with its derived
NUMERIC values:
=> SELECT * FROM SALES;
id | price | temp_price
----+---------+---------------------
1 | $50.00 | 50.000000000000000
2 | $100.00 | 100.000000000000000
(2 rows)
6. Drop the default expression from that column.
=> ALTER TABLE sales ALTER COLUMN temp_price DROP DEFAULT;
ALTER TABLE
7. Advance the AHM:
=> SELECT advance_epoch(1);
advance_epoch
---------------
New Epoch: 83
8. Manage epochs:
=> SELECT manage_epoch();
manage_epoch
--------------------------------
Current Epoch=83, AHM Epoch=82
9. Drop the original price column.
=> ALTER TABLE sales DROP COLUMN price CASCADE;
ALTER TABLE
10. Rename the new (temporary) temp_price column back to its original name, price:
=> ALTER TABLE sales RENAME COLUMN temp_price to price;
11. Query the sales table one last time:
=> SELECT * FROM SALES;
id | price
----+---------------------
1 | 50.000000000000000
2 | 100.000000000000000
(2 rows)
12. Clean up (drop table sales):
=> DROP TABLE sales;
See ALTER TABLE in the SQL Reference Manual.
Setting Column Defaults
A default value can be set for a column of any data type. Vertica computes the default
value for each row. This flexibility is useful when you add a column to a large fact table
to provide another view of the data without having to INSERT...SELECT a large data set.
You can also alter unstructured tables to use a derived expression as described in
Altering Unstructured Tables.
A column's default value can be any expression that resolves to the column's data type. The
expression can reference another column in the same table, or be calculated with a
user-defined function (see Types of UDxs in Extending Vertica).
Caution: Vertica attempts to check the validity of default expressions, but some
errors might not be caught until run time.
Specifying Default Expressions
Default value expressions can specify:
l Other columns of the same table
l Constants
l SQL functions
l Null-handling functions
l User-defined scalar functions
l System information functions
l String functions
l Numeric functions
l Formatting functions
l Nested functions
l All Vertica-supported operators
A default expression in an ADD COLUMN statement cannot contain nested queries or
aggregate functions. Instead, add the column first, and then set its default with the ALTER COLUMN clause.
Expression Restrictions
The following restrictions apply to column default expressions:
l Only constant arguments are allowed.
l Subqueries and cross-references to other columns in the table are not supported.
l A default expression cannot return NULL.
l The return value data type matches or can be cast to the column data type.
l Default expressions, when evaluated, conform to the bounds for the column.
l A default expression cannot derive data from another derived column. If one column
has a default derived value expression, another column cannot specify a default that
references the first column.
Volatile Functions as Column Default
You can specify a volatile function as a column default expression using the ALTER
TABLE clause ALTER COLUMN. For example:
ALTER TABLE t ALTER COLUMN a2 SET DEFAULT my_sequence.nextval;
You cannot use a volatile function in the following two scenarios. Attempting to do so
causes a rollback.
l As the default expression for an ALTER TABLE ADD COLUMN statement. For example:
ALTER TABLE t ADD COLUMN a2 INT DEFAULT my_sequence.nextval;
ROLLBACK: VOLATILE functions are not supported in a default expression
ALTER TABLE t ADD COLUMN n2 INT DEFAULT my_sequence.currval;
ROLLBACK: VOLATILE functions are not supported in a default expression
ALTER TABLE t ADD COLUMN c2 INT DEFAULT RANDOM() + 1;
ROLLBACK: VOLATILE functions are not supported in a default expression
l As the default expression in an ADD COLUMN or ALTER COLUMN statement on an
external table. For example:
ALTER TABLE t ADD COLUMN a2 FLOAT DEFAULT RANDOM();
ROLLBACK 5241: Unsupported access to external table
ALTER TABLE t ALTER COLUMN x SET DEFAULT RANDOM();
ROLLBACK 5241: Unsupported access to external table
Examples
Default Column Value Derived From Another Column
1. Create a sample table called t with timestamp, integer and varchar(10)
columns:
=> CREATE TABLE t (a TIMESTAMP, b INT, c VARCHAR(10));
CREATE TABLE
=> INSERT INTO t VALUES ('2012-05-14 10:39:25', 2, 'MA');
OUTPUT
--------
1
(1 row)
2. Use ALTER TABLE to add a fourth column that extracts the month from the
timestamp value in column a:
=> ALTER TABLE t ADD COLUMN d INTEGER DEFAULT EXTRACT(MONTH FROM a);
ALTER TABLE
3. Query table t:
=> select * from t;
a | b | c | d
---------------------+---+----+---
2012-05-14 10:39:25 | 2 | MA | 5
(1 row)
Column d returns integer 5 (month of May).
4. View the table to see the new column (d) and its default derived value.
=> \d t
List of Fields by Tables
 Schema | Table | Column |    Type     | Size |         Default         | Not Null | Primary Key | Foreign Key
--------+-------+--------+-------------+------+-------------------------+----------+-------------+-------------
 public | t     | a      | timestamp   |    8 |                         | f        | f           |
 public | t     | b      | int         |    8 |                         | f        | f           |
 public | t     | c      | varchar(10) |   10 |                         | f        | f           |
 public | t     | d      | int         |    8 | date_part('month', t.a) | f        | f           |
(4 rows)
Default Column Value Derived From a UDSF
This example shows a user-defined scalar function that adds two integer values. The
function is called add2ints and takes two arguments.
1. Develop and deploy the function, as described in Developing User-Defined
Extensions (UDxs).
2. Create a sample table, t1, with two integer columns:
=> CREATE TABLE t1 ( x int, y int );
CREATE TABLE
3. Insert some values into t1:
=> insert into t1 values (1,2);
OUTPUT
--------
1
(1 row)
=> insert into t1 values (3,4);
OUTPUT
--------
1
(1 row)
4. Use ALTER TABLE to add a column to t1 with the default column value derived
from the UDSF, add2ints:
alter table t1 add column z int default add2ints(x,y);
ALTER TABLE
5. List the new column:
select z from t1;
z
----
3
7
(2 rows)
Altering Table Definitions
Using ALTER TABLE syntax, you can respond to your evolving database schema
requirements. The ability to change the definition of existing database objects facilitates
ongoing maintenance. Furthermore, most of these options are both fast and efficient for
large tables, because they consume fewer resources and less storage than having to
stage data in a temporary table.
ALTER TABLE lets you perform the following table-level changes:
l Add and drop table columns.
l Rename a table.
l Add and drop constraints.
l Alter key constraint enforcement.
l Move a table to a new schema.
l Change a table owner.
l Change and reorganize table partitions.
ALTER TABLE has an ALTER COLUMN clause that lets you modify column definitions—for
example, change their name or data type. For column-level changes, see Managing
Table Columns.
Exclusive ALTER TABLE Clauses
The following ALTER TABLE clauses are exclusive: you cannot combine them with
another ALTER TABLE clause:
l ADD COLUMN
l RENAME COLUMN
l SET SCHEMA
l PARTITION BY
l REORGANIZE
l REMOVE PARTITIONING
l RENAME [TO]
l OWNER TO
Note: You can use the ADD CONSTRAINT and DROP CONSTRAINT clauses together.
Using Consecutive ALTER TABLE Commands
Except when renaming tables, where a single statement can rename multiple tables,
complete ALTER TABLE statements consecutively. For example, to add multiple columns
to a table, issue consecutive ALTER TABLE ADD COLUMN statements.
External Table Restrictions
Not all ALTER TABLE options pertain to external tables. For instance, you cannot add a
column to an external table, but you can rename the table:
=> ALTER TABLE mytable RENAME TO mytable2;
ALTER TABLE
Restoring to an Earlier Epoch
If you restore the database to an epoch that precedes changes to the table definition, the
restore operation reverts the table to the earlier definition. For example, if you change a
column's data type from CHAR(8) to CHAR(16) in epoch 10, and then restore the
database from epoch 5, the column reverts to CHAR(8) data type.
Adding Table Columns
You add a column to a table with the ALTER TABLE clause ADD COLUMN. If qualified with
the keyword CASCADE, Vertica also adds the new table column to all pre-join projections
that are created using that table. If you specify a non-constant default column value and
specify CASCADE, Vertica does not add the column to the pre-join projections.
Restrictions
You cannot add columns to:
l Temporary tables
l Tables that have out-of-date superprojections with up-to-date buddies
Table Locking
When you use ADD COLUMN to alter a table, Vertica takes an O lock on the table until the
operation completes. The lock prevents DELETE, UPDATE, INSERT, and COPY statements
from accessing the table. The lock also blocks SELECT statements issued at
SERIALIZABLE isolation level, until the operation completes.
Note: Each table can have a maximum of 1600 columns.
If you use CASCADE, Vertica also takes O locks on all anchor tables of any pre-join
projections associated with the target table. Consequently, SELECT statements issued
on those tables at SERIALIZABLE isolation level are blocked until the operation
completes.
Adding a column to a table does not affect K-safety of the physical schema design.
You can add columns when nodes are down.
Operations That Occur When Adding Columns
The following operations occur as part of adding columns:
l Inserts the default value for existing rows. For example, if the default expression is
CURRENT_TIMESTAMP, all rows have the current timestamp.
l Automatically adds the new column with a unique projection column name to the
superprojection of the table.
l Populates the column according to the ALTER TABLE ADD COLUMN syntax
(DEFAULT, for example).
See the ALTER TABLE syntax for all ALTER TABLE options.
Adding New Columns to Tables with CASCADE
When you qualify ALTER TABLE..ADD COLUMN with the keyword CASCADE, Vertica adds
the new column to the superprojection and to all pre-join projections that include that
table.
For example, create two tables:
=> CREATE TABLE t1 (x INT PRIMARY KEY NOT NULL, y INT);
=> CREATE TABLE t2 (x INT PRIMARY KEY NOT NULL,
t1_x INT REFERENCES t1(x) NOT NULL, z VARCHAR(8));
After you load data into them, Vertica creates a superprojection for each table. Each
superprojection contains all the columns of its table. For this example, name them
super_t1 and super_t2.
Create two pre-join projections that join tables t1 and t2, where t2 is an anchor, or a
fact table, and t1 is a dimension table.
=> CREATE PROJECTION t_pj1 AS SELECT t1.x, t1.y, t2.x, t2.t1_x, t2.z
FROM t1 JOIN t2 ON t1.x = t2.t1_x UNSEGMENTED ALL NODES;
=> CREATE PROJECTION t_pj2 AS SELECT t1.x, t2.x
FROM t1 JOIN t2 ON t1.x = t2.t1_x UNSEGMENTED ALL NODES;
Add a new column w1 to table t1 using the CASCADE keyword. Vertica adds the column
to:
l Superprojection super_t1
l Pre-join projection t_pj1
l Pre-join projection t_pj2
=> ALTER TABLE t1 ADD COLUMN w1 INT DEFAULT 5 NOT NULL CASCADE;
Add a new column w2 to table t1, and specify a non-constant default value. Vertica
adds the new column to the superprojection super_t1. Because the default value is not
a constant, Vertica does not add the new column to the pre-join projections, but displays
a warning instead.
=> ALTER TABLE t1 ADD COLUMN w2 INT DEFAULT (t1.y+1)
NOT NULL CASCADE;
WARNING: Column "w2" in table "t1" with non-constant default
will not be added to prejoin(s) t_pj1, t_pj2.
Updating Associated Table Views
Adding new columns to a table that has an associated view does not update the view's
result set, even if the view uses a wildcard (*) to represent all table columns. To
incorporate new columns, you must recreate the view. See CREATE VIEW in the SQL
Reference Manual.
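For example, assuming a view t1_view over table t1 (both names are hypothetical), you can recreate the view in place after adding columns:
=> CREATE OR REPLACE VIEW t1_view AS SELECT * FROM t1;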
Dropping Table Columns
When an ALTER TABLE statement includes a DROP COLUMN clause to drop a column,
Vertica drops the specified column from the table and the ROS containers that
correspond to the dropped column.
The syntax looks like this:
ALTER TABLE [[db-name.]schema.]table-name DROP [ COLUMN ] column-name [ CASCADE | RESTRICT ]
After a DROP COLUMN operation completes, data backed up from the current epoch
onward recovers without the column. Data recovered from a backup that precedes the
current epoch re-adds the table column. Because drop operations physically purge object
storage and catalog definitions (table history) from the table, AT EPOCH (historical)
queries return nothing for the dropped column.
The altered table retains its object ID.
Note: Drop column operations can be fast because these catalog-level changes do
not require data reorganization, so Vertica can quickly reclaim disk storage.
Restrictions
l You cannot drop or alter a primary key column or a column that participates in the
table partitioning clause.
l You cannot drop the first column of any projection sort order, or columns that
participate in a projection segmentation expression.
l All nodes must be up.
l You cannot drop a column associated with an access policy. Attempts to do so
produce the following error:
ERROR 6482: Failed to parse Access Policies for table "t1"
Using CASCADE to Force a Drop
If the table column to drop has dependencies, you must qualify the DROP COLUMN clause
with the CASCADE option. For example, the target column might be specified in a
projection sort order, or in a pre-join projection. In these and other cases, DROP
COLUMN..CASCADE handles the dependency by reorganizing catalog definitions or
dropping a projection. In all cases, CASCADE performs the minimal reorganization
required to drop the column.
Use CASCADE to drop a column with the following dependencies:
l Any constraint: Vertica drops the column when a FOREIGN KEY constraint depends
on a UNIQUE or PRIMARY KEY constraint on the referenced columns.
l Specified in projection sort order: Vertica truncates the projection sort order up to
and including the dropped column, without impact on physical storage for other
columns, and then drops the column. For example, if a projection's columns are in
sort order (a,b,c), dropping column b causes the projection's sort order to become
just (a), omitting column c.
l Specified in a pre-join projection or in a projection segmentation expression: In both
cases, the column to drop is integral to the projection definition. If possible, Vertica
drops the projection, as long as doing so does not compromise K-safety; otherwise,
the transaction rolls back. For example, if one of a table's projections specifies the
target column in its segmentation clause, Vertica tries to drop that projection unless
doing so violates K-safety.
l Referenced as the default value of another column: See Dropping a Column
Referenced as Default, below.
Dropping a Column Referenced as Default
You might want to drop a table column that is referenced by another column as its
default value. For example, the following table is defined with two columns, a and b,
where b gets its default value from column a:
=> CREATE TABLE x (a int) UNSEGMENTED ALL NODES;
CREATE TABLE
=> ALTER TABLE x ADD COLUMN b int DEFAULT a;
ALTER TABLE
In this case, dropping column a requires the following procedure:
1. Remove the default dependency through ALTER COLUMN..DROP DEFAULT:
=> ALTER TABLE x ALTER COLUMN b DROP DEFAULT;
2. Create a replacement superprojection for the target table if one or both of the
following conditions is true:
n The target column is the table's first sort order column. If the table has no explicit
sort order, the default table sort order specifies the first table column as the first
sort order column. In this case, the new superprojection must specify a sort order
that excludes the target column.
n If the table is segmented, the target column is specified in the segmentation
expression. In this case, the new superprojection must specify a segmentation
expression that excludes the target column.
Given the previous example, table x has a default sort order of (a,b). Because
column a is the table's first sort order column, you must create a replacement
superprojection that is sorted on column b:
=> CREATE PROJECTION x_p1 as select * FROM x ORDER BY b UNSEGMENTED ALL NODES;
3. Run START_REFRESH:
=> SELECT START_REFRESH();
START_REFRESH
----------------------------------------
Starting refresh background process.
(1 row)
4. Run MAKE_AHM_NOW:
=> SELECT MAKE_AHM_NOW();
MAKE_AHM_NOW
-------------------------------
AHM set (New AHM Epoch: 1231)
(1 row)
5. Drop the column:
=> ALTER TABLE x DROP COLUMN a CASCADE;
Vertica implements the CASCADE directive as follows:
l Drops the original superprojection for table x (x_super).
l Updates the replacement superprojection x_p1 by dropping column a.
Examples
The following series of commands successfully drops a BYTEA data type column:
=> CREATE TABLE t (x BYTEA(65000), y BYTEA, z BYTEA(1));
CREATE TABLE
=> ALTER TABLE t DROP COLUMN y;
ALTER TABLE
=> SELECT y FROM t;
ERROR 2624: Column "y" does not exist
=> ALTER TABLE t DROP COLUMN x RESTRICT;
ALTER TABLE
=> SELECT x FROM t;
ERROR 2624: Column "x" does not exist
=> SELECT * FROM t;
z
---
(0 rows)
=> DROP TABLE t CASCADE;
DROP TABLE
The following series of commands tries to drop a FLOAT(8) column and fails because
there are not enough projections to maintain K-safety.
=> CREATE TABLE t (x FLOAT(8),y FLOAT(08));
CREATE TABLE
=> ALTER TABLE t DROP COLUMN y RESTRICT;
ALTER TABLE
=> SELECT y FROM t;
ERROR 2624: Column "y" does not exist
=> ALTER TABLE t DROP x CASCADE;
ROLLBACK 2409: Cannot drop any more columns in t
=> DROP TABLE t CASCADE;
Altering Key Constraint Enforcement
To alter how Vertica enforces constraints on a table key, use the ALTER TABLE clause
ALTER CONSTRAINT. You can optionally qualify this clause with the keywords ENABLED
or DISABLED:
l ENABLED automatically enforces a PRIMARY or UNIQUE key constraint.
l DISABLED prevents automatic enforcement of a PRIMARY or UNIQUE key constraint.
If you omit ENABLED or DISABLED, Vertica checks two configuration parameters to
determine whether key constraints are automatically enabled:
l EnableNewPrimaryKeysByDefault
l EnableNewUniqueKeysByDefault
For more information on these configuration parameters, see Enforcing Primary and
Unique Key Constraints Automatically
If you disable automatic enforcement of PRIMARY or UNIQUE key constraints, you can
instead run ANALYZE_CONSTRAINTS to verify that columns have unique values after
running a DML command or bulk loading.
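For example, a minimal sketch (the table name is hypothetical; C_PRIMARY is the name Vertica assigns to an unnamed primary key constraint):
=> ALTER TABLE t1 ALTER CONSTRAINT C_PRIMARY DISABLED;
=> COPY t1 FROM '/home/dbadmin/t1.dat';
=> SELECT ANALYZE_CONSTRAINTS('t1');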
See About Constraints for general information about constraints.
Renaming Tables
The ALTER TABLE RENAME TO statement lets you rename one or more tables.
Renaming tables does not affect existing pre-join projections because pre-join
projections refer to tables by their unique numeric object IDs (OIDs). Renaming tables
also does not change the table OID.
To rename two or more tables:
1. List the tables to rename in a comma-delimited list. You can specify a schema
name as part of the table specification only before the RENAME TO clause:
=> ALTER TABLE S1.T1, S1.T2 RENAME TO U1, U2;
The statement renames the listed tables to their new table names from left to right,
matching them sequentially, in a one-to-one correspondence.
The RENAME TO parameter is applied atomically so that all tables are renamed, or
none of the tables is renamed. For example, if the number of tables to rename does
not match the number of new names, none of the tables is renamed.
2. Do not specify a schema-name as part of the table specification after the RENAME
TO clause, since the statement applies to only one schema. The following example
generates a syntax error:
=> ALTER TABLE S1.T1, S1.T2 RENAME TO S1.U1, S1.U2;
Note: Renaming a table referenced by a view causes the view to fail, unless you
create another table with the previous name to replace the renamed table.
Using Rename to Swap Tables Within a Schema
You can use the ALTER TABLE RENAME TO statement to swap tables within a
schema without actually moving data. You cannot swap tables across schemas.
To swap tables within a schema (example statement is split to explain steps):
1. Enter the names of the tables to swap, followed by a new temporary table
placeholder (temps):
=> ALTER TABLE T1, T2, temps
2. Use the RENAME TO clause to swap the tables: T1 to temps, T2 to T1, and temps
to T2:
RENAME TO temps, T1, T2;
Moving Tables to Another Schema
The ALTER TABLE clause SET SCHEMA moves a table from one schema to another.
Moving a table requires that you have USAGE privileges on the current schema and
CREATE privileges on destination schema. You can move only one table between
schemas at a time. You cannot move temporary tables between schemas.
SET SCHEMA can be qualified by one of the following options:
l CASCADE, the default, automatically moves all projections that are anchored on the
source table to the destination schema, regardless of the schema in which the
projections reside.
l RESTRICT moves only projections that are anchored on the source table and also
reside in the same schema.
Name Conflicts
If a table with the same name, or any of the projections that you want to move, already
exists in the new schema, the statement rolls back and moves neither the table nor any
projections. To work around name conflicts:
1. Rename any conflicting table or projections that you want to move.
2. Run the ALTER TABLE SET SCHEMA statement again.
Note: Vertica lets you move system tables to system schemas. Moving system
tables could be necessary to support designs created through the Database
Designer.
Example
The following example moves table T1 from schema S1 to schema S2. SET SCHEMA
defaults to CASCADE. Thus, all the projections that are anchored on table T1 are
automatically moved to schema S2 regardless of the schema in which they reside:
=> ALTER TABLE S1.T1 SET SCHEMA S2;
Changing Table Ownership
The ability to change table ownership is useful when moving a table from one schema
to another. Ownership reassignment is also useful when a table owner leaves the
company or changes job responsibilities. Because you can change the table owner, the
tables do not have to be completely rewritten, so you avoid a loss in productivity.
The syntax is:
ALTER TABLE [[db-name.]schema.]table-name OWNER TO new-owner-name
In order to alter table ownership, you must be either the table owner or a superuser.
A change in table ownership transfers just the owner and not privileges; grants made by
the original owner are dropped and all existing privileges on the table are revoked from
the previous owner. However, altering the table owner transfers ownership of dependent
sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not
transfer ownership of other referenced sequences. See ALTER SEQUENCE for details
on transferring sequence ownership.
Notes
l Table privileges are separate from schema privileges; therefore, a table privilege
change or table owner change does not result in any schema privilege change.
l Because projections define the physical representation of the table, Vertica does not
require separate projection owners. The ability to create or drop projections is based
on the table privileges on which the projection is anchored.
l During the alter operation Vertica updates projections anchored on the table owned
by the old owner to reflect the new owner. For pre-join projection operations, Vertica
checks for privileges on the referenced table.
Example
In this example, user Bob connects to the database, looks up the tables, and transfers
ownership of table t33 from himself to user Alice.
=> \c - Bob
You are now connected as user "Bob".
=> \d
Schema | Name | Kind | Owner | Comment
--------+--------+-------+---------+---------
public | applog | table | dbadmin |
public | t33 | table | Bob |
(2 rows)
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
Notice that when Bob looks up database tables again, he no longer sees table t33.
=> \d
List of tables
Schema | Name | Kind | Owner | Comment
--------+--------+-------+---------+---------
public | applog | table | dbadmin |
(1 row)
When user Alice connects to the database and looks up tables, she sees she is the
owner of table t33.
=> \c - Alice
You are now connected as user "Alice".
=> \d
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+-------+---------
public | t33 | table | Alice |
(1 row)
Either Alice or a superuser can transfer table ownership back to Bob. In the following
case a superuser performs the transfer.
=> \c - dbadmin
You are now connected as user "dbadmin".
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> \d
List of tables
Schema | Name | Kind | Owner | Comment
--------+----------+-------+---------+---------
public | applog | table | dbadmin |
public | comments | table | dbadmin |
public | t33 | table | Bob |
s1 | t1 | table | User1 |
(4 rows)
You can also query the V_CATALOG.TABLES system table to view table and owner
information. Note that a change in ownership does not change the table ID.
In the following series of commands, the superuser changes table ownership back to Alice
and queries the TABLES system table.
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273724576 | Alice
(4 rows)
Now the superuser changes table ownership back to Bob and queries the TABLES
table again. Nothing changes except the owner_name value for t33, which changes from Alice to Bob.
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273714428 | Bob
(5 rows)
Table Reassignment with Sequences
Altering the table owner transfers ownership of only the associated IDENTITY/AUTO-
INCREMENT sequences, not other referenced sequences. For example, in the following
series of commands, ownership of sequence s1 does not change:
=> CREATE USER u1;
CREATE USER
=> CREATE USER u2;
CREATE USER
=> CREATE SEQUENCE s1 MINVALUE 10 INCREMENT BY 2;
CREATE SEQUENCE
=> CREATE TABLE t1 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> CREATE TABLE t2 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
=> ALTER TABLE t1 OWNER TO u1;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
=> ALTER TABLE t2 OWNER TO u2;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
sequence_name | owner_name
---------------+------------
s1 | dbadmin
(1 row)
See Also
l Changing Sequence Ownership
Using Named Sequences
Named sequences are database objects that generate unique numbers in ascending or
descending sequential order. They are most often used when an application requires a
unique identifier in a table or an expression. Once a named sequence returns a value, it
never returns that same value again in the same session. Named sequences are
independent objects. While you can use their values in tables, they are not subordinate
to the tables in which you use the named sequences.
Types of Incrementing Values
In addition to named sequences, Vertica supports two other kinds of incrementing
values:
l Auto-increment column value: The most basic incrementing numeric column type.
The database increments this value each time you add a row to the table. You cannot
change the value of an AUTO_INCREMENT column, or its amount of cache, which is
1K.
l Identity column: A numeric column type that the database increments automatically.
Auto-increment and Identity sequences are defined through column constraints in the
CREATE TABLE statement and are incremented each time a row is added to the table.
Both of these object types are table-dependent and do not persist independently. The
identity value is never rolled back even if a transaction that tries to insert a value is not
committed. The LAST_INSERT_ID function returns the last value generated for an auto-
increment or identity column.
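For example, a minimal sketch (the table and column names are hypothetical) of retrieving the last identity value generated in the session:
=> CREATE TABLE t_id (id IDENTITY, name VARCHAR(20));
=> INSERT INTO t_id (name) VALUES ('first');
=> SELECT LAST_INSERT_ID();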
Each type of sequence value has a set of properties. A named sequence has the most
properties, and an Auto-increment sequence the least. The following table lists the
differences between the three sequence values:
Behavior                        | Named Sequence | Identity | Auto-increment
--------------------------------+----------------+----------+---------------
Default cache value 250K        | X              | X        |
Default cache value 1K          |                |          | X
Set initial cache               | X              | X        |
Define start value              | X              | X        |
Specify increment unit          | X              | X        |
Exists as an independent object | X              |          |
Exists only as part of table    |                | X        | X
Create as column constraint     |                | X        | X
Always starts at 1              |                |          | X
Requires name                   | X              |          |
Use in expressions              | X              |          |
Unique across tables            | X              |          |
Change parameters               | X              |          |
Move to different schema        | X              |          |
Set to increment or decrement   | X              |          |
Grant privileges to object      | X              |          |
Specify minimum value           | X              |          |
Specify maximum value           | X              |          |
While sequence object values are guaranteed to be unique, they are not guaranteed to
be contiguous, so returned values can appear to skip numbers. For example, two nodes
can increment a sequence at different rates: the node with the heavier processing load
increments the sequence faster, so its values are not contiguous with the values
incremented on a node with less processing.
Using a Named Sequence with an Auto_Increment or Identity Column
Each table can contain only one auto_increment or identity column. A table with
either an auto_increment or identity column can also contain a named sequence.
The next example illustrates this: table test2 contains a named sequence (my_seq)
and an AUTO_INCREMENT column (last):
VMart=> CREATE TABLE test2 (id INTEGER NOT NULL UNIQUE,
middle INTEGER DEFAULT NEXTVAL('my_seq'),
next INT, last auto_increment);
CREATE TABLE
Named Sequence Functions
When you create a named sequence object, you can also specify its increment or
decrement value. The default is 1. Use these functions with named sequences:
l NEXTVAL — Advances the sequence and returns the next value. The value is
incremented for ascending sequences and decremented for descending. The first
time you call NEXTVAL after creating a named sequence, the function sets up the
default or specified amount of cache on each cluster node. From its cache store, each
node returns either the default sequence value, or a start number you specified with
CREATE SEQUENCE.
l CURRVAL — Returns the LAST value that the previous invocation of NEXTVAL
returned in the current session. If there were no calls to NEXTVAL after creating a
sequence, the CURRVAL function returns an error:
dbt=> create sequence seq2;
CREATE SEQUENCE
dbt=> select currval('seq2');
ERROR 4700: Sequence seq2 has not been accessed in the session
You can use the NEXTVAL and CURRVAL functions in INSERT and COPY
expressions.
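For example, the following sketch (the sequence and table names are hypothetical) uses NEXTVAL in a COPY expression to assign a sequence value to each loaded row; the id column is computed rather than read from the input:
=> CREATE SEQUENCE load_seq;
=> CREATE TABLE events (id INT, msg VARCHAR(50));
=> COPY events (msg, id AS NEXTVAL('load_seq')) FROM STDIN;
>> first event
>> second event
>> .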
Using DDL Commands and Functions With Named Sequences
For details, see the following related statements and functions in the SQL Reference
Manual:
Use this statement... | To...
----------------------+------------------------------------------------------------
CREATE SEQUENCE       | Create a named sequence object.
ALTER SEQUENCE        | Alter named sequence parameters, rename a sequence within a
                      | schema, or move a named sequence between schemas.
DROP SEQUENCE         | Remove a named sequence object.
GRANT SEQUENCE        | Grant user privileges to a named sequence object. See also
                      | Sequence Privileges.
Creating Sequences
Create a sequence using the CREATE SEQUENCE statement. All of the parameters
(besides a sequence name) are optional.
The following example creates an ascending named sequence, my_seq, starting at the
value 100:
dbt=> create sequence my_seq START 100;
CREATE SEQUENCE
After creating a sequence, you must call the NEXTVAL function at least once in a
session to create a cache for the sequence and its initial value. Subsequently, use
NEXTVAL to increment the sequence value. Use the CURRVAL function to get the
current value.
The following NEXTVAL call instantiates the newly created my_seq sequence and
returns its first value:
=> SELECT NEXTVAL('my_seq');
nextval
---------
100
(1 row)
If you call CURRVAL before NEXTVAL, the system returns an error:
dbt=> SELECT CURRVAL('my_seq');
ERROR 4700: Sequence my_seq has not been accessed in the session
The following command returns the current value of this sequence. Since no other
operations have been performed on the newly-created sequence, the function returns
the expected value of 100:
=> SELECT CURRVAL('my_seq');
currval
---------
100
(1 row)
The following command increments the sequence value:
=> SELECT NEXTVAL('my_seq');
nextval
---------
101
(1 row)
Calling CURRVAL again returns only the current value:
=> SELECT CURRVAL('my_seq');
currval
---------
101
(1 row)
The following example shows how to use the my_seq sequence in an INSERT
statement.
=> CREATE TABLE customer (
lname VARCHAR(25),
fname VARCHAR(25),
membership_card INTEGER,
id INTEGER
);
=> INSERT INTO customer VALUES ('Hawkins' ,'John', 072753, NEXTVAL('my_seq'));
Now query the table you just created to confirm that the ID column has been
incremented to 102:
=> SELECT * FROM customer;
lname | fname | membership_card | id
---------+-------+-----------------+-----
Hawkins | John | 72753 | 102
(1 row)
The following example shows how to use a sequence as the default value for an
INSERT command:
=> CREATE TABLE customer2(
id INTEGER DEFAULT NEXTVAL('my_seq'),
lname VARCHAR(25),
fname VARCHAR(25),
membership_card INTEGER
);
=> INSERT INTO customer2 VALUES (default,'Carr', 'Mary', 87432);
Now query the table you just created. The ID column has been incremented again to
103:
=> SELECT * FROM customer2;
id | lname | fname | membership_card
-----+-------+-------+-----------------
103 | Carr | Mary | 87432
(1 row)
Distributing Named Sequences
When you create a named sequence, the CACHE parameter determines the number of
sequence values each node maintains during a session. The default cache value is
250K, so each node reserves 250,000 values per session for each sequence. The
default cache size provides an efficient means for large insert or copy operations.
Specifying a smaller number of cache values can impact performance of large loads,
since Vertica must create a new set of cache values whenever more are required.
Getting more cache for a new set of sequence values requires Vertica to perform a
catalog lock. Such locks can adversely affect database performance, since some
activities, such as data inserts, cannot occur until Vertica releases the lock.
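For example, you can size the cache when you create a sequence, or change it later with ALTER SEQUENCE (the sequence name and cache values here are illustrative):
=> CREATE SEQUENCE bulk_seq CACHE 500000;
=> ALTER SEQUENCE bulk_seq CACHE 250000;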
Effects of Distributed Sessions
Vertica distributes a session across all nodes. After you create a named sequence, the
first time any cluster node executes a NEXTVAL() statement within a query, the node
reserves its own cache of sequence values. The node then maintains that set of values
for the current session. Other nodes executing a NEXTVAL() statement create and
maintain their own cache of sequence values.
During a session, nodes can increment sequence values from NEXTVAL() statements at
different rates. This behavior results in the sequences from a NEXTVAL statement on
one node not being sequential with sequence values from another node. Each
sequence is guaranteed to be unique, but can be out of order with a
NEXTVAL statement executed on another node. Regardless of the number of calls to
NEXTVAL and CURRVAL, Vertica increments a sequence only once per row. If
multiple calls to NEXTVAL occur in the same row, the statement returns the same value.
If sequences are used in join statements, Vertica increments a sequence once for each
composite row output by the join.
Calculating Named Sequences
Vertica calculates the current value of a sequence as follows:
l At the end of every statement, the state of all sequences used in the session is
returned to the initiator node.
l The initiator node calculates the maximum CURRVAL of each sequence across all
states on all nodes.
l This maximum value is used as CURRVAL in subsequent statements until another
NEXTVAL is invoked.
Losing Sequence Values
Sequence values in cache can be lost in the following situations:
l If a statement fails after NEXTVAL is called (thereby consuming a sequence value
from the cache), the value is lost.
l If a disconnect occurs (for example, dropped session), any remaining values in cache
that have not been returned through NEXTVAL are lost.
To recover lost sequence values, you can run an ALTER SEQUENCE command to
define a new sequence number generator, which resets the counter to the correct value
in the next session.
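For example, the following sketch resets sequence s1 so that numbering resumes from a known value (the restart value is illustrative):
=> ALTER SEQUENCE s1 RESTART WITH 1000;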
Note: When Elastic Cluster is enabled, creating a projection segmented with
modular hash uses hash segmentation instead.
How Sequences Behave Across Nodes
This section presents sequence behavior across Vertica nodes.
The following example illustrates sequence distribution on a 3-node cluster, with
node01 as the initiator node.
Example 1: Create a table called dist:
CREATE TABLE dist (i INT, j VARCHAR);
Create a projection called oneNode and segment by column i on node01:
CREATE PROJECTION oneNode AS SELECT * FROM dist SEGMENTED BY i NODES node01;
Create a second projection called twoNodes and segment column i by hash on node02
and node03:
CREATE PROJECTION twoNodes AS SELECT * FROM dist SEGMENTED BY HASH(i) NODES node02, node03;
Create a third projection called threeNodes and segment column i by hash on all nodes
(1-3):
CREATE PROJECTION threeNodes as SELECT * FROM dist SEGMENTED BY HASH(i) ALL NODES;
Insert some values:
COPY dist FROM STDIN;
1|ONE
2|TWO
3|THREE
4|FOUR
5|FIVE
6|SIX
.
Query the STORAGE_CONTAINERS table to check the projections on each node:
SELECT node_name, projection_name, total_row_count FROM storage_containers;
node_name | projection_name | total_row_count
-----------+-----------------+---------------
node0001 | oneNode | 6 --Contains rows with i=(1,2,3,4,5,6)
node0001 | threeNodes | 2 --Contains rows with i=(3,6)
node0002 | twoNodes | 3 --Contains rows with i=(2,4,6)
node0002 | threeNodes | 2 --Contains rows with i=(1,4)
node0003 | twoNodes | 3 --Contains rows with i=(1,3,5)
node0003 | threeNodes | 2 --Contains rows with i=(2,5)
(6 rows)
Query the segmentation of rows for projection oneNode:
1 ONE Node01
2 TWO Node01
3 THREE Node01
4 FOUR Node01
5 FIVE Node01
6 SIX Node01
Query the segmentation of rows for projection twoNodes:
1 ONE Node03
2 TWO Node02
3 THREE Node03
4 FOUR Node02
5 FIVE Node03
6 SIX Node02
Query the segmentation of rows for projection threeNodes:
1 ONE Node02
2 TWO Node03
3 THREE Node01
4 FOUR Node02
5 FIVE Node03
6 SIX Node01
The minimum cache size for the CREATE SEQUENCE statement is 1, indicating that
only one value is generated at a time and no cache is in use.
Example 2: Create a sequence named s1 with a cache of 10, so that the sequence
caches 10 values in memory, then call NEXTVAL and CURRVAL for each row of oneNode:
CREATE SEQUENCE s1 cache 10;
SELECT s1.nextval, s1.currval, s1.nextval, s1.currval, j FROM oneNode;
nextval | currval | nextval | currval | j
---------+---------+---------+---------+-------
1 | 1 | 1 | 1 | ONE
2 | 2 | 2 | 2 | TWO
3 | 3 | 3 | 3 | THREE
4 | 4 | 4 | 4 | FOUR
5 | 5 | 5 | 5 | FIVE
6 | 6 | 6 | 6 | SIX
(6 rows)
The following table illustrates the current state of the sequence for this session: the
current value, the values remaining (the difference between the current value (6) and
the cache (10)), and the cache activity. There is no cache activity on node02 or node03.
Sequence Cache State | Node01 | Node02   | Node03
---------------------+--------+----------+----------
Current value        | 6      | NO CACHE | NO CACHE
Remaining            | 4      | NO CACHE | NO CACHE
Example 3: Return the current values from twoNodes:
SELECT s1.currval, j FROM twoNodes;
currval | j
---------+-------
6 | ONE
6 | THREE
6 | FIVE
6 | TWO
6 | FOUR
6 | SIX
(6 rows)
Example 4: Now call NEXTVAL from threeNodes. The assumption is that node02 holds
the cache before node03:
SELECT s1.nextval, j from threeNodes;
nextval | j
---------+-------
101 | ONE
201 | TWO
7 | THREE
102 | FOUR
202 | FIVE
8 | SIX
(6 rows)
The following table illustrates the sequence cache state with values on node01,
node02, and node03:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 8      | 102    | 202
Remaining            | 2      | 8      | 8
Example 5: Return the current value from twoNodes:
SELECT s1.currval, j FROM twoNodes;
currval | j
---------+-------
202 | ONE
202 | TWO
202 | THREE
202 | FOUR
202 | FIVE
202 | SIX
(6 rows)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 6      | 102    | 202
Remaining            | 4      | 8      | 8
Example 6: The following command runs on node02 only:
SELECT s1.nextval, j FROM twoNodes WHERE i = 2;
nextval | j
---------+-----
103 | TWO
(1 row)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 6      | 103    | 202
Remaining            | 4      | 7      | 8
Example 7: The following command gets the current value from twoNodes:
SELECT s1.currval, j FROM twoNodes;
currval | j
---------+-------
103 | ONE
103 | TWO
103 | THREE
103 | FOUR
103 | FIVE
103 | SIX
(6 rows)
Example 8: This example assumes that node02 holds the cache before node03:
SELECT s1.nextval, j FROM twoNodes;
nextval | j
---------+-------
203 | ONE
104 | TWO
204 | THREE
105 | FOUR
205 | FIVE
106 | SIX
(6 rows)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 6      | 106    | 205
Remaining            | 4      | 6      | 5
Example 9: The following command returns the current value from twoNodes:
SELECT s1.currval, j FROM twoNodes;
currval | j
---------+-------
205 | ONE
205 | TWO
205 | THREE
205 | FOUR
205 | FIVE
205 | SIX
(6 rows)
Example 10: This example calls the NEXTVAL function on oneNode:
SELECT s1.nextval, j FROM oneNode;
nextval | j
---------+-------
7 | ONE
8 | TWO
9 | THREE
10 | FOUR
301 | FIVE
302 | SIX
(6 rows)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 302    | 106    | 205
Remaining            | 8      | 4      | 5
Example 11: In this example, twoNodes is the outer table and threeNodes is the inner
table in a merge join. threeNodes is resegmented according to twoNodes.
SELECT s1.nextval, j FROM twoNodes JOIN threeNodes ON twoNodes.i = threeNodes.i;
nextval | j
---------+-------
206 | ONE
107 | TWO
207 | THREE
108 | FOUR
208 | FIVE
109 | SIX
(6 rows)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 302    | 109    | 208
Remaining            | 8      | 1      | 2
Example 12: This next example shows how sequences work with buddy projections.
--Same session
DROP TABLE t CASCADE;
CREATE TABLE t (i INT, j varchar(20));
CREATE PROJECTION threeNodes AS SELECT * FROM t
SEGMENTED BY HASH(i) ALL NODES KSAFE 1;
COPY t FROM STDIN;
1|ONE
2|TWO
3|THREE
4|FOUR
5|FIVE
6|SIX
.
SELECT node_name, projection_name, total_row_count FROM storage_containers;
node_name | projection_name | total_row_count
-----------+-----------------+-----------------
node01 | threeNodes_b0 | 2
node03 | threeNodes_b0 | 2
node02 | threeNodes_b0 | 2
node02 | threeNodes_b1 | 2
node01 | threeNodes_b1 | 2
node03 | threeNodes_b1 | 2
(6 rows)
The following call, in the same session, assumes that node02 is down. Node03 takes
over the work of node02:
SELECT s1.nextval, j FROM t;
nextval | j
---------+-------
401 | ONE
402 | TWO
305 | THREE
403 | FOUR
404 | FIVE
306 | SIX
(6 rows)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02 | Node03
---------------------+--------+--------+--------
Current value        | 306    | 110    | 404
Remaining            | 4      | 0      | 6
Example 13: This example starts a new session.
DROP TABLE t CASCADE;
CREATE TABLE t (i INT, j VARCHAR);
CREATE PROJECTION oneNode AS SELECT * FROM t SEGMENTED BY i NODES node01;
CREATE PROJECTION twoNodes AS SELECT * FROM t SEGMENTED BY HASH(i) NODES node02, node03;
CREATE PROJECTION threeNodes AS SELECT * FROM t SEGMENTED BY HASH(i) ALL NODES;
INSERT INTO t values (nextval('s1'), 'ONE');
SELECT * FROM t;
i | j
-----+-------
501 | ONE
(1 row)
The following table illustrates the sequence cache state:
Sequence Cache State | Node01 | Node02   | Node03
---------------------+--------+----------+----------
Current value        | 501    | NO CACHE | NO CACHE
Remaining            | 9      | 0        | 0
Example 14: This example calls NEXTVAL in an INSERT...SELECT from twoNodes:
INSERT INTO t SELECT s1.nextval, 'TWO' FROM twoNodes;
SELECT * FROM t;
i | j
-----+-------
501 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
601 | TWO --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
(2 rows)
The following table illustrates the sequence cache state:
Sequence Cache State Node01 Node02 Node03
Current value 501 601 NO CACHE
Remaining 9 9 0
Example 15: This example calls NEXTVAL in an INSERT...SELECT from threeNodes:
INSERT INTO t SELECT s1.nextval, 'TRE' FROM threeNodes;
SELECT * FROM t;
i | j
-----+-------
501 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
601 | TWO --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
502 | TRE --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
602 | TRE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
(4 rows)
The following table illustrates the sequence cache state:
Sequence Cache State Node01 Node02 Node03
Current value 502 602 NO CACHE
Remaining 9 9 0
Example 16: This example calls CURRVAL in an INSERT...SELECT from threeNodes:
INSERT INTO t SELECT s1.currval, j FROM threeNodes WHERE i != 502;
SELECT * FROM t;
i | j
-----+-------
501 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
601 | TWO --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
502 | TRE --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
602 | TRE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
602 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
502 | TWO --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
602 | TRE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
(7 rows)
The following table illustrates the sequence cache state:
Sequence Cache State Node01 Node02 Node03
Current value 502 602 NO CACHE
Remaining 9 9 0
Example 17: This example inserts the value of CURRVAL + 1 with INSERT...VALUES:
INSERT INTO t VALUES (s1.currval + 1, 'QUA');
SELECT * FROM t;
i | j
-----+-------
501 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
601 | TWO --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
502 | TRE --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
602 | TRE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
602 | ONE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
502 | TWO --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
602 | TRE --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
603 | QUA
(8 rows)
The following table illustrates the sequence cache state:
Sequence Cache State Node01 Node02 Node03
Current value 502 602 NO CACHE
Remaining 9 9 0
See Also
l Sequence Privileges
l ALTER SEQUENCE
l CREATE TABLE
l Column-Constraint
l CURRVAL
l DROP SEQUENCE
l GRANT (Sequence)
l NEXTVAL
Loading Sequences
You can use a sequence as part of creating a table. The sequence must already exist
and must have been instantiated with the NEXTVAL function.
Creating and Instantiating a Sequence
The following example creates an ascending named sequence, my_seq, starting at the
value 100:
dbt=> create sequence my_seq START 100;
CREATE SEQUENCE
After creating a sequence, you must call the NEXTVAL function at least once in a
session to create a cache for the sequence and its initial value. Subsequently, use
NEXTVAL to increment the sequence value. Use the CURRVAL function to get the
current value.
The following NEXTVAL function instantiates the newly-created my_seq sequence and
sets its first number:
=> SELECT NEXTVAL('my_seq');
nextval
---------
100
(1 row)
If you call CURRVAL before NEXTVAL, the system returns an error:
dbt=> SELECT CURRVAL('my_seq');
ERROR 4700: Sequence my_seq has not been accessed in the session
Using a Sequence in an INSERT Command
Update sequence number values by calling the NEXTVAL function, which
increments/decrements the current sequence and returns the next value. Use
CURRVAL to return the current value. These functions can also be used in INSERT and
COPY expressions.
The following example shows how to use a sequence as the default value for an
INSERT command:
CREATE TABLE customer2( ID INTEGER DEFAULT NEXTVAL('my_seq'),
lname VARCHAR(25),
fname VARCHAR(25),
membership_card INTEGER
);
INSERT INTO customer2 VALUES (default,'Carr', 'Mary', 87432);
Now query the table you just created. The ID column has again been incremented by
1, to 104:
SELECT * FROM customer2;
ID | lname | fname | membership_card
-----+-------+-------+-----------------
104 | Carr | Mary | 87432
(1 row)
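As noted above, NEXTVAL can also appear in COPY expressions. The following is a
hedged sketch (the input row and delimiter are invented for illustration, and exact
COPY expression syntax may vary); it assumes the customer2 table and my_seq
sequence from the previous example:

=> COPY customer2 (lname, fname, membership_card, ID AS NEXTVAL('my_seq'))
   FROM STDIN DELIMITER ',';
>> Carr,Anna,87433
>> \.

Each loaded row draws its ID value from my_seq, just as the INSERT example above
does.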
Altering Sequences
The ALTER SEQUENCE statement lets you change the attributes of a previously-
defined named sequence. Changes take effect in the next database session. Any
parameters not specifically set in the ALTER SEQUENCE command retain their
previous settings.
The ALTER SEQUENCE statement lets you rename an existing sequence, or the
schema of a sequence, but you cannot combine either of these changes with any other
optional parameters.
Note: Using ALTER SEQUENCE to set a START value below the CURRVAL can
result in duplicate keys.
Examples
The following example modifies an ascending sequence called my_seq to start at 105:
=> ALTER SEQUENCE my_seq RESTART WITH 105;
The following example moves a sequence from one schema to another:
=> ALTER SEQUENCE [public.]my_seq SET SCHEMA vmart;
The following example renames a sequence in the Vmart schema:
=> ALTER SEQUENCE [vmart.]my_seq RENAME TO serial;
Remember that changes occur only after you start a new database session. For
example, if you create a named sequence my_sequence starting at value 10, each call
to NEXTVAL increments the value by 1, as in the following series of commands:
=> CREATE SEQUENCE my_sequence START 10;
=> SELECT NEXTVAL('my_sequence');
nextval
---------
10
(1 row)
=> SELECT NEXTVAL('my_sequence');
nextval
---------
11
(1 row)
Next, issue the ALTER SEQUENCE statement to assign a new value starting at 50:
=> ALTER SEQUENCE my_sequence RESTART WITH 50;
When you call the NEXTVAL function, the sequence increments again by 1:
=> SELECT NEXTVAL('my_sequence');
nextval
---------
12
(1 row)
The sequence starts at 50 only after restarting the database session:
=> SELECT NEXTVAL('my_sequence');
nextval
---------
50
(1 row)
Changing Sequence Ownership
The ALTER SEQUENCE command lets you change the attributes of an existing sequence.
All changes take effect immediately, within the same session. Any parameters not set
during an ALTER SEQUENCE statement retain their prior settings.
If you need to change sequence ownership, such as if an employee who owns a
sequence leaves the company, you can do so with the following ALTER SEQUENCE
syntax:
=> ALTER SEQUENCE sequence-name OWNER TO new-owner-name;
This operation immediately reassigns the sequence from the current owner to the
specified new owner.
Only the sequence owner or a superuser can change ownership, and reassignment
does not transfer grants from the original owner to the new owner; grants made by the
original owner are dropped.
Note: Changing a table's owner transfers ownership of dependent sequence objects
(associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer
ownership of other referenced sequences. See Changing Table Ownership.
Example
The following example reassigns sequence ownership from the current owner to user
Bob:
=> ALTER SEQUENCE sequential OWNER TO Bob;
See ALTER SEQUENCE in the SQL Reference Manual for details.
Dropping Sequences
Use the DROP SEQUENCE statement to remove a sequence. You cannot drop a
sequence:
l If other objects depend on the sequence. The CASCADE keyword is not supported.
l That is used in the default expression of a column until all references to the sequence
are removed from the default expression.
Example
The following command drops the sequence named my_sequence:
=> DROP SEQUENCE my_sequence;
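If a column default still references a sequence, the drop fails until you remove that
reference. A minimal sketch of this pattern, using a hypothetical accounts table and a
hypothetical sequence seq2 (the ALTER COLUMN...DROP DEFAULT step is the assumed
way to remove the reference):

=> CREATE SEQUENCE seq2;
=> CREATE TABLE accounts (id INT DEFAULT NEXTVAL('seq2'), name VARCHAR(25));
=> DROP SEQUENCE seq2;                          -- fails: a column default references seq2
=> ALTER TABLE accounts ALTER COLUMN id DROP DEFAULT;
=> DROP SEQUENCE seq2;                          -- now succeeds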
Synchronizing Table Data with MERGE
The MERGE statement combines INSERT and UPDATE operations as a single operation.
During a merge, Vertica updates and inserts rows into one table from rows in another.
Required Arguments
MERGE statements require the following arguments:
l Target table to update.
l Source table that contains the new and/or changed rows to merge into the target
table.
l A join (ON) clause to match source and target table rows for update and insert
operations.
Filter Clauses
MERGE statements can optionally specify one or both of the following filters:
l WHEN MATCHED THEN UPDATE SET: Vertica updates and/or deletes existing rows in
the target table with data from the source table. Only one WHEN MATCHED THEN
UPDATE SET clause is permitted per MERGE statement.
l WHEN NOT MATCHED THEN INSERT: Vertica inserts into the target table all rows from
the source table that do not match any rows in the target table. Only one WHEN NOT
MATCHED THEN INSERT clause is permitted per MERGE statement.
A MERGE statement must include at least one of these clauses to be eligible for an
optimized query plan. For details, see Optimized Versus Non-Optimized MERGE.
For an example that shows how to use these filters, see MERGE Example.
WHEN MATCHED THEN UPDATE SET
Note: Vertica assumes the values in the merge join column are unique. If more than
one matching value is present in either the target or source table's join column, the
MERGE statement can fail with a run-time error. See Optimized Versus Non-
Optimized MERGE for more information.
For all joined rows, Vertica updates rows in the target table with data from the source
table.
When preparing an optimized query plan for a MERGE statement, Vertica enforces strict
requirements for unique and primary key constraints in the join key (ON clause). If you do
not enforce such constraints, MERGE fails when it finds:
l More than one matching value in the target join column for a corresponding value in the
source table when the target join column has a unique or primary key constraint. If the
target join column has no such constraint, the statement runs without error, but it also
runs without optimization.
l More than one matching value in the source join column for a corresponding value in
the target table. The source join column does not require a unique or primary key
constraint.
WHEN NOT MATCHED THEN INSERT
The WHEN NOT MATCHED THEN INSERT clause specifies to insert into the target table
all rows from the source table that do not match target table rows.
The columns you specify in the WHEN NOT MATCHED THEN INSERT clause must be
columns from the target table. The VALUES clause specifies a list of values to store in the
corresponding columns. If you do not supply a column value, do not list that column in
the WHEN NOT MATCHED clause. In the examples that follow, the source and target
tables are defined as follows:
CREATE TABLE test1 (c1 int, c2 int, c3 int);
CREATE TABLE test2 (val_c1 int, val_c2 int, val_c3 int);
The following WHEN NOT MATCHED clause excludes column c3 from the WHEN NOT
MATCHED and VALUES clauses:
MERGE INTO test1 USING test2 ON test1.c1=test2.val_c1
WHEN NOT MATCHED THEN INSERT (c1, c2) values (test2.val_c1, test2.val_c2);
Vertica inserts a null value into test1.c3.
You cannot qualify the columns with a table name or alias; for example, the following is
not allowed:
WHEN NOT MATCHED THEN INSERT source.x
If column names are not listed, MERGE behaves like INSERT...SELECT, assuming that
the values map to the target columns in table definition order.
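For instance, using the test1 and test2 tables defined above, the following sketch
(relying on the positional mapping just described) omits the target column list, so
val_c1, val_c2, and val_c3 map to c1, c2, and c3 in definition order:

MERGE INTO test1 USING test2 ON test1.c1=test2.val_c1
WHEN NOT MATCHED THEN INSERT VALUES (test2.val_c1, test2.val_c2, test2.val_c3);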
Optimized Versus Non-Optimized MERGE
By default, Vertica prepares an optimized query plan to improve merge performance
when the MERGE statement and its tables meet certain criteria. If the criteria are not met,
MERGE could run without optimization or return a run-time error. This section describes
scenarios for both optimized and non-optimized MERGE.
Conditions for an Optimized MERGE
Vertica prepares an optimized query plan when all of the following conditions are true:
l The target table's join column has a unique or primary key constraint
l UPDATE and INSERT clauses include every column in the target table
l UPDATE and INSERT clause column attributes are identical
Note: The source table's join column does not require a unique or primary key
constraint to be eligible for an optimized query plan. Also, the source table can
contain more columns than the target table, as long as the UPDATE and INSERT
clauses use the same columns and the column attributes are the same.
How to determine if a MERGE statement is eligible for optimization
To determine whether a MERGE statement is eligible for optimization, prefix MERGE with
the EXPLAIN keyword and examine the plan's textual output. (See MERGE Path for
examples.) A Semi path indicates the statement is eligible for optimization, whereas a
Right Outer path indicates the statement is ineligible and will run with the same
performance as MERGE in previous releases unless a duplicate merge join key is
encountered at query run time.
About duplicate matching values in the join column
Even if the MERGE statement and its tables meet the required criteria for optimization,
MERGE could fail with a run-time error if there are duplicate values in the join column.
When Vertica prepares an optimized query plan for a merge operation, it enforces strict
requirements for unique and primary key constraints in the MERGE statement's join
columns. If you haven't enforced constraints, MERGE fails under the following scenarios:
l Duplicates in the source table. If Vertica finds more than one matching value in the
source join column for a corresponding value in the target table, MERGE fails with a
run-time error.
l Duplicates in the target table. If Vertica finds more than one matching value in
target join column for a corresponding value in the source table, and the target join
column has a unique or primary key constraint, MERGE fails with a run-time error. If the
target join column has no such constraint, the statement runs without error and
without optimization.
Be aware that if you run MERGE multiple times using the same target and source table,
each statement run has the potential to introduce duplicate values into the join columns,
such as if you use constants in the UPDATE/INSERT clauses. These duplicates could
cause a run-time error the next time you run MERGE.
To avoid duplicate key errors, be sure to enforce the constraints you declare so that
values in the merge join column remain unique.
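One way to check for such duplicates before merging is a simple aggregate query.
This is a generic sketch; substitute your own source table and join column names:

=> SELECT a, COUNT(*) FROM source GROUP BY a HAVING COUNT(*) > 1;

If the query returns any rows, those join-key values occur more than once and can
cause MERGE to fail at run time.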
Examples
The examples that follow use a simple schema to illustrate some of the conditions under
which Vertica prepares or does not prepare an optimized query plan for MERGE:
CREATE TABLE target(a INT PRIMARY KEY, b INT, c INT) ORDER BY b,a;
CREATE TABLE source(a INT, b INT, c INT) ORDER BY b,a;
INSERT INTO target VALUES(1,2,3);
INSERT INTO target VALUES(2,4,7);
INSERT INTO source VALUES(3,4,5);
INSERT INTO source VALUES(4,6,9);
COMMIT;
Example of an optimized MERGE statement
Vertica can prepare an optimized query plan for the following MERGE statement because:
l The target table's join column (ON t.a=s.a) has a primary key constraint
l All columns in the target table (a,b,c) are included in the UPDATE and INSERT
clauses
l Column attributes specified in the UPDATE and INSERT clauses are identical
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a,s.b,s.c);
OUTPUT
--------
2
(1 row)
The output value of 2 indicates success and denotes the number of rows
updated/inserted from the source into the target.
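To verify the plan before running the statement, you can prefix it with EXPLAIN, as
described earlier in this section. A sketch, reusing the same statement:

=> EXPLAIN MERGE INTO target t USING source s ON t.a = s.a
   WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c
   WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a,s.b,s.c);

A Semi path in the textual output indicates an optimized plan; a Right Outer path
indicates a non-optimized one.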
Example of a non-optimized MERGE statement
In the next example, the MERGE statement runs without optimization because the column
attributes in the UPDATE/INSERT clauses are not identical. Specifically, the UPDATE
clause applies expressions (s.a + 1 and s.c - 1) to columns a and c, and the INSERT clause does not:
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a,s.b,s.c);
To make the previous MERGE statement eligible for optimization, rewrite the statement as
follows, so the attributes in the UPDATE and INSERT clauses are identical:
MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
WHEN NOT MATCHED THEN INSERT(a,b,c)
VALUES(s.a + 1, s.b, s.c - 1);
MERGE Restrictions
This section describes several restrictions that pertain to updating and inserting table
data with MERGE.
Duplicate Values in the Merge Join Key
Vertica assumes that the data to merge conforms with constraints you declare. To avoid
duplicate key errors, be sure to enforce declared constraints to assure unique values in
the merge join column. If the MERGE statement fails with a duplicate key error, you must
correct your data.
Also, be aware that if you run MERGE multiple times with the same target and source
tables, you might introduce duplicate values into the join columns, such as if you use
constants in the UPDATE/INSERT clauses. These duplicates can cause a run-time error.
Tables with Sequences
If the tables to merge include sequences, these must be omitted from the MERGE
statement. This restriction also applies to implied references to a sequence. For
example, if a column uses a sequence as its default value, that column cannot be
included in the MERGE statement.
The following example merges table customer1 into customer2, where column id in
customer2 gets its default value from sequence my_seq:
=> create sequence my_seq START 100;
CREATE SEQUENCE
=> CREATE TABLE customer1(
lname VARCHAR(25),
fname VARCHAR(25),
membership_card INTEGER,
id INTEGER);
WARNING 6978: Table "customer1" will include privileges from schema "public"
CREATE TABLE
=> INSERT INTO customer1 VALUES ('Hawkins' ,'John', 072753, NEXTVAL('my_seq'));
OUTPUT
--------
1
(1 row)
=> CREATE TABLE customer2(
id INTEGER DEFAULT NEXTVAL('my_seq'),
lname VARCHAR(25),
fname VARCHAR(25),
membership_card INTEGER);
WARNING 6978: Table "customer2" will include privileges from schema "public"
CREATE TABLE
=> INSERT INTO customer2 VALUES (default,'Carr', 'Mary', 87432);
OUTPUT
--------
1
(1 row)
When you try to merge data from customer1 into customer2, Vertica returns with an
error:
=> MERGE INTO customer2 c2 USING customer1 c1 ON c2.fname=c1.fname AND c2.lname=c1.lname
WHEN NOT MATCHED THEN INSERT (lname, fname, membership_card) values (c1.lname, c1.fname,
c1.membership_card);
ERROR 4711: Sequence or IDENTITY/AUTO_INCREMENT column in merge query is not supported
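One possible workaround (a sketch, not part of the original example) is to split the
merge into separate statements that never touch the sequence column. For instance,
the WHEN NOT MATCHED branch can be emulated with INSERT...SELECT, letting the
id column take its default from my_seq:

=> INSERT INTO customer2 (lname, fname, membership_card)
   SELECT c1.lname, c1.fname, c1.membership_card
   FROM customer1 c1
   WHERE NOT EXISTS (SELECT 1 FROM customer2 c2
                     WHERE c2.fname = c1.fname AND c2.lname = c1.lname);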
Other Restrictions
l You cannot use MERGE with unstructured tables.
l You cannot use the LIMIT clause on unsorted data when you update or merge tables
that participate in pre-join projections.
l If any PRIMARY KEY or UNIQUE constraints are enabled for automatic enforcement,
Vertica enforces those constraints when you insert values into a table. If a violation
occurs, Vertica rolls back the SQL statement and returns an error identifying the
constraint that was violated.
l You cannot run MERGE on identity/auto-increment columns or on columns that have
primary key or foreign key referential integrity constraints, as defined in CREATE
TABLE Column-Constraint syntax.
MERGE Example
In this example, the merge operation involves two tables:
l weekly_traffic logs restaurant traffic in real time, and is updated with each
customer visit. Data in this table is refreshed once a week.
l traffic_history stores the history of customer visits to various restaurants,
accumulated over an indefinite time span.
Once a week, you merge the weekly visit count from weekly_traffic into
traffic_history. The merge performs two operations:
l Updates existing customer records.
l Inserts new records of first-time customers.
One MERGE statement executes both operations as a single (upsert) transaction.
Source and Target Tables
The source and target tables weekly_traffic and traffic_history define the same
columns:
=> CREATE TABLE traffic_history (
customer_id INTEGER,
location_x FLOAT,
location_y FLOAT,
location_count INTEGER,
location_name VARCHAR2(20));
=> CREATE TABLE weekly_traffic (
customer_id INTEGER,
location_x FLOAT,
location_y FLOAT,
location_count INTEGER,
location_name VARCHAR2(20));
The table traffic_history already contains three records of two customers who
visited two restaurants, Etoile and LaRosa:
=> SELECT * FROM traffic_history;
customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
1001 | 10.1 | 2.7 | 1 | Etoile
1001 | 4.1 | 7.7 | 1 | LaRosa
1002 | 4.1 | 7.7 | 1 | LaRosa
(3 rows)
Source Table Updates
The following procedure inserts three new records into the source table weekly_
traffic:
1. Customer 1001 visited Etoile a second time:
=> INSERT INTO weekly_traffic VALUES (1001, 10.1, 2.7, 1, 'Etoile');
2. Customer 1002 visited a new location, Lux Cafe:
=> INSERT INTO weekly_traffic VALUES (1002, 5.1, 7.9, 1, 'Lux Cafe');
3. A new customer (1003) visited LaRosa:
=> INSERT INTO weekly_traffic VALUES (1003, 4.1, 7.7, 1, 'LaRosa');
After committing the transaction, you can view the updated contents of weekly_
traffic:
=> COMMIT;
=> SELECT * FROM weekly_traffic;
customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
1001 | 10.1 | 2.7 | 1 | Etoile
1002 | 5.1 | 7.9 | 1 | Lux Cafe
1003 | 4.1 | 7.7 | 1 | LaRosa
(3 rows)
Table Data Merge
The following MERGE statement merges weekly_traffic data into
traffic_history:
l For matching customers, MERGE updates the occurrence count.
l For non-matching customers, MERGE inserts new records.
=> MERGE INTO traffic_history l USING weekly_traffic n
ON (l.customer_id=n.customer_id AND l.location_x=n.location_x AND l.location_y=n.location_y)
WHEN MATCHED THEN UPDATE SET location_count = l.location_count + n.location_count
WHEN NOT MATCHED THEN INSERT (customer_id, location_x, location_y, location_count, location_name)
VALUES (n.customer_id, n.location_x, n.location_y, n.location_count, n.location_name);
OUTPUT
--------
3
(1 row)
=> COMMIT;
MERGE returns the number of rows updated and inserted. In this case, the returned
value of 3 reflects:
l Customer 1001's second visit to Etoile
l Customer 1002's first visit to the new restaurant Lux Cafe
l New customer 1003's visit to LaRosa
If you query the target table traffic_history, you can see the merged (updated and
inserted) results:
=> SELECT * FROM traffic_history ORDER BY customer_id;
customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
1001 | 4.1 | 7.7 | 1 | LaRosa
1001 | 10.1 | 2.7 | 2 | Etoile
1002 | 4.1 | 7.7 | 1 | LaRosa
1002 | 5.1 | 7.9 | 1 | Lux Cafe
1003 | 4.1 | 7.7 | 1 | LaRosa
(5 rows)
Dropping Tables
Dropping a table removes its definition from the Vertica database. For the syntax details
of this statement, see DROP TABLE in the SQL Reference Manual.
To drop a table, use the following statement:
=> DROP TABLE IF EXISTS mytable;
DROP TABLE
=> DROP TABLE IF EXISTS mytable; -- Table doesn't exist
NOTICE: Nothing was dropped
DROP TABLE
If the table you specify has dependent objects such as projections, you cannot drop the
table. You must use the CASCADE keyword. If you use the CASCADE keyword,
Vertica drops the table and all its dependent objects. These objects can include
superprojections, live aggregate projections, and projections with expressions.
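For example, assuming mytable exists and has dependent projections, the following
statement drops the table and those projections together:

=> DROP TABLE mytable CASCADE;
DROP TABLE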
You cannot use the CASCADE option to drop an external table. Because an external
table is read only, you cannot remove any of its associated files.
Truncating Tables
TRUNCATE TABLE removes all storage associated with the target table and its
projections. Vertica preserves the table and the projection definitions. If the truncated
table has out-of-date projections, those projections are cleared and marked up-to-date
when TRUNCATE TABLE returns.
TRUNCATE TABLE takes an O (owner) lock on the table until the truncation process
completes. The savepoint is then released.
TRUNCATE TABLE commits the entire transaction after statement execution, even if
truncating the table fails. You cannot roll back a TRUNCATE TABLE statement.
Use TRUNCATE TABLE for testing purposes. With this statement, you can remove all
table data without having to recreate projections when you reload table data.
In some cases, you might truncate a large single (fact) table that contains pre-join
projections. When you do so, the projections show zero (0) rows after the transaction
completes, and the table is ready for data reload.
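A minimal sketch, truncating a hypothetical fact table before a reload:

=> TRUNCATE TABLE store_sales_fact;
TRUNCATE TABLE

The table and projection definitions remain in place, so a subsequent COPY can
reload data without any DDL.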
Restrictions
l You cannot truncate an external table.
l If the table to truncate is a dimension table with pre-join projections, you cannot
truncate it. Drop the pre-join projections before executing TRUNCATE TABLE.
Working with Projections
Projections are physical storage for table data. A projection can contain some or all of
the columns from one or more tables.
This section covers the following topics:
l Projection Types
l K-Safe Database Projections
l Updating Projections Using Refresh
l Monitoring Projection Refresh on Buddy Projections
l Creating Unsegmented Projections
l Dropping Projections
Projection Types
You can create several types of projections. These summaries describe each option.
Superprojections
A superprojection contains all the columns of a table. For each table in the database,
Vertica requires a minimum of one projection, which is the superprojection. To get your
database up and running quickly, when you load or insert data into an existing table for
the first time, Vertica automatically creates a superprojection.
Query-Specific Projections
A query-specific projection is a projection that contains only the subset of table columns
to process a given query. Query-specific projections significantly improve the
performance of those queries for which they are optimized.
Pre-Join Projections
A pre-join projection contains inner joins between tables that are connected by primary
key or foreign key constraints. Pre-join projections provide a significant performance
advantage over joining tables at query run time. In addition, using a pre-join projection
allows you to define sort orders for queries that you execute frequently.
Aggregate Projections
Queries that include expressions or aggregate functions such as SUM and COUNT can
perform more efficiently when they use projections that already contain the aggregated
data. This is especially true for queries on large quantities of data.
Vertica provides three types of projections for storing data that is returned from
aggregate functions or expressions:
l Projection that contains expressions: Projection with columns whose values are
calculated from anchor table columns.
l Live aggregate projection: Projection that contains columns with values that are
aggregated from columns in its anchor table. You can also define live aggregate
projections that include user-defined transform functions.
l Top-K projection: Type of live aggregate projection that returns the top k rows from a
partition of selected rows. Create a Top-K projection that satisfies the criteria for a
Top-K query.
For more information, see Pre-Aggregating Data in Projections.
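As an illustration, the following sketch (the projection name is hypothetical; it assumes
the store.store_orders_fact table used elsewhere in this guide) defines a live
aggregate projection that pre-aggregates ordered quantities by product:

=> CREATE PROJECTION store.order_totals AS
   SELECT product_key, SUM(quantity_ordered) total_qty
   FROM store.store_orders_fact
   GROUP BY product_key;

Queries that sum quantity_ordered by product_key can then read the pre-aggregated
data instead of scanning the fact table.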
Creating Unsegmented Projections
If a CREATE PROJECTION statement specifies neither a hash segmentation clause nor
UNSEGMENTED, Vertica creates an unsegmented projection as follows:
l If projection K-safety is set to 0, Vertica creates the projection on the initiator node.
l If projection K-safety is greater than 0, Vertica creates the projection on all nodes.
You can explicitly specify that a projection be unsegmented with the UNSEGMENTED
option. UNSEGMENTED must be qualified with one of the following keywords:
l NODE node: Creates an unsegmented projection only on the specified node. To
get cluster node names, query the NODES table:
SELECT node_name FROM NODES;
l ALL NODES: Creates identical instances of the unsegmented projection on all cluster
nodes. This option enables distributed query execution for tables too small to benefit
from segmentation. You must set this option if projection or system K-safety is greater
than 0, otherwise Vertica returns an error.
The following example shows how to create an unsegmented projection for the table
store.store_dimension:
VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
AS SELECT store_key, store_name, store_city, store_state
FROM store.store_dimension
UNSEGMENTED ALL NODES;
CREATE PROJECTION
Vertica uses the same name to identify all instances of the unsegmented projection—in
this example, store.store_dimension_proj. For more information about projection
name conventions, see Projection Naming.
Projection Naming
Vertica identifies projections according to the following conventions, where
proj-basename is the name assigned to this projection by CREATE PROJECTION.
Unsegmented Projections
Unsegmented projections conform to the following naming conventions:
table-basename_super
Identifies the auto projection that Vertica automatically creates
when you initially load data into an unsegmented table, where
table-basename is the table name specified in CREATE TABLE.
The auto projection is always a superprojection.
For example:
store.customer_dimension_super

proj-basename[_unseg]
Identifies an unsegmented projection. If proj-basename is
identical to the anchor table name, Vertica appends the string
_unseg to the projection name. If the projection is copied on all
nodes, this projection name maps to all instances.
For example:
store.customer_dimension_unseg

Segmented Projections
Segmented projections conform to the following naming convention:

proj-basename_boffset
Identifies buddy projections for a segmented projection, where
offset identifies the projection's node location relative to all other
buddy projections. All buddy projections share the same projection
base name.
For example:
store.store_orders_fact_b0
store.store_orders_fact_b1

One exception applies: Vertica uses the following convention to
name live aggregate projections:
l proj-basename
l proj-basename_b1
l ...
K-Safe Database Projections
You can set K-safety on individual projections through the CREATE PROJECTION option
KSAFE. Projection K-safety must be equal to or greater than database K-safety. If you
omit setting KSAFE, the projection obtains K-safety from the database.
K-safety is implemented differently for segmented and unsegmented projections, as
described below. The examples assume database K-safety is set to 1 in a 3-node
database, and uses projections for two tables:
l store.store_orders_fact is a large fact table. The projection for this table should
be segmented. Vertica distributes projection segments uniformly across the cluster.
l store.store_dimension is a smaller dimension table. The projection for this table
should be unsegmented. Vertica copies a complete instance of this projection on
each cluster node.
Segmented Projections
If database K-safety is set to 1, the database requires two instances, or buddies, of each
projection segment. The following CREATE PROJECTION defines a segmented
projection for the fact table store.store_orders_fact:
VMart=> CREATE PROJECTION store.store_orders_fact
(prodkey, ordernum, storekey, total)
AS SELECT product_key, order_number, store_key, quantity_ordered*unit_price
FROM store.store_orders_fact
SEGMENTED BY HASH(product_key, order_number) ALL NODES KSAFE 1;
CREATE PROJECTION
Three keywords in the CREATE PROJECTION statement pertain to setting projection K-
safety:
KSAFE 1
Sets K-safety to 1. Vertica automatically creates two instances of the
projection using the following naming convention:
projection-name_bn
where n is a value between 0 and k. Because K-safety is set to 1,
Vertica creates two projections:
l store.store_orders_fact_b0
l store.store_orders_fact_b1

ALL NODES
Specifies to segment projections across all cluster nodes.

HASH
Helps ensure uniform distribution of segmented projection data across
the cluster.
Unsegmented Projections
In a K-safe database, you create an unsegmented projection with a CREATE
PROJECTION statement that includes the keywords UNSEGMENTED ALL NODES. These
keywords specify to create identical instances (buddies) of the entire projection on all
cluster nodes.
The following example shows how to create an unsegmented projection for the table
store.store_dimension:
VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
AS SELECT store_key, store_name, store_city, store_state
FROM store.store_dimension
UNSEGMENTED ALL NODES;
CREATE PROJECTION
Vertica uses the same name to identify all instances of the unsegmented projection—in
this example, store.store_dimension_proj. For more information about projection
name conventions, see Projection Naming.
Updating Projections Using Refresh
CREATE PROJECTION does not load data into physical storage. If the anchor tables
already contain data, run START_REFRESH to update the projection. Depending on how
much data is in the tables, updating a projection can be time consuming. When a
projection is up-to-date, however, it is updated automatically as part of COPY, DELETE,
INSERT, MERGE, or UPDATE statements.
Monitoring Projection Refresh on Buddy Projections
You cannot refresh a projection until after you create a buddy projection. After you run
CREATE PROJECTION, if you run SELECT START_REFRESH(), you see the following
message:
Starting refresh background process
However, the refresh does not begin until after you create the buddy projection. To
monitor the refresh operation, review the vertica.log file. You can also run
GET_PROJECTIONS to view the final status of the projection refresh:
=> SELECT GET_PROJECTIONS('customer_dimension');
GET_PROJECTIONS
----------------------------------------------------------------
Current system K is 1.
# of Nodes: 3.
Table public.customer_dimension has 2 projections.
Projection Name: [Segmented] [Seg Cols] [# of Buddies] [Buddy Projections] [Safe] [UptoDate] [Stats]
----------------------------------------------------------------------------------------------------
public.customer_dimension_unseg [Segmented: No] [Seg Cols: ] [K: 2] [public.customer_dimension_unseg] [Safe: Yes] [UptoDate: Yes] [Stats: Yes]
public.customer_dimension_DBD_1_rep_VMartDesign_node0001 [Segmented: No] [Seg Cols: ] [K: 2] [public.customer_dimension_DBD_1_rep_VMartDesign_node0001] [Safe: Yes] [UptoDate: Yes] [Stats: Yes]
(1 row)
Dropping Projections
Projections can be dropped explicitly through the DROP PROJECTION statement. They
are also implicitly dropped when you drop their anchor table.
Using Table Partitions
Vertica supports data partitioning at the table level, which divides one large table into
smaller pieces. Partitioning is a table property that applies to all projections of a given
table.
It is common to partition data by time slices. For example, if a table contains decades
of data, you can partition it by year. If the table contains only a year of data, it makes
sense to partition it by month.
Partitions can improve parallelism during query execution and enable some other
optimizations. Partitions segregate data on each node to facilitate dropping partitions.
You can drop older data partitions to make room for newer data.
Tip: When a storage container has data for a single partition, you can discard that
storage location (DROP_LOCATION) after dropping the partition with the function
DROP_PARTITION.
Defining Partitions
The first step in defining data partitions is to establish the relationship between the data
and partitions. To illustrate, consider the following table called trade, which contains
unpartitioned data for the trade date (tdate), ticker symbol (tsymbol), and time (ttime).
Table 1: Unpartitioned data
tdate | tsymbol | ttime
------------+---------+----------
2008-01-02 | AAA | 13:00:00
2009-02-04 | BBB | 14:30:00
2010-09-18 | AAA | 09:55:00
2009-05-06 | AAA | 11:14:30
2008-12-22 | BBB | 15:30:00
(5 rows)
If you want to discard data once a year, a logical choice is to partition the table by year.
The partition expression PARTITION BY EXTRACT(year FROM tdate) creates the
partitions shown in Table 2:
Table 2: Data partitioned by year
2008
tdate    | tsymbol | ttime
---------+---------+----------
01/02/08 | AAA     | 13:00:00
12/22/08 | BBB     | 15:30:00

2009
tdate    | tsymbol | ttime
---------+---------+----------
02/04/09 | BBB     | 14:30:00
05/06/09 | AAA     | 11:14:30

2010
tdate    | tsymbol | ttime
---------+---------+----------
09/18/10 | AAA     | 09:55:00
Unlike some databases, which require you to explicitly define partition boundaries in the
CREATE TABLE statement, Vertica selects a partition for each row based on the result
of a partitioning expression provided in the CREATE TABLE statement. Partitions do
not have explicit names associated with them. Internally, Vertica creates a partition for
each distinct value in the PARTITION BY expression.
After you specify a partition expression, Vertica processes the data by applying the
partition expression to each row and then assigning partitions.
The following syntax generates the partitions for this example, with the results shown in
Table 3. It creates a table called trade, partitioned by year. For additional information,
see CREATE TABLE in the SQL Reference Manual.
CREATE TABLE trade (
tdate DATE NOT NULL,
tsymbol VARCHAR(8) NOT NULL,
ttime TIME)
PARTITION BY EXTRACT (year FROM tdate);
CREATE PROJECTION trade_p (tdate, tsymbol, ttime) AS
SELECT * FROM trade
ORDER BY tdate, tsymbol, ttime UNSEGMENTED ALL NODES;
INSERT INTO trade VALUES ('01/02/08' , 'AAA' , '13:00:00');
INSERT INTO trade VALUES ('02/04/09' , 'BBB' , '14:30:00');
INSERT INTO trade VALUES ('09/18/10' , 'AAA' , '09:55:00');
INSERT INTO trade VALUES ('05/06/09' , 'AAA' , '11:14:30');
INSERT INTO trade VALUES ('12/22/08' , 'BBB' , '15:30:00');
Table 3: Partitioning expression and results
Partitioning By Year and Month
To partition by both year and month, you need a partition expression that pads the
month out to two digits so the partition keys appear as:
201101
201102
201103
...
201111
201112
You can use the following partition expression to partition the table using the year and
month:
PARTITION BY EXTRACT(year FROM tdate)*100 + EXTRACT(month FROM tdate)
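For example, the following sketch (a hypothetical trade2 variant of the trade table
above) applies this expression in CREATE TABLE:

CREATE TABLE trade2 (
    tdate DATE NOT NULL,
    tsymbol VARCHAR(8) NOT NULL,
    ttime TIME)
PARTITION BY EXTRACT(year FROM tdate)*100 + EXTRACT(month FROM tdate);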
Restrictions on Partitioning Expressions
l The partitioning expression can reference one or more columns from the table.
l The partitioning expression cannot evaluate to NULL for any row, so do not include
columns that allow a NULL value in the CREATE TABLE...PARTITION BY
expression.
l Any SQL functions in the partitioning expression must be immutable, meaning that
they return the exact same value regardless of when they are invoked, and
independently of session or environment settings, such as LOCALE. For example,
you cannot use the TO_CHAR function in a partition expression, because it depends
on locale settings, or the RANDOM function, since it produces different values at
each invocation.
l Vertica meta-functions cannot be used in partitioning expressions.
l All projections anchored on a table must include all columns referenced in the
PARTITION BY expression; this allows the partition to be calculated.
Best Practices for Partitioning
l While Vertica supports a maximum of 1024 partitions, few, if any, organizations will
need to approach that maximum. Fewer partitions are likely to meet your business
needs, while also ensuring maximum performance. Many customers, for example,
partition their data by month, bringing their partition count to 12. Vertica
recommends that you keep the number of partitions between 10 and 20 to achieve
excellent performance.
l Do not apply partitioning to tables used as dimension tables in pre-join projections.
You can apply partitioning to tables used as large single (fact) tables in pre-join
projections.
l For maximum performance, do not partition projections on LONG VARBINARY and
LONG VARCHAR columns.
Partitioning versus Segmentation
In Vertica, partitioning and segmentation are separate concepts that achieve different
goals for localizing data:
l Segmentation refers to organizing and distributing data across cluster nodes for fast
data purges and query performance. Segmentation aims to distribute data evenly
across multiple database nodes so all nodes participate in query execution. You
specify segmentation with the CREATE PROJECTION statement's hash segmentation
clause.
l Partitioning specifies how to organize data within individual nodes for distributed
computing. Node partitions let you easily identify data you wish to drop and help
reclaim disk space. You specify partitioning with the CREATE TABLE statement's
PARTITION BY clause.
For example, partitioning data by year makes sense for retaining and dropping annual
data. However, segmenting the same data by year would be inefficient, because the
node holding data for the current year would likely answer far more queries than the
other nodes.
The following diagram illustrates the flow of segmentation and partitioning on a four-
node database cluster:
1. Example table data
2. Data segmented by HASH(order_id)
3. Data segmented by hash across four nodes
4. Data partitioned by year on a single node
While partitioning occurs on all four nodes, the illustration shows partitioned data on one
node for simplicity.
See Also
l Reclaiming Disk Space From Deleted Records
l Identical Segmentation
l Projection Segmentation
l CREATE PROJECTION
l CREATE TABLE
Partitioning and Data Storage
Partitions and ROS Containers
l Data is automatically split into partitions during load / refresh / recovery operations.
l The Tuple Mover maintains physical separation of partitions.
l Each ROS container contains data for a single partition; however, multiple ROS
containers can store data for the same partition.
Partition Pruning
When a query predicate references one or more columns in the partitioning expression,
the query examines only the relevant ROS containers. See Partition Elimination for details.
Managing Partitions
Vertica provides various options to let you manage and monitor the partitions you
create.
PARTITION_TABLE Function
The function PARTITION_TABLE physically separates partitions into separate
containers. Only ROS containers that contain more than one distinct partition key
value participate in the split.
The following example creates a simple table states and partitions the data by state:
=> CREATE TABLE states (year INTEGER NOT NULL,
state VARCHAR NOT NULL)
PARTITION BY state;
=> CREATE PROJECTION states_p (state, year) AS
SELECT * FROM states
ORDER BY state, year UNSEGMENTED ALL NODES;
Run PARTITION_TABLE to partition the table states:
=> SELECT PARTITION_TABLE('states');
PARTITION_TABLE
---------------------------------------------------------------------------------
Task: partition operation
(Table: public.states) (Projection: public.states_p)
(1 row)
PARTITIONS System Table
You can display partition metadata, one row per partition key, per ROS container, by
querying the PARTITIONS system table.
Given the unsegmented projection states_p replicated across three nodes, the
following query on the PARTITIONS table returns twelve rows, representing twelve
ROS containers:
=> SELECT PARTITION_KEY, ROS_ID, ROS_SIZE_BYTES, ROS_ROW_COUNT, NODE_NAME FROM partitions WHERE
PROJECTION_NAME='states_p' order by ROS_ID;
PARTITION_KEY | ROS_ID | ROS_SIZE_BYTES | ROS_ROW_COUNT | NODE_NAME
---------------+-------------------+----------------+---------------+------------------
VT | 45035996281231297 | 95 | 14 | v_vmart_node0001
PA | 45035996281231309 | 92 | 11 | v_vmart_node0001
NY | 45035996281231321 | 90 | 9 | v_vmart_node0001
MA | 45035996281231333 | 96 | 15 | v_vmart_node0001
VT | 49539595902704977 | 95 | 14 | v_vmart_node0002
PA | 49539595902704989 | 92 | 11 | v_vmart_node0002
NY | 49539595902705001 | 90 | 9 | v_vmart_node0002
MA | 49539595902705013 | 96 | 15 | v_vmart_node0002
VT | 54043195530075651 | 95 | 14 | v_vmart_node0003
PA | 54043195530075663 | 92 | 11 | v_vmart_node0003
NY | 54043195530075675 | 90 | 9 | v_vmart_node0003
MA | 54043195530075687 | 96 | 15 | v_vmart_node0003
(12 rows)
General Guidelines
Because delete operations must open all ROS containers, avoid creating too many
partitions. For optimal performance, create fewer than 20 partitions; avoid creating
more than 50.
Restrictions
l You cannot use non-deterministic functions in a PARTITION BY expression. For
example, the value of a TIMESTAMPTZ expression depends on user session settings.
l A dimension table in a pre-join projection cannot be partitioned.
Partitioning, Repartitioning, and Reorganizing Tables
Using the ALTER TABLE statement with its PARTITION BY syntax and the optional
REORGANIZE keyword partitions or re-partitions a table according to the partition-clause
that you define in the statement. Vertica immediately drops any existing partition keys
when you execute the statement.
You can use the PARTITION BY and REORGANIZE keywords separately or together.
However, you cannot use these keywords with any other ALTER TABLE clauses.
The following example adds partitioning to the Sales table based on month and
reorganizes the data into partitions:
=> ALTER TABLE Sales PARTITION BY Month REORGANIZE;
Privileges
Partitioning or re-partitioning tables requires USAGE privilege on the schema that
contains the table.
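For example, a superuser might grant that privilege as follows (the user name is
hypothetical):

=> GRANT USAGE ON SCHEMA store TO sales_admin;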
PARTITION BY Expressions
PARTITION BY expressions can specify leaf expressions, functions, and operators.
The following requirements and restrictions apply to PARTITION BY expressions:
l They must calculate a single non-null value for each row. The expression can
reference multiple columns, but each row must return a single value.
l All leaf expressions must be constants or table columns.
l SQL functions must be immutable.
l Aggregate functions and queries are not supported.
Reorganizing Data After Partitioning
Partitioning is not complete until you reorganize the data. The optional REORGANIZE
keyword completes table partitioning by assigning partition keys. You can use
REORGANIZE with PARTITION BY, or as the only keyword in the ALTER TABLE
statement for tables that were previously altered with the PARTITION BY modifier, but
were not reorganized with the REORGANIZE keyword.
If you specify the REORGANIZE keyword, data is partitioned immediately to the new
schema as a background task.
Tip: As a best practice, HPE recommends that you reorganize the data while
partitioning the table, using PARTITION BY with the REORGANIZE keyword. If you do
not specify REORGANIZE, performance for queries, DROP_PARTITION()
operations, and node recovery could be degraded until the data is reorganized.
Also, without reorganizing existing data, new data is stored according to the new
partition expression, while the existing data storage remains unchanged.
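For example, for a table previously altered with PARTITION BY but not yet
reorganized, a statement like the following completes the partitioning:

=> ALTER TABLE Sales REORGANIZE;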
Monitoring Reorganization
When you use ALTER TABLE...REORGANIZE, the operation reorganizes the data
in the background.
You can monitor details of the reorganization process by polling the following system
tables:
l V_MONITOR.PARTITION_STATUS displays the fraction of each table that is
partitioned correctly.
l V_MONITOR.PARTITION_REORGANIZE_ERRORS logs any errors issued by the
background REORGANIZE process.
l V_MONITOR.PARTITIONS displays NULL in the partition_key column for any
ROS containers that have not been reorganized.
Note: The corresponding foreground process to ALTER TABLE ... REORGANIZE is
PARTITION_TABLE().
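For example, to poll the first two monitoring tables while a reorganization runs:

=> SELECT * FROM V_MONITOR.PARTITION_STATUS;
=> SELECT * FROM V_MONITOR.PARTITION_REORGANIZE_ERRORS;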
Auto Partitioning
Vertica attempts to keep data from each partition stored separately. Auto partitioning
occurs when data is written to disk, such as during COPY DIRECT or moveout
operations.
Separate storage provides two benefits: partitions can be dropped quickly, and partition
elimination can omit storage that does not need to participate in a query plan.
Note: If you use INSERT...SELECT in a partitioned table, Vertica sorts the data
before writing it to disk, even if the source of the SELECT has the same sort order as
the destination.
Examples
The examples that follow use this simple schema. First, create a table named t1 and
partition the data on the c2 column:
CREATE TABLE t1 (
c1 INT NOT NULL,
c2 INT NOT NULL)
SEGMENTED BY c1 ALL NODES
PARTITION BY c2;
Create two identically segmented buddy projections:
CREATE PROJECTION t1_p AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 0;
CREATE PROJECTION t1_p1 AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 1;
Insert some data:
INSERT INTO t1 VALUES(10,15);
INSERT INTO t1 VALUES(20,25);
INSERT INTO t1 VALUES(30,35);
INSERT INTO t1 VALUES(40,45);
Query the table to verify the inputs:
SELECT * FROM t1;
c1 | c2
----+----
10 | 15
20 | 25
30 | 35
40 | 45
(4 rows)
Now perform a moveout operation on the projections in the table:
SELECT DO_TM_TASK('moveout','t1');
do_tm_task
--------------------------------
moveout for projection 't1_p1'
moveout for projection 't1_p'
(1 row)
Query the PARTITIONS system table. Notice that the four partition keys reside on two
nodes, each in its own ROS container (see the ros_id column). The PARTITION BY
clause was used on column c2, so Vertica auto partitioned the input values during the
COPY operation:
SELECT partition_key, projection_name, ros_id, ros_size_bytes,
ros_row_count, node_name FROM PARTITIONS
WHERE projection_name like 't1_p1';
partition_key | projection_name | ros_id            | ros_size_bytes | ros_row_count | node_name
---------------+-----------------+-------------------+----------------+---------------+-----------
15 | t1_p1 | 49539595901154617 | 78 | 1 | node0002
25 | t1_p1 | 54043195528525081 | 78 | 1 | node0003
35 | t1_p1 | 54043195528525069 | 78 | 1 | node0003
45 | t1_p1 | 49539595901154605 | 79 | 1 | node0002
(4 rows)
Vertica does not auto partition when you refresh with the same sort order. If you create a
new projection, Vertica returns a message telling you to refresh the projections; for
example:
CREATE PROJECTION t1_p2 AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 2;
WARNING: Projection <public.t1_p2> is not available for query processing. Execute the select
start_refresh() function to copy data into this projection.
The projection must have a sufficient number of buddy projections and all nodes
must be up before starting a refresh.
Run the START_REFRESH function:
SELECT START_REFRESH();
start_Refresh
----------------------------------------
Starting refresh background process.
(1 row)
Query the PARTITIONS system table again. The partition keys now reside in two ROS
containers, instead of four, which you can tell by looking at the values in the ros_id
column. The ros_row_count column holds the number of rows in the ROS container:
SELECT partition_key, projection_name, ros_id, ros_size_bytes,
ros_row_count, node_name FROM PARTITIONS
WHERE projection_name LIKE 't1_p2';
partition_key | projection_name | ros_id            | ros_size_bytes | ros_row_count | node_name
---------------+-----------------+-------------------+----------------+---------------+-----------
15 | t1_p2 | 54043195528525121 | 80 | 2 | node0003
25 | t1_p2 | 58546795155895541 | 77 | 2 | node0004
35 | t1_p2 | 58546795155895541 | 77 | 2 | node0004
45 | t1_p2 | 54043195528525121 | 80 | 2 | node0003
(4 rows)
The following command more specifically queries ROS information for the partitioned
tables. In this example, the query counts two ROS containers each on two different
nodes for projection t1_p2:
SELECT ros_id, node_name, COUNT(*) FROM PARTITIONS
WHERE projection_name LIKE 't1_p2'
GROUP BY ros_id, node_name;
ros_id | node_name | COUNT
-------------------+-----------+-------
54043195528525121 | node0003 | 2
58546795155895541 | node0004 | 2
(2 rows)
This command returns a result of four ROS containers on two different nodes for
projection t1_p1:
SELECT ros_id,node_name, COUNT(*) FROM PARTITIONS
WHERE projection_name LIKE 't1_p1'
GROUP BY ros_id, node_name;
ros_id | node_name | COUNT
-------------------+-----------+-------
49539595901154605 | node0002 | 1
49539595901154617 | node0002 | 1
54043195528525069 | node0003 | 1
54043195528525081 | node0003 | 1
(4 rows)
See Also
l DO_TM_TASK
l PARTITIONS
l START_REFRESH
Eliminating Partitions
When the ROS containers of partitioned tables are not needed to answer a query,
Vertica can eliminate those containers from query processing. To eliminate ROS
containers, Vertica compares query predicates to partition-related metadata.
Each ROS container maintains the minimum and maximum values of the partition
expression column data stored in it, and Vertica uses those min/max values to
eliminate ROS containers from query planning. Partitions that cannot contain matching
values are not scanned. For example, if a ROS container does not contain data that
satisfies a given query predicate, the optimizer eliminates (prunes) that container from
the query plan. After non-participating ROS containers have been eliminated, queries
that use partitioned tables run more quickly.
Note: Partition pruning occurs at query run time and requires a query predicate on
the partitioning column.
Assume a table that is partitioned by year (2007, 2008, 2009) into three ROS containers,
one for each year. Given the following series of commands, the two ROS containers that
contain data for 2007 and 2008 fall outside the boundaries of the requested year (2009)
and get eliminated.
=> CREATE TABLE ... PARTITION BY EXTRACT(year FROM date);
=> SELECT ... WHERE date = '12-2-2009';
Making Past Partitions Eligible for Elimination
The following procedure lets you make past partitions eligible for elimination. The
easiest way to guarantee that all ROS containers are eligible is to:
1. Create a new fact table with the same projections as the existing table.
2. Use INSERT..SELECT to populate the new table.
3. Drop the original table and rename the new table.
If there is not enough disk space for a second copy of the fact table, an alternative is to:
1. Verify that the Tuple Mover has finished all post-upgrade work; for example, when
the following command shows no mergeout activity:
=> SELECT * FROM TUPLE_MOVER_OPERATIONS;
2. Identify which partitions need to be merged to get the ROS minimum/maximum
values by running the following command:
=> SELECT DISTINCT table_schema, projection_name, partition_key
FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v
ON p.ros_id = v.rosid
WHERE v.min_value IS null;
3. Insert a record into each partition that has ineligible ROS containers and commit.
4. Delete each inserted record and commit again.
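For example, a minimal sketch of steps 3 and 4, assuming a hypothetical table
sales(tdate DATE, amount INT) partitioned by year, with an ineligible 2007 partition:
=> INSERT INTO sales VALUES ('2007-06-01', 0);                  -- touch the 2007 partition
=> COMMIT;
=> DELETE FROM sales WHERE tdate = '2007-06-01' AND amount = 0; -- remove the marker row
=> COMMIT;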
At this point, the Tuple Mover automatically merges ROS containers from past partitions.
Verifying the ROS Merge
1. Query the TUPLE_MOVER_OPERATIONS table again:
=> SELECT * FROM TUPLE_MOVER_OPERATIONS;
2. Check again for any partitions that need to be merged:
=> SELECT DISTINCT table_schema, projection_name, partition_key
FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v
ON p.ros_id = v.rosid
WHERE v.min_value IS null;
Examples
Assume a table that is partitioned by time and will use queries that restrict data on time.
CREATE TABLE time (
tdate DATE NOT NULL,
tnum INTEGER)
PARTITION BY EXTRACT(year FROM tdate);
CREATE PROJECTION time_p (tdate, tnum) AS
SELECT * FROM time
ORDER BY tdate, tnum UNSEGMENTED ALL NODES;
Note: Projection sort order has no effect on partition elimination.
INSERT INTO time VALUES ('03/15/04', 1);
INSERT INTO time VALUES ('03/15/05', 2);
INSERT INTO time VALUES ('03/15/06', 3);
INSERT INTO time VALUES ('03/15/06', 4);
The data inserted in the previous series of commands would be loaded into three ROS
containers, one per year, since that is how the data is partitioned:
SELECT * FROM time ORDER BY tnum;
tdate | tnum
------------+------
2004-03-15 | 1 --ROS1 (min 03/15/04, max 03/15/04)
2005-03-15 | 2 --ROS2 (min 03/15/05, max 03/15/05)
2006-03-15 | 3 --ROS3 (min 03/15/06, max 03/15/06)
2006-03-15 | 4 --ROS3 (min 03/15/06, max 03/15/06)
(4 rows)
Here's what happens when you query the time table:
l In this query, Vertica can eliminate ROS2 and ROS3 because it is only looking for year 2004:
=> SELECT COUNT(*) FROM time WHERE tdate = '05/07/2004';
l In the next query, Vertica can eliminate both ROS1 and ROS3:
=> SELECT COUNT(*) FROM time WHERE tdate = '10/07/2005';
l The following query has an additional predicate on the tnum column for which no
minimum/maximum values are maintained. In addition, the use of logical operator OR
is not supported, so no ROS elimination occurs:
=> SELECT COUNT(*) FROM time WHERE tdate = '05/07/2004' OR tnum = 7;
Dropping Partitions
Use the DROP_PARTITION function to drop a partition. Normally, this is a fast
operation that discards all ROS containers that contain data for the partition.
Occasionally, a ROS container contains rows that belong to more than one partition. In
general, Vertica segregates data from different partitions in different ROS containers, but
exceptions can occur. For example, in some cases refresh and recovery operations can
generate ROS containers with mixed partitions. See Auto Partitioning.
The number of partitions that contain data is restricted by the number of ROS containers
that can comfortably exist in the system.
In general, if a ROS container has data that belongs to n+1 partitions and you want to
drop a specific partition, the DROP_PARTITION operation:
1. Forces a split of the data into two containers, where:
n one container holds the data that belongs to the partition that is to be dropped
n the other container holds the remaining n partitions
2. Drops the specified partition.
DROP_PARTITION forces a moveout if there is data in the WOS (WOS is not partition
aware).
DROP_PARTITION acquires an exclusive lock on the table to prevent DELETE |
UPDATE | INSERT | COPY statements from affecting the table, as well as any SELECT
statements issued at SERIALIZABLE isolation level.
Restrictions
l Users must have USAGE privilege on the schema that contains the table.
l DROP_PARTITION operations cannot be performed on tables with projections that
are not up to date (have not been refreshed).
l DROP_PARTITION fails if you do not set the optional third parameter to true and it
encounters ROS containers that do not have partition keys.
Examples
Using the example schema in Defining Partitions, the following command explicitly
drops the 2009 partition key from table trade:
SELECT DROP_PARTITION('trade', 2009);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
Here, the partition key is computed from a date expression:
SELECT DROP_PARTITION('trade', EXTRACT('year' FROM '2009-01-01'::date));
DROP_PARTITION
-------------------
Partition dropped
(1 row)
The following example creates a table called dates and partitions the table by year:
CREATE TABLE dates (year INTEGER NOT NULL,
month VARCHAR(8) NOT NULL)
PARTITION BY year * 12 + month;
The following statement drops the partition using a constant for Oct 2010 (2010*12 + 10
= 24130):
SELECT DROP_PARTITION('dates', '24130');
DROP_PARTITION
-------------------
Partition dropped
(1 row)
Alternatively, you can place the expression in line:
SELECT DROP_PARTITION('dates', 2010*12 + 10);
The following command first reorganizes the data if it is unpartitioned and then explicitly
drops the 2009 partition key from table trade:
SELECT DROP_PARTITION('trade', 2009, false, true);
DROP_PARTITION
-------------------
Partition dropped
(1 row)
See Also
DROP_PARTITION
Archiving Partitions
You can move partitions from one table to another with the Vertica function MOVE_
PARTITIONS_TO_TABLE. This function is useful for archiving old partitions, as part of the
following procedure:
1. Identify the partitions to archive, and move them to a temporary staging table with
MOVE_PARTITIONS_TO_TABLE.
2. Back up the staging table.
3. Drop the staging table.
You can retrieve and restore archived partitions at any time, as described in Restoring
Archived Partitions.
For general information about moving partitions, see MOVE_PARTITIONS_TO_TABLE.
Move Partitions to Staging Tables
You archive historical data by identifying the partitions you wish to remove from a table.
You then move each partition (or group of partitions) to a temporary staging table.
Before calling MOVE_PARTITIONS_TO_TABLE, you must:
l Drop any pre-join projections associated with the source table.
l Refresh all out-of-date projections.
The following recommendations apply to staging tables:
l To facilitate the backup process, create a unique schema for the staging table of each
archiving operation.
l Specify new names for staging tables. This ensures that they do not contain partitions
from previous move operations.
If the table does not exist, MOVE_PARTITIONS_TO_TABLE creates a table from the
source table's definition by calling CREATE TABLE with the LIKE and INCLUDING
PROJECTIONS clauses. The new table inherits ownership from the source table. For
detailed information about attributes that are copied from source tables to new
staging tables, see Creating a Table Like Another.
l Use staging names that enable other users to easily identify partition contents. For
example, if a table is partitioned by dates, use a name that specifies a date or date
range.
In the following example, MOVE_PARTITIONS_TO_TABLE moves a single partition to
the staging table partn_backup.trades_200801.
=> SELECT MOVE_PARTITIONS_TO_TABLE (
'prod_trades',
'200801',
'200801',
'partn_backup.trades_200801');
MOVE_PARTITIONS_TO_TABLE
-------------------------------------------------
1 distinct partition values moved at epoch 15.
(1 row)
Back Up the Staging Table
After you create a staging table, you archive it through an object-level backup using a
vbr configuration file. For detailed information, see Backing Up and Restoring the
Database.
Important: Vertica recommends performing a full database backup before the
object-level backup, as a precaution against data loss. You can only restore object-
level backups to the original database.
Drop the Staging Tables
After the backup is complete, you can drop the staging table as described in Dropping
Tables.
See Also
MOVE_PARTITIONS_TO_TABLE
Swapping Partitions
SWAP_PARTITIONS_BETWEEN_TABLES combines the operations of DROP_PARTITION
and MOVE_PARTITIONS_TO_TABLE as a single transaction. SWAP_PARTITIONS_
BETWEEN_TABLES is useful if you regularly load partitioned data from one table into
another and need to refresh partitions in the second table.
For example, you might have a table of revenue that is partitioned by date, and you
routinely move data into it from a staging table. Occasionally, the staging table contains
data for dates that are already in the target table. In this case, you must first remove
partitions from the target table for those dates, then replace them with the corresponding
partitions from the staging table. You can accomplish both tasks with a single call to
SWAP_PARTITIONS_BETWEEN_TABLES.
By wrapping the drop and move operations within a single transaction, SWAP_
PARTITIONS_BETWEEN_TABLES maintains integrity of the swapped data. If any task in
the swap operation fails, the entire operation fails and is rolled back.
Requirements and Restrictions
See SWAP_PARTITIONS_BETWEEN_TABLES.
Examples
In the following example, SWAP_PARTITIONS_BETWEEN_TABLES drops from table
member_info all partitions in the range specified by partition keys 2008 and 2009. It
replaces the dropped partitions with the corresponding partitions in source table
customer_info:
=> SELECT SWAP_PARTITIONS_BETWEEN_TABLES('customer_info',2008,2009,'member_info');
SWAP_PARTITIONS_BETWEEN_TABLES
-----------------------------------------------------------------------------------
1 partition values from table customer_info and 2 partition values from table
member_info are swapped at epoch 1250.
See Also
Tutorial for Swapping Partitions
Tutorial for Swapping Partitions
The following example shows how to create two partitioned tables and then swap
certain partitions between the tables.
Both tables have the same definition and have partitions for various year values. You
swap the partitions where year = 2008 and year = 2009. Both tables have at least two
rows that will be swapped.
1. Create the customer_info table:
=> CREATE TABLE customer_info (
customer_id INT PRIMARY KEY NOT NULL,
first_name VARCHAR(25),
last_name VARCHAR(35),
city VARCHAR(25),
year INT NOT NULL)
ORDER BY last_name
PARTITION BY year;
2. Insert data into the customer_info table:
=> INSERT INTO customer_info VALUES (1, 'Joe', 'Smith', 'Denver', 2008);
=> INSERT INTO customer_info VALUES (2, 'Bob', 'Jones', 'Boston', 2008);
=> INSERT INTO customer_info VALUES (3, 'Silke', 'Muller', 'Frankfurt', 2007);
=> INSERT INTO customer_info VALUES (4, 'Simone', 'Bernard', 'Paris', 2014);
=> INSERT INTO customer_info VALUES (5, 'Vijay', 'Kumar', 'New Delhi', 2010);
3. View the table data:
=> SELECT * FROM customer_info;
customer_id | first_name | last_name | city | year
------------+------------+-----------+-----------+------
1 | Joe | Smith | Denver | 2008
2 | Bob | Jones | Boston | 2008
3 | Silke | Muller | Frankfurt | 2007
4 | Simone | Bernard | Paris | 2014
5 | Vijay | Kumar | New Delhi | 2010
4. Create a second table, member_info, that has the same definition as customer_
info:
=> CREATE TABLE member_info (
customer_id INT PRIMARY KEY NOT NULL,
first_name VARCHAR(25),
last_name VARCHAR(35),
city VARCHAR(25),
year INT NOT NULL)
ORDER BY last_name
PARTITION BY year;
5. Insert data into the member_info table:
=> INSERT INTO member_info VALUES (1, 'Jane', 'Doe', 'Miami', 2001);
=> INSERT INTO member_info VALUES (2, 'Mike', 'Brown', 'Chicago', 2014);
=> INSERT INTO member_info VALUES (3, 'Patrick', 'OMalley', 'Dublin', 2008);
=> INSERT INTO member_info VALUES (4, 'Ana', 'Lopez', 'Madrid', 2009);
=> INSERT INTO member_info VALUES (5, 'Mike', 'Green', 'New York', 2008);
6. View the data in the member_info table:
=> SELECT * FROM member_info;
customer_id | first_name | last_name | city | year
-------------+------------+-----------+----------+------
1 | Jane | Doe | Miami | 2001
2 | Mike | Brown | Chicago | 2014
3 | Patrick | OMalley | Dublin | 2008
4 | Ana | Lopez | Madrid | 2009
5 | Mike | Green | New York | 2008
7. To swap the partitions, run the SWAP_PARTITIONS_BETWEEN_
TABLES function:
=> SELECT SWAP_PARTITIONS_BETWEEN_TABLES('customer_info',2008,2009,'member_info');
8. To verify that the partitions have been swapped, query the contents of both tables,
and confirm the following results:
n After the swap, the rows in both tables whose year values are 2008 or 2009 have
moved to the other table.
n The partitions that contain the rows for Ana Lopez, Mike Green, and Patrick
OMalley moved from member_info to customer_info.
n The partition that contains the rows for Joe Smith and Bob Jones moved from
customer_info to member_info.
=> SELECT * FROM customer_info;
customer_id | first_name | last_name | city | year
-------------+------------+-----------+-----------+------
4 | Simone | Bernard | Paris | 2014
5 | Vijay | Kumar | New Delhi | 2010
3 | Silke | Muller | Frankfurt | 2007
4 | Ana | Lopez | Madrid | 2009
5 | Mike | Green | New York | 2008
3 | Patrick | OMalley | Dublin | 2008
=> SELECT * FROM member_info;
customer_id | first_name | last_name | city | year
-------------+------------+-----------+---------+------
2 | Bob | Jones | Boston | 2008
1 | Joe | Smith | Denver | 2008
2 | Mike | Brown | Chicago | 2014
1 | Jane | Doe | Miami | 2001
Restoring Archived Partitions
You can restore partitions that you previously moved to an intermediate table, archived
as an object-level backup, and then dropped.
Note: Restoring an archived partition requires that the original table definition has
not changed since the partition was archived and dropped. If you have changed the
table definition, you can only restore an archived partition using INSERT/SELECT
statements, which are not described here.
These are the steps to restoring archived partitions:
1. Restore the backup of the intermediate table you saved when you moved one or
more partitions to archive (see Archiving Partitions).
2. Move the restored partitions from the intermediate table to the original table.
3. Drop the intermediate table.
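Using the names from the archiving example (prod_trades and
partn_backup.trades_200801), a minimal sketch of steps 2 and 3 might look like this:
=> SELECT MOVE_PARTITIONS_TO_TABLE (
     'partn_backup.trades_200801',
     '200801',
     '200801',
     'prod_trades');
=> DROP TABLE partn_backup.trades_200801;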
About Constraints
Constraints specify rules on what values can go into a column. Some examples of
constraints are:
l Primary or foreign key
l Uniqueness
l Not NULL
l Default values
l Automatically incrementing values
l Values generated by the database
Using constraints can help you maintain data integrity in one or more columns. Do not
define constraints on columns unless you expect to keep the data consistent.
Vertica can use constraints to perform optimizations (such as the optimized MERGE) that
assume the data is consistent.
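For example, a minimal sketch of a hypothetical table that combines several of these
constraint types:
CREATE TABLE orders ( order_id IDENTITY,                 -- automatically incrementing values
                      status VARCHAR(10) DEFAULT 'open', -- default value
                      placed DATE NOT NULL,              -- cannot be NULL
                      tracking_code INTEGER UNIQUE       -- must be unique
);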
Adding Constraints
Add constraints on one or more table columns using the following SQL commands:
l CREATE TABLE: Add a constraint on one or more columns.
l ALTER TABLE: Add or drop a constraint on one or more columns.
There are two syntax definitions you can use to add or change a constraint:
l Column-Constraint: Use this syntax when you add a constraint on a column definition
in a CREATE TABLE statement.
l Table-Constraint: Use this syntax when you add a constraint after a column definition
in a CREATE TABLE statement, or when you add, alter, or drop a constraint on a
column using ALTER TABLE.
Vertica recommends naming constraints, but doing so is optional; if you specify the
CONSTRAINT keyword, you must give the constraint a name.
The examples that follow illustrate several ways of adding constraints. For additional
details, see:
l Primary Key Constraints
l Foreign Key Constraints
l Unique Constraints
l Not NULL Constraints
Adding Column Constraints with CREATE TABLE
There are several ways to add a constraint on a column using CREATE TABLE:
l On the column definition using the CONSTRAINT keyword, which requires that you
assign a constraint name, in this example, dim1PK:
CREATE TABLE dim1 ( c1 INTEGER CONSTRAINT dim1PK PRIMARY KEY,
c2 INTEGER
);
l On the column definition, omitting the CONSTRAINT keyword. When you omit the
CONSTRAINT keyword, you cannot specify a constraint name:
CREATE TABLE dim1 ( c1 INTEGER PRIMARY KEY,
c2 INTEGER
);
l After the column definition, using the CONSTRAINT keyword and assigning a name,
in this example, dim1PK:
CREATE TABLE dim1 ( c1 INTEGER,
c2 INTEGER,
CONSTRAINT dim1PK PRIMARY KEY(c1)
);
l After the column definition, omitting the CONSTRAINT keyword:
CREATE TABLE dim1 ( c1 INTEGER,
c2 INTEGER,
PRIMARY KEY(c1)
);
Adding Two Constraints on a Column
To add more than one constraint on a column, specify the constraints one after another
when you create the table column. For example, the following statement enforces both
not NULL and unique constraints on the id column, indicating that the
column values cannot be NULL and must be unique:
CREATE TABLE test1 ( id INTEGER NOT NULL UNIQUE,
...
);
Adding a Foreign Key Constraint on a Column
There are four ways to add a foreign key constraint on a column using CREATE
TABLE. The FOREIGN KEY keywords are not valid on the column definition, only after
the column definition:
l On the column definition, use the CONSTRAINT and REFERENCES keywords and
name the constraint, in this example, fact1dim1FK. This example creates a column
with a named foreign key constraint referencing the table (dim1) with the primary key
(c1):
CREATE TABLE fact1 ( c1 INTEGER CONSTRAINT fact1dim1FK REFERENCES dim1(c1),
c2 INTEGER
);
l On the column definition, omit the CONSTRAINT keyword and use the
REFERENCES keyword with the table name and column:
CREATE TABLE fact1 ( c1 INTEGER REFERENCES dim1(c1),
c2 INTEGER
);
l After the column definition, use the CONSTRAINT, FOREIGN KEY, and
REFERENCES keywords and name the constraint:
CREATE TABLE fact1 ( c1 INTEGER,
c2 INTEGER,
CONSTRAINT fk1 FOREIGN KEY(c1) REFERENCES dim1(c1)
);
l After the column definition, omitting the CONSTRAINT keyword:
CREATE TABLE fact1 ( c1 INTEGER,
c2 INTEGER,
FOREIGN KEY(c1) REFERENCES dim1(c1)
);
Each of the following ALTER TABLE statements adds a foreign key constraint on an
existing column, with and without using the CONSTRAINT keyword:
ALTER TABLE fact2
ADD CONSTRAINT fk1 FOREIGN KEY (c1) REFERENCES dim2(c1);
or
ALTER TABLE fact2 ADD FOREIGN KEY (c1) REFERENCES dim2(c1);
For additional details, see Foreign Key Constraints.
Adding Multicolumn Constraints
The following example defines a primary key constraint on multiple columns by first
defining the table columns (c1 and c2), and then specifying both columns in a
PRIMARY KEY clause:
CREATE TABLE dim ( c1 INTEGER,
c2 INTEGER,
PRIMARY KEY (c1, c2)
);
To specify multicolumn (compound) primary keys, the following example uses CREATE
TABLE to define the columns. After creating the table, ALTER TABLE defines the
compound primary key and names it dim2PK:
CREATE TABLE dim2 ( c1 INTEGER,
c2 INTEGER,
c3 INTEGER NOT NULL,
c4 INTEGER UNIQUE
);
ALTER TABLE dim2
ADD CONSTRAINT dim2PK PRIMARY KEY (c1, c2);
In the next example, you define a compound primary key as part of the CREATE TABLE
statement. Then you specify the matching foreign key constraint to table dim2 using
CREATE TABLE and ALTER TABLE:
CREATE TABLE dim2 ( c1 INTEGER,
c2 INTEGER,
c3 INTEGER NOT NULL,
c4 INTEGER UNIQUE,
PRIMARY KEY (c1, c2)
);
CREATE TABLE fact2 (
c1 INTEGER,
c2 INTEGER,
c3 INTEGER NOT NULL,
c4 INTEGER UNIQUE
);
ALTER TABLE fact2
ADD CONSTRAINT fact2FK FOREIGN KEY (c1, c2) REFERENCES dim2(c1, c2);
Specify a foreign key constraint using a reference to the table that contains the primary
key. In the ADD CONSTRAINT clause, the REFERENCES column names are optional.
The following ALTER TABLE statement is equivalent to the previous ALTER TABLE
statement:
ALTER TABLE fact2 ADD CONSTRAINT fact2FK FOREIGN KEY (c1, c2) REFERENCES dim2;
Adding Constraints on Tables with Existing Data
When you add a constraint on a column with existing data, Vertica:
l Verifies the validity of the column values only if you are adding a PRIMARY or
UNIQUE key enabled for automatic enforcement.
l Does not verify the validity of column values for other constraint types.
If your data does not conform to the declared constraints, your queries could yield
unexpected results.
Use ANALYZE_CONSTRAINTS to check for constraint violations in your column. If you
find violations, use the ALTER COLUMN SET/DROP parameters of the ALTER TABLE
statement to apply or remove a constraint on an existing column.
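For example, a minimal check-and-fix sequence, assuming a hypothetical table T1
whose column x carries a NOT NULL constraint that its existing data violates:
=> SELECT ANALYZE_CONSTRAINTS('T1');           -- reports the violating rows
=> ALTER TABLE T1 ALTER COLUMN x DROP NOT NULL; -- or fix the data and keep the constraint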
Note: You can configure your system to automatically enforce primary and unique
key constraints during DML. For information on automatic enforcement, see
Enforcing Primary and Unique Key Constraints Automatically.
Altering Column Constraints
The following example uses ALTER TABLE to add column b, with NOT NULL and DEFAULT 5
constraints, to table test6:
CREATE TABLE test6 (a INT);
ALTER TABLE test6 ADD COLUMN b INT DEFAULT 5 NOT NULL;
Use ALTER TABLE with the ALTER COLUMN and SET NOT NULL clauses to add the
constraint on column a in table test6:
ALTER TABLE test6 ALTER COLUMN a SET NOT NULL;
Use the SET NOT NULL or DROP NOT NULL clause to add or remove a not NULL column
constraint:
=> ALTER TABLE T1 ALTER COLUMN x SET NOT NULL;
=> ALTER TABLE T1 ALTER COLUMN x DROP NOT NULL;
Use these clauses to ensure that the column has the proper constraints after you have
added or removed a primary key constraint on it. You can also use them any time you
want to add or remove the NOT NULL constraint.
Note: A PRIMARY KEY constraint includes a NOT NULL constraint. However, if you
drop the PRIMARY KEY constraint on a column, the NOT NULL constraint remains on
that column.
Enforcing Constraints
Note: This section assumes you have not configured your system to automatically
enforce primary and unique key constraints. For primary and unique keys, you can
enforce constraints either with or without automatic enforcement. If
you prefer to use automatic enforcement for primary and unique keys, see Enforcing
Primary and Unique Key Constraints Automatically.
To maximize query performance, Vertica checks for primary key and foreign key
violations when loading into the fact table of a pre-join projection. For more details, see
Enforcing Primary Key and Foreign Key Constraints.
Vertica checks for not NULL constraint violations when loading data, but it does not
check for unique constraint violations.
To enforce constraints, load data without committing it by using the COPY statement with the NO
COMMIT option. Then perform a post-load check using the ANALYZE_CONSTRAINTS
function. If constraint violations are found, you can roll back the load because you have
not committed it. For more details, see Detecting Constraint Violations with ANALYZE_
CONSTRAINTS.
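For example, a minimal sketch of this pattern, assuming a hypothetical data file
/tmp/trades.dat:
=> COPY trade FROM '/tmp/trades.dat' NO COMMIT; -- hypothetical path
=> SELECT ANALYZE_CONSTRAINTS('trade');
=> ROLLBACK; -- if violations were reported; otherwise issue COMMIT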
See Also
l ALTER TABLE
l CREATE TABLE
l COPY
l ANALYZE_CONSTRAINTS
Primary Key Constraints
A primary key (PK) is a single column or combination of columns (called a compound
key) that uniquely identifies each row in a table. A primary key constraint contains
unique, non-null values.
When you apply the primary key constraint, the NOT NULL and unique constraints are
added implicitly. You do not need to specify them when you create the column.
However, if you remove the primary key constraint, the NOT NULL constraint continues
to apply to the column. To remove the NOT NULL constraint after removing the primary
key constraint, use the ALTER COLUMN DROP NOT NULL parameter of the ALTER
TABLE statement (see Dropping Constraints).
The following example shows how you can add a primary key constraint on the
employee_id field:
CREATE TABLE employees ( employee_id INTEGER PRIMARY KEY
);
Alternatively, you can add a primary key constraint after the column is created:
CREATE TABLE employees ( employee_id INTEGER
);
ALTER TABLE employees
ADD PRIMARY KEY (employee_id);
Note: If you specify a primary key constraint using ALTER TABLE, the system
returns the following message, which is informational only. The primary key
constraint is added to the designated column.
WARNING 2623: Column "employee_id" definition changed to NOT NULL
You can also use primary keys to constrain more than one column:
CREATE TABLE employees ( employee_id INTEGER,
employee_gender CHAR(1),
PRIMARY KEY (employee_id, employee_gender)
);
When you enable automatic enforcement of primary or unique key constraints, Vertica
applies enforcement for:
l INSERT
l UPDATE
l MERGE
l COPY
l COPY_PARTITIONS_TO_TABLE
l MOVE_PARTITIONS_TO_TABLE
l SWAP_PARTITIONS_BETWEEN_TABLES
Alternatively, rather than automatic enforcement, you can use ANALYZE_
CONSTRAINTS to validate primary and unique key constraints after issuing these
statements. For more information on enabling and disabling primary key constraints,
refer to Enforcing Primary and Unique Key Constraints Automatically.
Foreign Key Constraints
A foreign key (FK) is a column that is used to join a table to other tables to ensure
referential integrity of the data. A foreign key constraint requires that a column contain
only values from the primary key column on a specific dimension table.
You can create a foreign key constraint in the CREATE TABLE statement, or you can
define a foreign key constraint using ALTER TABLE.
A column with a foreign key constraint can contain NULL values if it does not also have
a not NULL constraint, even though the NULL value does not appear in the PRIMARY
KEY column of the dimension table. This allows rows to be inserted into the table even
if the foreign key is not yet known.
In Vertica, the fact table's join columns are required to have foreign key constraints in
order to participate in pre-join projections. If the fact table join column has a foreign key
constraint, outer join queries produce the same result set as inner join queries.
You can add a FOREIGN KEY constraint solely by referencing the table that contains
the primary key. The columns in the referenced table do not need to be specified
explicitly.
Examples
Create a table called inventory to store inventory data:
CREATE TABLE inventory ( date_key INTEGER NOT NULL,
product_key INTEGER NOT NULL,
warehouse_key INTEGER NOT NULL,
...
);
Create a table called warehouse to store warehouse information:
CREATE TABLE warehouse ( warehouse_key INTEGER NOT NULL PRIMARY KEY,
warehouse_name VARCHAR(20),
...
);
To ensure referential integrity between the inventory and warehouse tables, define a
foreign key constraint called fk_inventory_warehouse on the inventory table that
references the warehouse table:
ALTER TABLE inventory ADD CONSTRAINT fk_inventory_warehouse FOREIGN KEY(warehouse_key)
REFERENCES warehouse(warehouse_key);
In this example, the inventory table is the referencing table and the warehouse table
is the referenced table.
You can also create the foreign key constraint in the CREATE TABLE statement that
creates the inventory table, eliminating the need for the ALTER TABLE statement. If
you do not specify one or more columns, the PRIMARY KEY of the referenced table is
used:
CREATE TABLE inventory (date_key INTEGER NOT NULL, product_key INTEGER NOT NULL,
warehouse_key INTEGER NOT NULL REFERENCES warehouse
(warehouse_key),
...
);
A foreign key can also constrain and reference multiple columns. The following example
uses CREATE TABLE to add a foreign key constraint to a pair of columns:
CREATE TABLE t1 ( c1 INTEGER PRIMARY KEY,
c2 INTEGER,
c3 INTEGER,
FOREIGN KEY (c2, c3) REFERENCES other_table (c1, c2)
);
The following two examples use ALTER TABLE to add a foreign key constraint to a pair
of columns. When you use the CONSTRAINT keyword, you must specify a constraint
name:
ALTER TABLE t
ADD FOREIGN KEY (a, b) REFERENCES other_table(c, d);
ALTER TABLE t
ADD CONSTRAINT fk_cname FOREIGN KEY (a, b) REFERENCES other_table(c, d);
Note: The FOREIGN KEY keywords are valid only after the column definition, not
on the column definition.
Unique Constraints
Unique constraints ensure that the data contained in a column or a group of columns is
unique with respect to all rows in the table.
How to Verify Unique Constraints
Vertica allows you to add a (non-enabled) unique constraint to a column. You can then
insert data into that column even if the new values are not unique with respect to
existing values in that column. If your data does not conform to the declared
constraints, your queries could yield unexpected results.
You can use ANALYZE_CONSTRAINTS to check for constraint violations, or you can
enable automatic enforcement of unique key constraints. For more information on
enabling and disabling unique key constraints, refer to Enforcing Primary and Unique
Key Constraints Automatically.
Add Unique Column Constraints
There are several ways to add a unique constraint on a column. If you use the
CONSTRAINT keyword, you must specify a constraint name. The following example
adds a UNIQUE constraint on the product_key column and names it product_key_
UK:
CREATE TABLE product ( product_key INTEGER NOT NULL CONSTRAINT product_key_UK UNIQUE,
...
);
Vertica recommends naming constraints, but it is optional:
CREATE TABLE product ( product_key INTEGER NOT NULL UNIQUE,
...
);
You can specify the constraint after the column definition, with and without naming it:
CREATE TABLE product ( product_key INTEGER NOT NULL,
...,
CONSTRAINT product_key_uk UNIQUE (product_key)
);
CREATE TABLE product (
product_key INTEGER NOT NULL,
...,
UNIQUE (product_key)
);
You can also use ALTER TABLE to specify a unique constraint. This example names
the constraint product_key_UK:
ALTER TABLE product ADD CONSTRAINT product_key_UK UNIQUE (product_key);
You can use CREATE TABLE and ALTER TABLE to specify unique constraints on
multiple columns. If a unique constraint refers to a group of columns, separate the
column names using commas. The column listing specifies that the combination of
values in the indicated columns is unique across the whole table, though any one of the
columns need not be (and ordinarily isn't) unique:
CREATE TABLE dim1 ( c1 INTEGER,
c2 INTEGER,
c3 INTEGER,
UNIQUE (c1, c2)
);
Not NULL Constraints
A not NULL constraint specifies that a column cannot contain a null value. This means
that new rows cannot be inserted or updated unless you specify a value for this column.
You can apply the not NULL constraint when you create a column in a new table, and
when you add a column to an existing table (ALTER TABLE..ADD COLUMN). You can
also add or drop the not NULL constraint on an existing column:
l ALTER TABLE t ALTER COLUMN x SET NOT NULL
l ALTER TABLE t ALTER COLUMN x DROP NOT NULL
Important: Using the [SET | DROP] NOT NULL clause does not validate whether
column data conforms to the NOT NULL constraint. Use ANALYZE_CONSTRAINTS to
check for constraint violations in a table.
The not NULL constraint is implicitly applied to a column when you add the PRIMARY
KEY (PK) constraint. When you designate a column as a primary key, you do not need
to specify the not NULL constraint.
However, if you remove the primary key constraint, the not NULL constraint still applies
to the column. Use the ALTER COLUMN..DROP NOT NULL clause of the ALTER TABLE
statement to drop the not NULL constraint after dropping the primary key constraint.
The following statement enforces a not NULL constraint on the customer_key column,
specifying that the column cannot accept NULL values.
CREATE TABLE customer ( customer_key INTEGER NOT NULL,
...
);
Dropping Constraints
To drop named constraints, use the ALTER TABLE command.
The following example drops the constraint fact2fk:
=> ALTER TABLE fact2 DROP CONSTRAINT fact2fk;
To drop constraints that you did not assign a name to, query the system table TABLE_
CONSTRAINTS, which returns both system-generated and user-named constraint
names:
=> SELECT * FROM TABLE_CONSTRAINTS;
If you do not specify a constraint name, Vertica assigns a constraint name that is unique
to that table. In the following output, note the system-generated constraint name C_
PRIMARY and the user-defined constraint name fk_inventory_date:
-[ RECORD 1 ]--------+--------------------------
constraint_id | 45035996273707984
constraint_name | C_PRIMARY
constraint_schema_id | 45035996273704966
constraint_key_count | 1
foreign_key_count | 0
table_id | 45035996273707982
foreign_table_id | 0
constraint_type | p
-[ ... ]---------+--------------------------
-[ RECORD 9 ]--------+--------------------------
constraint_id | 45035996273708016
constraint_name | fk_inventory_date
constraint_schema_id | 0
constraint_key_count | 1
foreign_key_count | 1
table_id | 45035996273708014
foreign_table_id | 45035996273707994
constraint_type | f
Once you know the name of the constraint, you can then drop it using the ALTER
TABLE command. (If you do not know the table name, use table_id to retrieve table_
name from the ALL_TABLES table.)
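For example, after identifying a system-generated name such as C_PRIMARY, you can
drop it (assuming here that it belongs to the inventory table):
=> ALTER TABLE inventory DROP CONSTRAINT C_PRIMARY;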
Remove NOT NULL Constraints
When a column is a primary key and you drop the primary key constraint, the column
retains the NOT NULL constraint. To specify that the column can now contain NULL
values, remove the NOT NULL constraint using the DROP NOT NULL clause:
ALTER TABLE T1 ALTER COLUMN x DROP NOT NULL;
Important: Using the [SET | DROP] NOT NULL clause does not validate whether
the column data conforms to the NOT NULL constraint. Use ANALYZE_
CONSTRAINTS to check for constraint violations in a table.
Limitations of Dropping Constraints
You cannot drop a primary key constraint if there is another table with a foreign key
constraint that references the primary key.
You cannot drop a foreign key constraint if there are any pre-join projections on the
table.
If you drop a primary or foreign key constraint, the system does not automatically drop
the not NULL constraint on a column. You need to manually drop this constraint if you
no longer want it.
If you drop an enabled PRIMARY or UNIQUE key constraint, the system drops the
associated projection if one was automatically created.
See Also
l ALTER TABLE
Enforcing Primary Key and Foreign Key Constraints
Enforcing (Non-Enabled) Primary Key Constraints
Unless you enable enforcement of primary key constraints, Vertica does not enforce the
uniqueness of primary key values when they are loaded into a table. Thus, a key
enforcement error can occur unless one dimension row uniquely matches each foreign
key value when:
l Data is loaded into a table with a pre-joined dimension.
l The table is joined to a dimension table during a query.
Note: Consider using sequences or auto-incrementing columns for primary key
columns, which guarantee uniqueness and avoid the constraint enforcement
problem and associated overhead. For more information, see Using Sequences.
For information on automatic enforcement of primary key constraints during DML, see
Enforcing Primary and Unique Key Constraints Automatically.
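For example, a minimal sketch of a sequence-backed key, using a hypothetical
sequence named order_seq:
=> CREATE SEQUENCE order_seq;
=> CREATE TABLE order_keys ( order_id INTEGER DEFAULT NEXTVAL('order_seq') PRIMARY KEY,
                             amount INTEGER
);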
Enforcing Foreign Key Constraints
A table's foreign key constraints are enforced during data load only if there is a pre-join
projection that has that table as its anchor table. If no such pre-join projection exists,
then it is possible to load data that causes a constraint violation. Subsequently, a
constraint violation error can happen when:
l An inner join query is processed.
l An outer join is treated as an inner join due to the presence of a foreign key constraint.
l A new pre-join projection anchored on the table with the foreign key constraint is
refreshed.
Detecting Constraint Violations Before You Commit Data
To detect constraint violations, you can load data without committing it using the COPY
statement with the NO COMMIT option, and then perform a post-load check using the
ANALYZE_CONSTRAINTS function. If constraint violations exist, you can roll back the
load because you have not committed it. For more details, see Detecting Constraint
Violations with ANALYZE_CONSTRAINTS.
You can also configure your system to automatically enforce primary and unique key
constraints during DML. For information on automatic enforcement, see Enforcing
Primary and Unique Key Constraints Automatically.
Enforcing Primary and Unique Key Constraints
Automatically
When you create a new constraint with CREATE TABLE or ALTER TABLE, you can
specify whether the constraint will be automatically enforced. You can also alter a
constraint with ALTER TABLE (using the ALTER CONSTRAINT parameter) and
specify whether it will be automatically enforced. You enable or disable individual
constraints specifically using the ENABLED or DISABLED options.
In addition, you can create multi-column constraints with CREATE TABLE or ALTER
TABLE. All constraints are defined at the table level.
By checking any system table with an is_enabled column, you can confirm whether a
PRIMARY or UNIQUE key constraint is currently enabled. The system tables that
include the is_enabled column are CONSTRAINT_COLUMNS, TABLE_
CONSTRAINTS, and PRIMARY_KEYS.
Automatic enforcement applies to current table content and content you later add to the
table.
l Enabling a Constraint on an Empty Table — If you create an enabled constraint on
an empty table, the constraint is enforced on any content you later add to that table.
l Enabling a Constraint on a Populated Table — If you use ALTER TABLE to either
enable an existing constraint or add a new constraint that is enabled, the constraint is
immediately enforced for the current content, and is enforced for content you
subsequently add to the table.
Important: If validation of the current content fails, Vertica completely rolls back the
ALTER TABLE DDL statement that caused the failure.
If you do not specify the ENABLED or DISABLED option when you create a constraint,
the system relies on the setting of the configuration parameters
EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault. If, for
example, you specifically create a new primary key constraint but do not enable or
disable it, the system relies on the value of the parameter
EnableNewPrimaryKeysByDefault. If the parameter is set to 1 (enabled), the constraint
you created is automatically enforced even though you did not specifically enable it
when you created it.
l Enable or Disable a Constraint When Creating or Altering — You can specifically
enable or disable when you create or alter the constraint. If you do so, the constraint
remains enabled or disabled regardless of the setting of the parameters
EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault.
l Creating or Altering a Constraint Without Enabling — You can also create a
constraint or alter a constraint without specifically enabling or disabling it using the
ENABLED or DISABLED options. If you do so, Vertica looks at the setting of the
parameters at the moment you create or alter the constraint to determine whether that
constraint is enabled or disabled.
Important: When creating or altering a constraint without enabling it, Vertica uses
the EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault
settings that are in effect at the time of creation or alteration.
The following figure summarizes primary or unique key constraint enablement.
Enabling or Disabling Automatic Enforcement of Individual
Constraints
To enable or disable individual constraints, use the CREATE TABLE or ALTER
TABLE statement with the ENABLED or DISABLED options, as shown in the following
examples.
The following sample uses ALTER TABLE to create and enable a PRIMARY KEY
constraint on a sample table called mytable.
ALTER TABLE mytable ADD CONSTRAINT primarysample PRIMARY KEY(id)
ENABLED;
The following sample specifically disables the constraint.
ALTER TABLE mytable ALTER CONSTRAINT primarysample DISABLED;
The following sample uses CREATE TABLE to create a PRIMARY KEY constraint
without explicitly enabling it. In this case, the constraint is enabled only if
EnableNewPrimaryKeysByDefault is also enabled. If
EnableNewPrimaryKeysByDefault is set to 1 (enabled), then this constraint is enforced.
If EnableNewPrimaryKeysByDefault is at its default setting (disabled), then this
constraint is not enforced.
CREATE TABLE mytable (id INT PRIMARY KEY);
The following sample uses CREATE TABLE to create a PRIMARY KEY constraint and
enable it. This statement enables the constraint regardless of how you set the parameter
EnableNewPrimaryKeysByDefault.
CREATE TABLE mytable (id INT PRIMARY KEY ENABLED);
Checking Whether Constraints Are Enabled
Use the SELECT statement to list constraints and confirm whether they are enabled or
disabled.
This example shows how you can create a query that lists all tables, constraints, primary
key and unique constraint types. The query also indicates whether the constraints are
enabled or disabled.
SELECT table_name, constraint_name, constraint_type, is_enabled
FROM v_catalog.constraint_columns
WHERE constraint_type IN ('p', 'u')
ORDER BY table_name;
The following output shows the results of this query. The constraint_type column
indicates whether the constraint is a primary key constraint or unique constraint (p or u,
respectively). The is_enabled column indicates whether the constraint is enabled or
disabled (t or f respectively).
table_name | constraint_name | constraint_type | is_enabled
------------+-----------------+-----------------+-----------
table01 | pksample | p | t
table02 | uniquesample | u | f
(2 rows)
The following example is similar but shows how you can create a query that lists
associated columns instead of tables. You could add both tables and columns to the
same query, if you want.
SELECT column_name, constraint_name, constraint_type, is_enabled
FROM v_catalog.constraint_columns
WHERE constraint_type IN ('p', 'u')
ORDER BY column_name;
Sample output follows.
column_name | constraint_name | constraint_type | is_enabled
------------+-----------------+-----------------+-----------
col1_key | pksample | p | t
vendor_key | uniquesample | u | f
(2 rows)
The following example statement shows how to create a sample table with a multi-
column constraint.
CREATE TABLE table09 ( column1 INT,
                       column2 INT,
                       CONSTRAINT multicsample PRIMARY KEY (column1, column2) ENABLED);
Here's the output listing associated columns.
column_name | constraint_name | constraint_type | is_enabled
------------+-----------------+-----------------+-----------
column1 | multicsample | p | t
column2 | multicsample | p | t
(2 rows)
Choosing Default Enforcement for Newly Declared or Modified
Constraints
The EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault
parameter settings govern automatic enforcement of PRIMARY and UNIQUE key
constraints.
Important: If you disable enforcement (default), the PRIMARY or UNIQUE key
constraints you create or modify are not enforced unless you specifically enable
them using CREATE TABLE or ALTER TABLE.
You do not need to restart your database once you have set these parameters.
l To enable or disable enforcement of newly created PRIMARY keys, set the
parameter EnableNewPrimaryKeysByDefault. To disable enforcement, keep the
default setting of 0. To enable it, set EnableNewPrimaryKeysByDefault to 1:
ALTER DATABASE VMart SET EnableNewPrimaryKeysByDefault = 1;
l To enable or disable enforcement of newly created UNIQUE keys, set the
parameter EnableNewUniqueKeysByDefault. To disable enforcement, keep the
default setting of 0. To enable it, set EnableNewUniqueKeysByDefault to 1:
ALTER DATABASE VMart SET EnableNewUniqueKeysByDefault = 1;
When you upgrade to Vertica 7.2.x, the primary and unique key constraints in any tables
you carry over are disabled. Existing constraints are not automatically enforced. To
enable existing constraints and make them automatically enforceable, manually enable
each constraint using the ALTER TABLE ALTER CONSTRAINT statement. This
statement triggers constraint enforcement for the existing table contents. Statements roll
back if one or more violations occur.
How Enabled Primary and Unique Key Constraints Affect Locks
If you enable automatic constraint enforcement, Vertica uses an Insert-Validate (IV) lock.
The IV lock is needed for operations where the system performs constraint validation for
enabled PRIMARY or UNIQUE key constraints. Such operations can include INSERT,
COPY, MERGE, UPDATE, and MOVE_PARTITIONS_TO_TABLE.
How DML Operates with Constraints
With enforced PRIMARY or UNIQUE key constraints, DML operates in two stages.
l First Stage. The first stage is the same as it would be for an unenforced constraint,
taking, for example, an I lock. (This could be done by several sessions concurrently
loading the same table.)
l Second Stage. The second stage includes the IV lock. Vertica takes an IV lock to
make sure that the data does not violate your constraint. After performing this check,
Vertica can commit the data.
Delays in Bulk Loading Caused by Constraint Validation
In bulk load situations, some transactions could be temporarily blocked while PRIMARY
or UNIQUE key constraints are validated. For example:
You could have three sessions (for example, sessions 1, 2 and 3). Each session
concurrently has an I lock for a bulk load. Session 1 takes an IV lock to validate
constraints. Only one session can hold an IV lock on a given table; other sessions can
continue loading the table while holding I locks.
Sessions 2 and 3 wait for session 1 to validate constraints, commit, and release the IV
lock. (If session 1 fails, the statement rolls back, and the next session can obtain the IV
lock. While sessions can load the table in parallel, they must take turns obtaining the IV
lock for the final stage of constraint validation.)
For information on lock modes and compatibility and conversion matrices, see Lock
Modes in Vertica Concepts. See also the LOCKS and LOCK_USAGE sections in the
SQL Reference Manual.
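For example, a diagnostic sketch that shows the locks currently held or requested on
your tables, including any IV locks:
=> SELECT object_name, lock_mode, lock_scope FROM LOCKS;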
Projections for Enabled Primary and Unique Key Constraints
To enforce PRIMARY and UNIQUE key constraints, Vertica creates special key
projections as needed in response to DML or DDL, which are checked for constraint
violations. If a constraint violation occurs, Vertica rolls back the statement and any
special key projection it created. The system returns an error specifying the UNIQUE or
PRIMARY KEY constraint that was violated.
If you have added a constraint on a table that is empty, Vertica does not immediately
create a special key projection for that constraint. Vertica defers creation of a special key
projection until the first row of data is added to the table using a DML or COPY
statement. If you add a constraint to a populated table, Vertica chooses an existing
projection for enforcement of the constraint, if possible. If none of the existing projections
are sufficient to validate the constraint, Vertica creates a new projection for the enabled
constraint.
You can check PRIMARY and UNIQUE key constraint projections by querying the
PROJECTIONS system table in the V_CATALOG schema. Entries for key constraint
projections are identified by the IS_KEY_CONSTRAINT_PROJECTION column.
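For example, a minimal query against that table:
=> SELECT projection_name, is_key_constraint_projection FROM v_catalog.projections;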
If you drop an enabled PRIMARY or UNIQUE key constraint, the system may drop an
associated projection if one was automatically created. You can drop a specific
projection even if a key constraint is enabled:
l If you drop a specific projection without including the CASCADE option in your
DROP statement, Vertica issues a warning about dropping a projection for an
enabled constraint.
l If you drop a specific projection and include the CASCADE option in your DROP
statement, Vertica drops the projection without issuing the warning.
In either case, the next time Vertica needs to enforce the constraint for DML, the system
creates a new special key projection, unless an existing projection can enforce the
same enabled constraint. The time it takes to regenerate a key projection depends upon
the volume of the table.
Note: If you subsequently use ANALYZE_CONSTRAINTS on a table that has
enabled PRIMARY or UNIQUE key constraints (and thus their associated
projections), ANALYZE_CONSTRAINTS can leverage the projections previously
created for enforcement, resulting in a performance improvement for ANALYZE_
CONSTRAINTS.
Deciding Whether to Enable Primary and Unique Key Constraints
You have the option to choose automatic enforcement of primary and unique key
constraints. Depending upon your specific scenario, you can either enable this feature,
or check constraints using ANALYZE_CONSTRAINTS. Consider these factors:
l Benefits of Enabling Primary and Unique Key Constraints
l Considerations Before Enabling Primary and Unique Key Constraints
l Where Constraints Are Enforced
l Impact of Floating Point Values in Primary Keys When Using Automatic Enforcement
l What Constraint Enforcement Does Not Control
For more information on using ANALYZE_CONSTRAINTS, see the Administrator's
Guide section, Detecting Constraint Violations with ANALYZE_CONSTRAINTS.
Benefits of Enabling Primary and Unique Key Constraints
When you enable primary and unique key constraints, Vertica validates data before it is
inserted. Because you do not need to check data using ANALYZE_CONSTRAINTS
after insertion, query speed improves.
Enabled key constraints, particularly on primary keys, can help the optimizer produce
faster query plans, especially for joins. When a table has an enabled primary key
constraint, the optimizer can assume that it has no rows with duplicate values across the
key set.
Vertica automatically creates special purpose projections, if necessary, to enforce
enabled key constraints. In some cases Vertica can use an existing projection instead.
Considerations Before Enabling Primary and Unique Key Constraints
Multiple factors affect performance. The enforcement process can slow DML and bulk
loading.
If you are doing bulk loads, consider the size of your tables and the number of columns
in your keys. You could decide to disable automatic enforcement for fact tables, which
tend to be larger, but enable enforcement for dimension tables. For fact tables, you
could choose manual key constraint validation using ANALYZE_CONSTRAINTS, and
avoid the load-time overhead of automatic validation.
When you enable automatic enforcement of primary or unique key constraints,
statement rollbacks occur if validation fails during DML. Vertica completely rolls back
the statement causing the failure. When deciding to enable automatic enforcement of
primary or unique key constraints, consider the impact of statements rolling back on
violations. For example, you issue ten insert statements, none of which have committed.
If the sixth statement introduces a duplicate, that statement is rolled back. The other
statements that do not introduce duplicates can commit.
Note: Vertica performs primary and unique key constraint enforcement at the SQL
statement level rather than the transaction level. You cannot defer primary or unique
key enforcement until transaction commit.
Where Constraints Are Enforced
Automatic enforcement of PRIMARY and UNIQUE key constraints occurs in:
l INSERT statements — Both in single row insertions, and in an INSERT statement
that includes the SELECT parameter.
l Bulk loads — On bulk loads that use the COPY statement.
l UPDATE or MERGE statements — All UPDATE and MERGE statements.
l Meta functions — On COPY_PARTITIONS_TO_TABLE, MOVE_PARTITIONS_TO_
TABLE, and SWAP_PARTITIONS_BETWEEN_TABLES.
l ALTER TABLE statements — On statements that include either the ADD
CONSTRAINT or ALTER CONSTRAINT parameters where you are enabling a
constraint and the table has existing data.
Impact of Floating Point Values in Primary Keys When Using Automatic
Enforcement
Vertica allows NaN, +Inf, and -Inf values in a FLOAT type column, even if the column is
part of a primary key. Because FLOAT types provide imprecise arithmetic, Vertica
recommends that you not use columns with floating point values within primary keys.
If you do decide to use a FLOAT type within a primary key, note the following in regards
to primary key enforcement. (This behavior is the same regardless of whether you
enable an automatic constraint or check constraints manually with ANALYZE_
CONSTRAINTS.)
l For the purpose of enforcing key constraints, Vertica considers two NaN (or two
+Inf, or two -Inf) values to be equal.
l If a table has an enabled single column primary key constraint of type FLOAT, only
one tuple can have a NaN value for the column. Otherwise, the constraint is violated.
This is also true for +Inf and –Inf values. Note that this differs from the IEEE 754
standard, which specifies that multiple NaN values are different from each other.
l A join on a single column that contains FLOAT values fails if the table that includes
the primary key contains multiple tuples with NaN (or +Inf, or -Inf) values.
For information on floating point type, see DOUBLE PRECISION (FLOAT).
What Constraint Enforcement Does Not Control
You can only enable or disable automatic enforcement for primary or unique keys.
Vertica does not support automatic enforcement of foreign keys and referential integrity,
except where the table includes a pre-join projection. You can manually validate foreign
key constraints using the ANALYZE_CONSTRAINTS meta-function.
Vertica does not support automatic enforcement of primary or unique keys on external
tables.
Limitations on Using Automatic Enforcement for Local and Global
Temporary Tables
This section includes limitations and related notes on using automatic enforcement of
primary and unique key constraints with local and global temporary tables. For general
information on temporary tables, refer to About Temporary Tables.
Limitations for Local and Global Temporary Tables
Vertica displays an error message if you add an enabled constraint to a local or global
temporary table that contains data. Vertica displays the error because it cannot create
projections for enabled constraints on a temporary table if that table is already populated
with data.
Limitations Specific to Global Temporary Tables
You cannot use ALTER TABLE to add a new constraint, or enable an existing primary
or unique key constraint, on a global temporary table. Use CREATE TABLE to enable a
constraint on a global temporary table.
You can use ALTER TABLE to add a new constraint, or enable an existing primary or
unique key constraint, on a local temporary table if the table is empty.
Note: You can use ALTER TABLE to disable an already enabled primary or unique
key constraint on a global temporary table.
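For example, a minimal sketch of the CREATE TABLE route for a global temporary
table:
=> CREATE GLOBAL TEMPORARY TABLE gtemp (id INT PRIMARY KEY ENABLED);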
Detecting Constraint Violations with ANALYZE_
CONSTRAINTS
Use the ANALYZE_CONSTRAINTS function to manually validate table constraints.
Ways to Use ANALYZE_CONSTRAINTS
You can use ANALYZE_CONSTRAINTS instead of (or as a supplement to) automatic
PRIMARY and UNIQUE key enforcement. For information on automatic enforcement of
PRIMARY and UNIQUE keys, see Enforcing Primary and Unique Key Constraints
Automatically.
If you do enable PRIMARY or UNIQUE key constraints, note that ANALYZE_CONSTRAINTS does not check whether constraints are disabled or enabled. You can use ANALYZE_CONSTRAINTS where:
• PRIMARY or UNIQUE key constraints are disabled.
• Enabled and disabled constraints are mixed.
You can use ANALYZE_CONSTRAINTS to validate referential integrity of foreign keys.
Vertica does not support automatic enforcement of foreign keys, except in cases where
there is a pre-join projection.
How to Use ANALYZE_CONSTRAINTS to Detect Violations
The ANALYZE_CONSTRAINTS function analyzes and reports on constraint violations
within the current schema search path. To check for constraint violations:
• Pass an empty argument to check for violations on all tables within the current schema.
• Pass a single table argument to check for violations on the specified table.
• Pass two arguments, a table name and a column or list of columns, to check for violations in those columns.
See the examples in ANALYZE_CONSTRAINTS for more information.
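The three invocation forms look like this (the table and column names are hypothetical):
=> SELECT ANALYZE_CONSTRAINTS('');                 -- all tables in the current schema search path
=> SELECT ANALYZE_CONSTRAINTS('public.dim');       -- a single table
=> SELECT ANALYZE_CONSTRAINTS('public.dim', 'pk'); -- specific column(s) in a table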
Impact of Floating Point Values in Primary Keys When Using ANALYZE_CONSTRAINTS
Vertica allows NaN, +Inf, and -Inf values in a FLOAT column, even if the column is part of a primary key. Because FLOAT types provide imprecise arithmetic, Vertica recommends that you not use floating-point columns in primary keys.
If you do use a FLOAT type in a primary key, note the following about primary key enforcement. (This behavior is the same whether you enable an automatic constraint or check constraints manually with ANALYZE_CONSTRAINTS.)
• For the purpose of enforcing key constraints, Vertica considers two NaN (or two +Inf, or two -Inf) values to be equal.
• If a table has an enabled single-column primary key constraint of type FLOAT, only one tuple can have a NaN value in that column; otherwise, the constraint is violated. The same is true for +Inf and -Inf values. Note that this differs from the IEEE 754 standard, which specifies that NaN values compare as unequal to each other.
• A join on a single FLOAT column fails if the table with the primary key contains multiple NaN (or +Inf, or -Inf) values in that column.
For information on floating point type, see DOUBLE PRECISION (FLOAT).
Fixing Constraint Violations
When Vertica finds duplicate primary or unique key values at run time, you can use the DISABLE_DUPLICATE_KEY_ERROR function to suppress error messaging. Queries then execute as though no constraints are defined on the schema. The effects of this function are session scoped.
Caution: When called, DISABLE_DUPLICATE_KEY_ERROR suppresses data
integrity checking and can lead to incorrect query results. Use this function only after
you insert duplicate primary keys into a dimension table in the presence of a pre-join
projection. Correct the violations and reenable integrity checking with REENABLE_
DUPLICATE_KEY_ERROR.
The following series of commands create a table named dim and the corresponding
projection:
CREATE TABLE dim (pk INTEGER PRIMARY KEY, x INTEGER);
CREATE PROJECTION dim_p (pk, x) AS SELECT * FROM dim ORDER BY x UNSEGMENTED ALL NODES;
The next two statements create a table named fact and the pre-join projection that
joins fact to dim.
CREATE TABLE fact(fk INTEGER REFERENCES dim(pk));
CREATE PROJECTION prejoin_p (fk, pk, x) AS SELECT * FROM fact, dim WHERE pk=fk ORDER BY x;
The following statements load values into table dim. The last statement inserts a
duplicate primary key value of 1:
INSERT INTO dim values (1,1);
INSERT INTO dim values (2,2);
INSERT INTO dim values (1,2); --Constraint violation
COMMIT;
Table dim now contains duplicate primary key values, but you cannot delete the
violating row because of the presence of the pre-join projection. Any attempt to delete
the record results in the following error message:
ROLLBACK: Duplicate primary key detected in FK-PK join Hash-Join (x dim_p), value 1
To remove the constraint violation (pk=1) and return the database to the state just before the duplicate primary key was added, use the following sequence of commands:
1. Save the original dim rows that match the duplicated primary key:
CREATE TEMP TABLE dim_temp(pk integer, x integer);
INSERT INTO dim_temp SELECT * FROM dim WHERE pk=1 AND x=1; -- original dim row
2. Temporarily disable error messaging on duplicate constraint values:
SELECT DISABLE_DUPLICATE_KEY_ERROR();
Caution: Remember that running the DISABLE_DUPLICATE_KEY_ERROR
function suppresses the enforcement of data integrity checking.
3. Remove the original row that contains duplicate values:
DELETE FROM dim WHERE pk=1;
4. Allow the database to resume data integrity checking:
SELECT REENABLE_DUPLICATE_KEY_ERROR();
5. Reinsert the original values back into the dimension table:
INSERT INTO dim SELECT * from dim_temp;
COMMIT;
6. Validate your dimension and fact tables.
If you receive the following error message, it means that the duplicate records you
want to delete are not identical. That is, the records contain values that differ in at
least one column that is not a primary key; for example, (1,1) and (1,2).
ROLLBACK: Delete: could not find a data row to delete (data integrity violation?)
The difference between this message and the rollback message in the previous example is that a fact row now contains a foreign key that matches the duplicated primary key. A row with values from both the fact and dimension tables is now in the pre-join projection. For the DELETE statement (Step 4 in the following example) to complete successfully, extra predicates are required to identify the original dimension table values (the values that are in the pre-join projection).
This example is nearly identical to the previous example, except that an additional
INSERT statement joins the fact table to the dimension table by a primary key value
of 1:
INSERT INTO dim values (1,1);
INSERT INTO dim values (2,2);
INSERT INTO fact values (1); -- New insert statement joins fact with dim on primary key value=1
INSERT INTO dim values (1,2); -- Duplicate primary key value=1
COMMIT;
To remove the violation:
1. Save the original dim and fact rows that match the duplicated primary key:
CREATE TEMP TABLE dim_temp(pk integer, x integer);
CREATE TEMP TABLE fact_temp(fk integer);
INSERT INTO dim_temp SELECT * FROM dim WHERE pk=1 AND x=1; -- original dim row
INSERT INTO fact_temp SELECT * FROM fact WHERE fk=1;
2. Temporarily suppress the enforcement of data integrity checking:
SELECT DISABLE_DUPLICATE_KEY_ERROR();
3. Remove the duplicate primary keys in the next two steps. These deletes also implicitly remove all fact rows with the matching foreign key.
4. Remove the original row that contains duplicate values:
DELETE FROM dim WHERE pk=1 AND x=1;
Note: The extra predicate (x=1) specifies removal of the original (1,1) row,
rather than the newly inserted (1,2) values that caused the violation.
5. Remove all remaining rows:
DELETE FROM dim WHERE pk=1;
6. Reenable integrity checking:
SELECT REENABLE_DUPLICATE_KEY_ERROR();
7. Reinsert the original values back into the fact and dimension table:
INSERT INTO dim SELECT * from dim_temp;
INSERT INTO fact SELECT * from fact_temp;
COMMIT;
8. Validate your dimension and fact tables.
Reenabling Error Reporting
If you ran DISABLE_DUPLICATE_KEY_ERROR to suppress error reporting while fixing duplicate key violations, queries can return incorrect results until reporting is restored. As soon as you fix the violations, run the REENABLE_DUPLICATE_KEY_ERROR function to restore the default behavior of error reporting.
The effects of both functions are session scoped.
Using Text Search
Text Search allows you to quickly search the contents of a single CHAR, VARCHAR,
LONG VARCHAR, VARBINARY, or LONG VARBINARY field within a table to locate a
specific keyword. Currently, only American English is supported.
You can use this feature on columns whose contents are queried repeatedly. After you create the text index, DML operations on the source table become slightly slower. This performance change results from keeping the text index in sync with the source table: any time an operation is performed on the source table, the text index updates in the background. Regular queries on the source table are not affected.
The text index contains all of the words from the source table's text field, as well as any other additional columns you included during index creation. Additional columns are not indexed; their values are simply passed through to the text index. The text index is like any other table in the HPE Vertica Analytics Platform, except that it is linked to the source table internally.
You first create a text index on the table you plan to search. Then, after you have indexed your table, you can run a query against the text index for a specific keyword. This query returns a doc_id for each instance of the keyword. After querying the text index, joining it back to the source table should give a significant performance improvement over directly querying the source table for the contents of its text field.
Important: Do not alter the contents or definitions of the text index. If the contents or
definitions of the text index are altered, then the results do not appropriately match
the source table.
Creating a Text Index
In the following example, you perform a text search using a source table called t_log.
This source table has two columns:
• One column containing the table's primary key
• Another column containing log file information
You must associate a projection with the source table. Use a projection that is sorted by
the primary key and either segmented by hash(id) or unsegmented. You can define this
projection on the source table, along with any other existing projections.
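For example, a qualifying projection for t_log might look like the following sketch, assuming the id, date, and text columns used in this example:
=> CREATE PROJECTION t_log_p (id, date, text) AS
   SELECT id, date, text FROM t_log ORDER BY id SEGMENTED BY HASH(id) ALL NODES;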
Create a text index on the table for which you want to perform a text search.
=> CREATE TEXT INDEX text_index ON t_log (id, text);
The text index contains two columns:
• doc_id uses the unique identifier from the source table.
• token is populated with text strings from the designated column in the source table. The token column results from tokenizing and stemming the words found in the text column.
If your table is partitioned, then your text index also contains a third column named partition.
=> SELECT * FROM text_index;
token | doc_id | partition
------------------------+--------+-----------
<info> | 6 | 2014
<warning> | 2 | 2014
<warning> | 3 | 2014
<warning> | 4 | 2014
<warning> | 5 | 2014
database | 6 | 2014
execute: | 6 | 2014
object | 4 | 2014
object | 5 | 2014
[catalog] | 4 | 2014
[catalog] | 5 | 2014
You create a text index on a source table only once. In the future, you do not have to re-
create the text index each time the source table is updated or changed.
Your text index stays synchronized to the contents of the source table through any
operation that is run on the source table. These operations include, but are not limited
to:
• COPY
• INSERT
• UPDATE
• DELETE
• DROP PARTITION
• MOVE_PARTITIONS_TO_TABLE
When you move or swap partitions in a source table that is indexed, verify that the
destination table already exists and is indexed in the same way.
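For example, before moving the 2014 partition out of t_log, you might prepare an indexed destination table as in this sketch (the names t_log_archive and archive_index are hypothetical):
=> CREATE TABLE t_log_archive LIKE t_log INCLUDING PROJECTIONS;  -- destination with matching structure
=> CREATE TEXT INDEX archive_index ON t_log_archive (id, text);  -- indexed the same way as the source
=> SELECT MOVE_PARTITIONS_TO_TABLE('t_log', '2014', '2014', 't_log_archive');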
Creating a Text Index on a Flex Table
In the following example, you create a text index on a flex table. The example assumes
that you have created a flex table called mountains. See Getting Started in Using Flex
Tables to create the flex table used in this example.
Before you can create a text index on your flex table, add a primary key constraint to the
flex table.
=> ALTER TABLE mountains ADD PRIMARY KEY (__identity__);
Create a text index on the table for which you want to perform a text search. Tokenize the __raw__ column with the FlexTokenizer and specify the data type as LONG VARBINARY. It is important to use the FlexTokenizer when creating text indices on flex tables, because the data type of the __raw__ column differs from what the default StringTokenizer expects.
=> CREATE TEXT INDEX flex_text_index ON mountains(__identity__, __raw__) TOKENIZER
public.FlexTokenizer(long varbinary);
The text index contains two columns:
• doc_id uses the unique identifier from the source table.
• token is populated with text strings from the designated column in the source table. The token column results from tokenizing and stemming the words found in the text column.
If your table is partitioned, then your text index also contains a third column named partition.
=> SELECT * FROM flex_text_index;
 token       | doc_id
-------------+--------
 50.6        |      5
 Mt          |      5
 Washington  |      5
 mountain    |      5
 12.2        |      3
 15.4        |      2
 17000       |      3
 29029       |      2
 Denali      |      3
 Helen       |      2
 Mt          |      2
 St          |      2
 mountain    |      3
 volcano     |      2
 29029       |      1
 34.1        |      1
 Everest     |      1
 mountain    |      1
 14000       |      4
 Kilimanjaro |      4
 mountain    |      4
(21 rows)
You create a text index on a source table only once. In the future, you do not have to re-
create the text index each time the source table is updated or changed.
Your text index stays synchronized to the contents of the source table through any
operation that is run on the source table. These operations include, but are not limited
to:
• COPY
• INSERT
• UPDATE
• DELETE
• DROP PARTITION
• MOVE_PARTITIONS_TO_TABLE
When you move or swap partitions in a source table that is indexed, verify that the
destination table already exists and is indexed in the same way.
Searching a Text Index
After you create a text index, write a query to run against the index to search for a
specific keyword.
In the following example, you use a WHERE clause to search for the keyword <WARNING> in the text index. The WHERE clause should apply the same stemmer you used to create the text index, so that the search keyword is stemmed to match the tokens stored in the index. If you did not specify the STEMMER keyword when creating the index, the default stemmer is v_txtindex.StemmerCaseInsensitive. If you created the index with STEMMER NONE, omit the stemmer function from the WHERE clause.
=> SELECT * FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('<WARNING>');
token | doc_id
-----------+--------
<warning> | 2
<warning> | 3
<warning> | 4
<warning> | 5
(4 rows)
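Because the index stores stemmed tokens, you can preview how a keyword will be matched by calling the stemmer directly. This is a minimal sketch, assuming the stemmer can be invoked as an ordinary scalar function:
=> SELECT v_txtindex.StemmerCaseInsensitive('<WARNING>'); -- returns the lowercase, stemmed form stored in the index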
Next, write a query to display the full contents of the source table that match the keyword
you searched for in the text index.
=> SELECT * FROM t_log WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('<WARNING>'));
 id | date       | text
----+------------+------------------------------------------------------------------------------------------------
  4 | 2014-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45035968
  5 | 2014-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45030
  2 | 2013-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 4503
  3 | 2013-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45066
(4 rows)
Use the doc_id to find the exact location of the keyword in the source table. The doc_id matches the unique identifier from the source table, which allows you to quickly find the instance of the keyword in your table.
Performing a Case-Sensitive and Case-Insensitive Text Search Query
Your text index is optimized to match all instances of words, depending upon your stemmer. By default, the case-insensitive stemmer is applied to all text indices that do not specify a stemmer. Therefore, if the queries you plan to write against your text index are case sensitive, Hewlett Packard Enterprise recommends that you use a case-sensitive stemmer to build your text index.
The following examples show queries that match case-sensitive and case-insensitive words when performing a text search.
This query finds case-insensitive records in a case-insensitive text index:
=> SELECT * FROM t_log WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('warning'));
This query finds case-sensitive records in a case-sensitive text index:
=> SELECT * FROM t_log_case_sensitive WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('Warning'));
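The case-sensitive query above assumes that the index over t_log_case_sensitive was built with a case-sensitive stemmer, for example with a sketch like this (the index name text_index_cs is hypothetical):
=> CREATE TEXT INDEX text_index_cs ON t_log_case_sensitive (id, text)
   STEMMER v_txtindex.StemmerCaseSensitive(long varchar)
   TOKENIZER v_txtindex.StringTokenizer(long varchar);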
Including and Excluding Keywords in a Text Search Query
Your text index also allows you to perform more detailed queries to find multiple
keywords or omit results with other keywords. The following example shows a more
detailed query that you can use when performing a text search.
In this example, t_log is the source table, and text_index is the text index. The query finds records that:
• Contain the word '<WARNING>'
• Contain either the word 'validate' or the word '[Log]'
• Do not contain the word 'validateDependencies'
SELECT * FROM t_log WHERE (
    id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('<WARNING>'))
    AND (id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('validate'))
         OR id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('[Log]')))
    AND NOT (id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('validateDependencies'))));
This query returns the following results:
 id | date       | text
----+------------+--------------------------------------------------------------------------------------------------
 11 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607702 [Log] <WARNING> validate: Object 4503 via fld num_all_roles
 13 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607706 [Log] <WARNING> validate: Object 45035 refers to root_i3
 14 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607708 [Log] <WARNING> validate: Object 4503 refers to int_2
 17 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607700 [Txn] <WARNING> Begin validate Txn: fff0ed17 catalog editor
(4 rows)
Dropping a Text Index
Dropping a text index removes the specified text index from the database.
You can drop a text index when:
• It is no longer queried frequently.
• An administrative task must be performed on the source table that requires the text index to be dropped.
Dropping the text index does not drop the source table associated with the text index.
However, if you drop the source table associated with a text index, then that text index is
also dropped. Vertica considers the text index a dependent object.
The following example illustrates how to drop a text index named text_index:
=> DROP TEXT INDEX text_index;
DROP INDEX
Stemmers and Tokenizers
Vertica provides default stemmers and tokenizers. You can also create your own custom
stemmers and tokenizers. The following topics explain the default stemmers and
tokenizers, and the requirements for creating custom stemmers and tokenizers in
Vertica.
• Vertica Stemmers
• Vertica Tokenizers
• Configuring a Tokenizer
• Requirements for Custom Stemmers and Tokenizers
Vertica Stemmers
Vertica stemmers use the Porter stemming algorithm to find words derived from the
same base/root word. For example, if you perform a search on a text index for the
keyword database, you might also want to get results containing the word databases.
To achieve this type of matching, Vertica stores words in their stemmed form when
using any of the v_txtindex stemmers.
The HPE Vertica Analytics Platform provides the following stemmers:

v_txtindex.Stemmer(long varchar)
    Not sensitive to case; outputs lowercase words. Stems strings from a Vertica table. Alias of StemmerCaseInsensitive.

v_txtindex.StemmerCaseSensitive(long varchar)
    Sensitive to case. Stems strings from a Vertica table.

v_txtindex.StemmerCaseInsensitive(long varchar)
    Default stemmer used if no stemmer is specified when creating a text index. Not sensitive to case; outputs lowercase words. Stems strings from a Vertica table.

v_txtindex.caseInsensitiveNoStemming(long varchar)
    Not sensitive to case; outputs lowercase words. Does not use the Porter stemming algorithm.
Examples
The following examples show how to use a stemmer when creating a text index.
Create a text index using the StemmerCaseInsensitive stemmer:
=> CREATE TEXT INDEX idx_100 ON top_100 (id, feedback)
   STEMMER v_txtindex.StemmerCaseInsensitive(long varchar)
   TOKENIZER v_txtindex.StringTokenizer(long varchar);
Create a text index using the StemmerCaseSensitive stemmer:
=> CREATE TEXT INDEX idx_unstruc ON unstruc_data (__identity__, __raw__)
   STEMMER v_txtindex.StemmerCaseSensitive(long varchar)
   TOKENIZER public.FlexTokenizer(long varbinary);
Create a text index without using a stemmer:
=> CREATE TEXT INDEX idx_logs ON sys_logs (id, message) STEMMER NONE
   TOKENIZER v_txtindex.StringTokenizer(long varchar);
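Because STEMMER NONE stores tokens verbatim, a search against such an index compares the literal token, with no stemmer call in the WHERE clause. A minimal sketch against the idx_logs index above:
=> SELECT * FROM sys_logs WHERE id IN (SELECT doc_id FROM idx_logs WHERE token = 'warning'); -- matches only the exact token as stored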
Vertica Tokenizers
The HPE Vertica Analytics Platform provides the following pre-configured tokenizers:
public.FlexTokenizer(long varbinary)
    Splits the values in your Flex Table by white space.

v_txtindex.StringTokenizer(long varchar)
    Splits the string into words by splitting on white space.

v_txtindex.AdvancedLogTokenizer
    Uses the default parameters for all tokenizer parameters. For more information, see Advanced Log Tokenizer.

v_txtindex.BasicLogTokenizer
    Uses the default values for all tokenizer parameters except minorseparator, which is set to an empty list. For more information, see Basic Log Tokenizer.

v_txtindex.WhitespaceLogTokenizer
    Uses default values for tokenizer parameters, except for majorseparators, which uses E' \t\n\f\r', and minorseparator, which uses an empty list. For more information, see Whitespace Log Tokenizer.
Examples
The following examples show how you can use a default tokenizer when creating a text
index.
Use the StringTokenizer to create an index from the top_100:
=> CREATE TEXT INDEX idx_100 ON top_100 (id, feedback)
   TOKENIZER v_txtindex.StringTokenizer(long varchar);
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation
Hp vertica 7.2.x_complete_documentation

More Related Content

PDF
ArcSight Connector Appliance 6.4 Patch 1 Release Notes
PDF
ArcSight Connector Appliance 6.4 Release Notes
PDF
Tn548 installing microsoft sql server 2012 for wonderware products
PDF
vRealize Operations (vROps) Management Pack for Citrix XenDesktop Installatio...
DOCX
Running Sql2008 In Hyper V2008
DOCX
Sql server 2008 best practices
PDF
Hash join in MySQL 8
PDF
Linked-In_LSSe64 Master Guide for implementation and Use with Appendix
ArcSight Connector Appliance 6.4 Patch 1 Release Notes
ArcSight Connector Appliance 6.4 Release Notes
Tn548 installing microsoft sql server 2012 for wonderware products
vRealize Operations (vROps) Management Pack for Citrix XenDesktop Installatio...
Running Sql2008 In Hyper V2008
Sql server 2008 best practices
Hash join in MySQL 8
Linked-In_LSSe64 Master Guide for implementation and Use with Appendix

What's hot (17)

DOCX
Karthik K-Senior System Admin
PDF
Esm rel notes_6.8cp4
PDF
ArcSight Management Center Migration Guide
PDF
ArcMC 2.6 Release Notes
PDF
ArcSight Management Center 2.5 Release Notes
PDF
Upgrade Guide for ESM 6.8c
PDF
White Paper Dell Reference Configuration Deploying Microsoft ...
PDF
MySQL and PHP
PDF
Dell Reference Architecture Guide Deploying Microsoft® SQL ...
PDF
Dell PowerEdge VRTX and M-series compute nodes configuration study
PDF
Aid rel notes_5.6 (1)
PDF
Guia instalação PCM600 ABB
PDF
Andrewfraserdba.com training sql_training
PDF
Sql training
PDF
1 mrg004195 configuration_migration_for_650_series_in_pcm600_user_manual
PDF
1 mrg002586 configuration_migration_for_670_series_in_pcm600_user_manual
PDF
PCM600
Karthik K-Senior System Admin
Esm rel notes_6.8cp4
ArcSight Management Center Migration Guide
ArcMC 2.6 Release Notes
ArcSight Management Center 2.5 Release Notes
Upgrade Guide for ESM 6.8c
White Paper Dell Reference Configuration Deploying Microsoft ...
MySQL and PHP
Dell Reference Architecture Guide Deploying Microsoft® SQL ...
Dell PowerEdge VRTX and M-series compute nodes configuration study
Aid rel notes_5.6 (1)
Guia instalação PCM600 ABB
Andrewfraserdba.com training sql_training
Sql training
1 mrg004195 configuration_migration_for_650_series_in_pcm600_user_manual
1 mrg002586 configuration_migration_for_670_series_in_pcm600_user_manual
PCM600
Ad

Viewers also liked (17)

ODP
Vertica
PDF
Teilnahme Preisverleihung.pdf
PDF
DMRC Certificate_Project
PPT
El Racisme
PPT
Taxi vinasun 2015 unique
DOCX
MARIA-FERNANADATALLER1
PPT
PresentacióN1
PPT
Taller 1- Diseño de estrategias con TIC
PDF
Plutusfx Presentation
PPTX
Apresentação1
PPT
Luis enrique ita sevillano_presentacion2
PDF
DMRC Certificate
PDF
Catedra CUMEX Juan Luis Cifuentes Jardin Botanico Universitario
PDF
Sr#1 Jp Cortes
PPT
Backup And Recovery
PPT
PPTX
Hp vertica certification guide
Vertica
Teilnahme Preisverleihung.pdf
DMRC Certificate_Project
El Racisme
Taxi vinasun 2015 unique
MARIA-FERNANADATALLER1
PresentacióN1
Taller 1- Diseño de estrategias con TIC
Plutusfx Presentation
Apresentação1
Luis enrique ita sevillano_presentacion2
DMRC Certificate
Catedra CUMEX Juan Luis Cifuentes Jardin Botanico Universitario
Sr#1 Jp Cortes
Backup And Recovery
Hp vertica certification guide
Ad

Similar to Hp vertica 7.2.x_complete_documentation (20)

PPTX
Carpe Datum: Building Big Data Analytical Applications with HP Haven
PDF
4AA6-4492ENW
PPTX
A modern, flexible approach to Hadoop implementation incorporating innovation...
PDF
Vertica on Amazon Web Services
PDF
Level Up – How to Achieve Hadoop Acceleration
PDF
Big Data & SQL: The On-Ramp to Hadoop
PDF
PDF
HPE Vertica_7.0.x Administrators Guide
PDF
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
DOCX
Guide using the hpe dl380 gen9 24-sff server as a vertica node
PPTX
Big Data .. Are you ready for the next wave?
PDF
Vertica 7.0 Architecture Overview
PDF
Up Your Analytics Game with Pentaho and Vertica
PDF
Come fare business con i big data in concreto
PDF
3rd day big data
PPT
Hw09 Hadoop + Vertica
PPT
Hadoop World Vertica
PDF
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
PDF
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
PDF
Introduction 6.1 01_architecture_overview
Carpe Datum: Building Big Data Analytical Applications with HP Haven
4AA6-4492ENW
A modern, flexible approach to Hadoop implementation incorporating innovation...
Vertica on Amazon Web Services
Level Up – How to Achieve Hadoop Acceleration
Big Data & SQL: The On-Ramp to Hadoop
HPE Vertica_7.0.x Administrators Guide
Moustafa Soliman "HP Vertica- Solving Facebook Big Data challenges"
Guide using the hpe dl380 gen9 24-sff server as a vertica node
Big Data .. Are you ready for the next wave?
Vertica 7.0 Architecture Overview
Up Your Analytics Game with Pentaho and Vertica
Come fare business con i big data in concreto
3rd day big data
Hw09 Hadoop + Vertica
Hadoop World Vertica
HP Vertica and MapR Webinar: Building a Business Case for SQL-on-Hadoop
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
Introduction 6.1 01_architecture_overview

More from Eric Javier Espino Man (14)

PDF
Cignex liferay-roadshow-singapore-27feb14-140304061735-phpapp02
PDF
Data warehouse omniture, rentabiliza tus datos analítica web
PDF
2014 02 florian-matthes-agile-enterprise-architecture-management
PDF
Ammar kamil thesis_final_workflow_modeling_tools_comparison
PDF
About freepower 012010
PDF
Elastix 2.4 vm ware appliance deployment guide
PDF
BER- AND COM-WAY OF CHANNEL- COMPLIANCE EVALUATION
PDF
ETSI TS 101 903
PDF
Separating pacs-servers-from-vna
PDF
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
PDF
Isa programme 2014 - guidelines for public administrations on e-document en...
PDF
Boe a-2011-13170
PDF
Dicom standards p02-conformance
PDF
Running virtual box from the linux command line
Cignex liferay-roadshow-singapore-27feb14-140304061735-phpapp02
Data warehouse omniture, rentabiliza tus datos analítica web
2014 02 florian-matthes-agile-enterprise-architecture-management
Ammar kamil thesis_final_workflow_modeling_tools_comparison
About freepower 012010
Elastix 2.4 vm ware appliance deployment guide
BER- AND COM-WAY OF CHANNEL- COMPLIANCE EVALUATION
ETSI TS 101 903
Separating pacs-servers-from-vna
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
Isa programme 2014 - guidelines for public administrations on e-document en...
Boe a-2011-13170
Dicom standards p02-conformance
Running virtual box from the linux command line

Recently uploaded (20)

PPTX
Qualitative Qantitative and Mixed Methods.pptx
PDF
annual-report-2024-2025 original latest.
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PPTX
CEE 2 REPORT G7.pptxbdbshjdgsgjgsjfiuhsd
PPTX
Market Analysis -202507- Wind-Solar+Hybrid+Street+Lights+for+the+North+Amer...
PDF
Transcultural that can help you someday.
PDF
Lecture1 pattern recognition............
PPTX
Acceptance and paychological effects of mandatory extra coach I classes.pptx
PPTX
Leprosy and NLEP programme community medicine
PDF
Capcut Pro Crack For PC Latest Version {Fully Unlocked 2025}
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
PDF
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
PPTX
STERILIZATION AND DISINFECTION-1.ppthhhbx
PPT
ISS -ESG Data flows What is ESG and HowHow
PDF
Mega Projects Data Mega Projects Data
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
PPTX
A Complete Guide to Streamlining Business Processes
PPTX
QUANTUM_COMPUTING_AND_ITS_POTENTIAL_APPLICATIONS[2].pptx
PDF
Oracle OFSAA_ The Complete Guide to Transforming Financial Risk Management an...
Qualitative Qantitative and Mixed Methods.pptx
annual-report-2024-2025 original latest.
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
CEE 2 REPORT G7.pptxbdbshjdgsgjgsjfiuhsd
Market Analysis -202507- Wind-Solar+Hybrid+Street+Lights+for+the+North+Amer...
Transcultural that can help you someday.
Lecture1 pattern recognition............
Acceptance and paychological effects of mandatory extra coach I classes.pptx
Leprosy and NLEP programme community medicine
Capcut Pro Crack For PC Latest Version {Fully Unlocked 2025}
Galatica Smart Energy Infrastructure Startup Pitch Deck
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
STERILIZATION AND DISINFECTION-1.ppthhhbx
ISS -ESG Data flows What is ESG and HowHow
Mega Projects Data Mega Projects Data
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Data_Analytics_and_PowerBI_Presentation.pptx
A Complete Guide to Streamlining Business Processes
QUANTUM_COMPUTING_AND_ITS_POTENTIAL_APPLICATIONS[2].pptx
Oracle OFSAA_ The Complete Guide to Transforming Financial Risk Management an...

Hp vertica 7.2.x_complete_documentation

CentOS

l All versions starting at 6.0 up to and including 6.7
l Version 7.0
Important: You cannot perform an in-place upgrade of your current Vertica Analytics
Platform from CentOS 6.0 through 6.7 to CentOS 7.0. For information on how to
upgrade to CentOS 7.0, see the Migration Guide for Red Hat 7/CentOS 7.

SUSE Linux Enterprise Server

All versions starting at 11.0 up to and including 11.0 SP3.

Oracle Enterprise Linux 6

Red Hat Compatible Kernel only. Vertica is not supported on the Unbreakable
Enterprise Kernel (kernels with a uek suffix).

Debian Linux

All versions starting at 7.0 up to and including 7.7.

Ubuntu

l Version 12.04 LTS
l Version 14.04 LTS

When multiple minor versions are supported for a major operating system release,
Hewlett Packard Enterprise recommends that you run Vertica on the latest supported
minor version. For example, if you run Vertica on the Debian Linux 7.0 major
release, use version 7.7. For the Red Hat Enterprise Linux and CentOS 6.0 major
release, use version 6.5, 6.6, or 6.7.

Supported File Systems

Vertica Analytics Platform Enterprise Edition has been tested on all supported
Linux platforms running the ext3 or ext4 file system. For the Vertica Analytics
Platform I/O profile, ext4 is considerably faster than ext3. While other file
systems have been successfully deployed by some customers, Vertica Analytics
Platform cannot guarantee performance or stability of the product on those file
systems, and in certain support situations you may be asked to migrate off an
unsupported file system to help troubleshoot or fix an issue. In particular,
several file corruption issues have been linked to the use of XFS with Vertica;
Hewlett Packard Enterprise strongly recommends against using XFS in production.
A quick way to confirm which file system backs a planned data directory is shown
in the sketch below.

Important: Vertica Analytics Platform 7.2.x does not support Linux Logical Volume
Manager (LVM).
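Because file system choice matters for support, it can be worth verifying where a
planned data directory actually lives before installing. The following minimal
sketch (Linux only) reads /proc/mounts to report the backing file system; the
data-directory path is a placeholder, and the script is illustrative rather than
part of any Vertica tooling.

```python
# Illustrative pre-install check: report the file system backing a directory.
# Linux only; reads /proc/mounts. The data directory path is hypothetical.
import os

TESTED_FILE_SYSTEMS = {"ext3", "ext4"}

def fs_type_of(path):
    """Return the file system type of the mount point that holds 'path'."""
    path = os.path.realpath(path)
    best_len, best_type = -1, "unknown"
    with open("/proc/mounts") as mounts:
        for line in mounts:
            device, mount_point, fs_type = line.split()[:3]
            # Match the longest mount point that is a prefix of the path,
            # so /data wins over / for a path under /data.
            if path == mount_point or path.startswith(mount_point.rstrip("/") + "/"):
                if len(mount_point) > best_len:
                    best_len, best_type = len(mount_point), fs_type
    return best_type

if __name__ == "__main__":
    data_dir = "/home/dbadmin/data"  # placeholder; use your planned data path
    fs = fs_type_of(data_dir)
    if fs in TESTED_FILE_SYSTEMS:
        print("%s is on %s: a tested file system." % (data_dir, fs))
    else:
        print("WARNING: %s is on '%s'; Vertica 7.2.x is tested on ext3/ext4 only."
              % (data_dir, fs))
```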
Supported Browsers for Vertica Management Console

Vertica Analytics Platform 7.2.x Management Console is supported on the following
web browsers:

l Internet Explorer 10 and later
l Firefox 31 and later
l Google Chrome 38 and later

Vertica Server and Management Console Compatibility

Each version of Vertica Analytics Platform Management Console is compatible only
with the matching version of the Vertica Analytics Platform server. For example,
the Vertica Analytics Platform 7.2 server is supported with Vertica Analytics
Platform 7.2 Management Console only.

Vertica 7.2.x Client Drivers

Vertica provides JDBC, ODBC, OLE DB, and ADO.NET client drivers. You can choose to
download:

l Linux and UNIX-like platforms: the ODBC client driver, client RPM, and vsql
client. See Installing the Client Drivers on Linux and UNIX-Like Platforms.
l Windows platforms: the ODBC, ADO.NET, and OLE DB client drivers, the vsql
client, the Microsoft Connectivity Pack, and the Visual Studio plug-in. See
Installing the Client Drivers and Tools on Windows.
l Mac OS X platforms: the ODBC client driver and vsql client. See Installing the
Client Drivers on Mac OS X.
l All platforms: the cross-platform JDBC client driver .jar file.

See Vertica Analytics Platform Driver/Server Compatibility to determine which
driver versions are supported with which server versions.

ADO.NET Driver

The ADO.NET driver is supported on the following platforms:
Platform                    Processor       Supported Versions
Microsoft Windows           x86 (32-bit)    Windows 7, Windows 8, Windows 10
Microsoft Windows           x64 (64-bit)    Windows 7, Windows 8, Windows 10
Microsoft Windows Server    x86 (32-bit)    2008, 2008 R2
Microsoft Windows Server    x64 (64-bit)    2008, 2008 R2, 2012

All of these platforms require Microsoft .NET Framework 3.5 SP1 or later.

JDBC Driver

All JDBC drivers are supported on any Java 5-compliant platform; Java 5 is the
minimum supported version.
ODBC Driver

Vertica Analytics Platform provides both 32-bit and 64-bit ODBC drivers. Vertica
7.2.x ODBC drivers are supported on the following platforms:

Platform                                     Processor     Supported Versions
Microsoft Windows                            x86 (32-bit)  Windows 7, Windows 8, Windows 10
Microsoft Windows                            x64 (64-bit)  Windows 7, Windows 8, Windows 10
Microsoft Windows Server                     x86 (32-bit)  2008, 2008 R2
Microsoft Windows Server                     x64 (64-bit)  2008, 2008 R2, 2012
Red Hat Enterprise Linux                     x86_64        6 and 7
SUSE Linux Enterprise                        x86_64        11
Oracle Enterprise Linux                      x86_64        6
(Red Hat Compatible Kernel only)
CentOS                                       x86_64        6 and 7
Ubuntu                                       x86_64        12.04 LTS and 14.04 LTS
AIX                                          PowerPC       5.3 and 6.1
HP-UX                                        IA-64         11i V3
Solaris                                      SPARC         10
Mac OS X                                     x86_64        10.7, 10.8, and 10.9
On the Microsoft Windows platforms, the driver manager is Microsoft ODBC MDAC 2.8.
On the Linux and UNIX platforms listed, the supported driver managers are iODBC
3.52.6 or later, unixODBC 2.3.0 or later, and DataDirect 5.3 and 6.1 or later.
Vertica Analytics Platform Driver/Server Compatibility

The following table indicates the Vertica Analytics Platform driver versions that
are supported by different Vertica Analytics Platform server versions.

Note: SHA password security is supported on client driver and server versions
7.1.x and later.

Client Driver Version    Compatible Server Versions
6.1.x                    6.1.x, 7.0.x, 7.1.x, 7.2.x
7.0.x                    7.0.x, 7.1.x, 7.2.x
7.1.x                    7.1.x, 7.2.x
7.2.x                    7.2.x
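For scripted environments, the matrix above is easy to encode as data. The sketch
below is illustrative only: the lookup table mirrors the compatibility table, and
the function is not part of any Vertica API.

```python
# The driver/server compatibility matrix above, expressed as a lookup table
# that an upgrade or deployment script could consult. Illustrative only.
COMPATIBLE_SERVERS = {
    "6.1.x": {"6.1.x", "7.0.x", "7.1.x", "7.2.x"},
    "7.0.x": {"7.0.x", "7.1.x", "7.2.x"},
    "7.1.x": {"7.1.x", "7.2.x"},
    "7.2.x": {"7.2.x"},
}

def driver_supports_server(driver_version, server_version):
    """True if the given client driver version is supported with the server."""
    return server_version in COMPATIBLE_SERVERS.get(driver_version, set())

# Older drivers work against newer servers, but not the reverse:
assert driver_supports_server("7.0.x", "7.2.x")
assert not driver_supports_server("7.2.x", "7.1.x")
```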
vsql Client

The Vertica vsql client is included in all client packages; it is not available
for download separately. The vsql client is supported on the following platforms:

Operating System                                   Processor
Microsoft Windows 2008 and 2008 R2 (all            x86, x64
variants), Windows 2012 (all variants),
Windows 7 (all variants), Windows 8.* (all
variants), and Windows 10
Red Hat Enterprise Linux 6                         x86, x64
Red Hat Enterprise Linux 7                         x86, x64
SUSE Linux Enterprise 11                           x86, x64
Oracle Enterprise Linux 6                          x86, x64
(Red Hat Compatible Kernel only)
CentOS 6                                           x86, x64
CentOS 7                                           x86, x64
Ubuntu 12.04 LTS                                   x86, x64
Solaris 10                                         x86, x64, SPARC
AIX 5.3 and 6.1                                    PowerPC
HP-UX 11i V3                                       IA32, IA64
Mac OS X 10.7, 10.8, and 10.9                      x86, x64

Perl and Python Requirements

You can use Vertica's ODBC driver to connect applications written in Perl or
Python to the Vertica Analytics Platform.

Perl

To use Perl with Vertica, you must install the Perl driver modules (DBI and
DBD::ODBC) and a Vertica ODBC driver on the machine where Perl is installed. The
following table lists the Perl versions supported with Vertica 7.2.x.

Perl Version    Perl Driver Modules               ODBC Requirements
5.8, 5.10       DBI driver version 1.609,         See Vertica 7.2.x Client Drivers.
                DBD::ODBC version 1.22

Python

To use Python with Vertica, you must install the Vertica Python Client or the
pyodbc module and a Vertica ODBC driver on the machine where Python is installed.
The following table lists the Python versions supported with Vertica 7.2.x; a
minimal connection sketch follows the table.

Python Version    Python Driver Module                   ODBC Requirements
2.4.6             pyodbc 2.1.6                           See Vertica 7.2.x Client Drivers.
2.7.x             Vertica Python Client (Linux only)
2.7.3             pyodbc 3.0.6
3.3.4             pyodbc 3.0.7
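As a concrete illustration of the Python path described above, the following
minimal sketch connects through pyodbc. The DSN name, user, and password are
placeholders; they assume a matching ODBC data source has already been defined
(for example, in odbc.ini on Linux).

```python
# Minimal pyodbc connection sketch; DSN, user, and password are placeholders.
import pyodbc

connection = pyodbc.connect("DSN=VerticaDSN;UID=dbadmin;PWD=secret")
cursor = connection.cursor()
cursor.execute("SELECT version()")  # returns the Vertica server version string
print(cursor.fetchone()[0])
connection.close()
```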
Vertica SDKs

This section details software requirements for running User-Defined Extensions
(UDxs) developed using the Vertica SDKs.

C++ SDK

The Vertica cluster does not have any special requirements for running UDxs
written in C++.

Java SDK

Your Vertica cluster must have a Java runtime installed to run UDxs developed
using the Vertica Java SDK. HPE has tested the following Java Runtime Environments
(JREs) with this version of the Vertica Java SDK:

l Oracle Java Platform Standard Edition 6 (version number 1.6)
l Oracle Java Platform Standard Edition 7 (version number 1.7)
l OpenJDK 6 (version number 1.6)
l OpenJDK 7 (version number 1.7)

R Language Pack

The Vertica R Language Pack provides version 3.0.0 of the R runtime and associated
libraries for interfacing with Vertica. You install the R Language Pack on the
Vertica server.

Vertica Integrations for Hadoop

Vertica 7.2.x is supported with these Hadoop distributions:
Distribution                          Versions
Cloudera (CDH)                        5.2, 5.3, 5.4
Hortonworks Data Platform (HDP)       2.2 and 2.3
MapR                                  3.1.1 and 4.0

Vertica Connector for Hadoop MapReduce

HPE provides a Hadoop-specific module for Vertica client machines: a connector
for Apache Hadoop 2.0.0. The table below details the supported software versions.

Description                              Supported Versions
Vertica Connector for Apache Hadoop      Hadoop 2.0.0
MapReduce
Apache Hadoop and Pig combinations       Apache Hadoop 2.0.0 and Pig 0.10.0
Cloudera distribution versions           Cloudera Distribution Including Apache
                                         Hadoop (CDH) 4

Packs, Plug-Ins, and Connectors for Vertica Client Machines

HPE provides the following optional module for Vertica client machines.

Informatica PowerCenter Plug-In

The Vertica plug-in for Informatica PowerCenter is supported on the following
platforms:
Plug-In Version: 7.x
Informatica PowerCenter Versions: 9.x
Supported Vertica Versions: 6.x (limited functionality) and 7.x (all enhancements)

Supported operating systems:

l Microsoft Windows 2003 and 2003 R2, all variants
l Microsoft Windows 2008 and 2008 R2, all variants
l Microsoft Windows 7, all variants
l Red Hat Enterprise Linux 5 (32- and 64-bit)
l Solaris
l AIX
l HP-UX

Vertica on Amazon Web Services

HPE provides a preconfigured AMI for users who want to run Vertica Analytics
Platform on Amazon Web Services (AWS). This HPE-supplied AMI allows users to
configure their own storage and has been configured for and tested on AWS. This
AMI is the officially supported version of Vertica Analytics Platform for AWS.
Note that HPE develops AMIs for Vertica on a slightly different schedule than the
product release schedule, so an AMI for a given Vertica release becomes available
some time after the initial release of the Vertica software.

Vertica in a Virtualized Environment

Vertica runs in the following virtualization environment.

Important: Vertica does not support suspending a virtual machine while Vertica is
running on it.

Host

l VMware version 5.5
l The number of virtual machines per host must not exceed the number of physical
processors
l CPU frequency scaling turned off at the host level and for each virtual machine
l VMware parameters for hugepages set at the version 5.5 defaults
  • 17. l CPU frequency scaling turned off at the host level and for each virtual machine l VMware parameters for hugepages set at version. 5.5 defaults Input/Output l Measured by vioperf concurrently on all Vertica nodes When running vioperf, provide the –duration=2min option and start on all nodes concurrently l 25 megabytes per second per core of write l 20+20 megabytes per second per core of rewrite l 40 megabytes per second per core of read l 150 seeks per second of latency (SkipRead) l Thick provisioned disk, or pass-through-storage Network l Dedicated 10G NIC for each Virtual Machine l No oversubscription at the switch layer, verified with vnetperf Processor l Architecture of Sandy Bridge (HP Gen8 or higher) l 8 or more virtual cores per virtual machine l No oversubscription l vcpuperf time of no more than 12 seconds ( ~= 2.2 GHz clock speed) Memory l Pre-allocate and reserve memory for the VM l 4G per virtual core of the virtual machines HP has tested the configuration above. While other virtualization configurations may have been successfully deployed by customers in development environments, Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 17 of 5309
  • 18. performance of these configurations may vary. If you choose to run Vertica on a different virtualization configuration and experience an issue, the Vertica Support team may ask you to reproduce the issue using the configuration described above, or in a bare-metal environment, to aid in troubleshooting. Depending on the details of the case, the Support team may also ask you to enter a support ticket with your virtualization vendor. Vertica Integration for Apache Kafka You can use Vertica with the Apache Kafka message broker. Vertica supports the following Kafka distributions: Apache Kafka Versions Vertica Versions 0.8.x 7.2 and later 0.9.0 7.2 SP2 For more information on Kafka integration, refer to How Vertica and Apache Kafka Work Together. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 18 of 5309
New Features
New Features and Changes in Vertica 7.2.2

Read the topics in this section for information about new and changed functionality
in Vertica 7.2.2.

Upgrade and Installation

This section contains information on updates to upgrading or installing Vertica
Analytics Platform 7.2.x.

More Details

For information see Installing Vertica.

Using the admintools Option --force-reinstall to Reinstall Packages

When you upgrade Vertica and restart your database, Vertica automatically upgrades
packages such as the flextable package for flex tables. Vertica issues a warning if
it fails to reinstall one or more packages. Failed reinstallation could occur if,
for example, a user entered an incorrect password when restarting the database
after the upgrade. You can force installation of packages by issuing the new
admintools install_package option, --force-reinstall.

For information on the --force-reinstall option, see Upgrading and Reinstalling
Packages in Installing Vertica.

Setting the pid_max Before Installation

Vertica now requires that pid_max be set to at least 524288 before installing or
upgrading the database platform. This new requirement ensures that there are enough
pids for all the necessary system and Vertica processes. For more information, see
pid_max Setting.

Client Connectivity

This section contains information on updates to connection information for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Connecting to Vertica.

Last Line Record Separator in vsql

You can add a record separator to the last line of vsql output. The -Q command-line
option and the pset option trailingrecordsep enable trailing record separators. For
more information about adding a record separator, see Command-Line Options and
pset NAME [ VALUE ].

Setting a Client Connection Label After Connecting to Vertica

You can change the client connection label after connecting to Vertica. For more
information about this feature, see Setting a Client Connection Label.

Support for Square-Bracket Query Identifiers with ODBC and OLE DB Connections

When connecting to a Vertica database with an ODBC or OLE DB connection, some
third-party clients require you to use square-bracket identifiers when writing
queries. Previously, Vertica did not support this type of identifier. Vertica now
supports square-bracket identifiers and allows queries from those third-party
platforms to run successfully. For more information, see:

l Data Source Name (DSN) Connection Parameters
l OLE DB Connection Properties

Security and Authentication

This section contains information on updates to security and authentication
features for Vertica Analytics Platform 7.2.x.

More Details

For more information see Security and Authentication.

Row-Level Access Policy

You can create access policies at the row level to restrict access to sensitive
information to only those users authorized to view it. Along with the existing
column access policy, the new row-level access policy provides improved data
security. For more information see Access Policies.
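As an illustration, a row-level policy might limit each row of a customer table to
the account's manager. The following is a minimal sketch; the table, column, user,
and role names are hypothetical, and the full syntax is described in Access
Policies:

=> CREATE ACCESS POLICY ON public.customers
   FOR ROWS
   WHERE ENABLED_ROLE('manager')
      OR customers.account_rep = CURRENT_USER()
   ENABLE;

With a policy like this in place, a SELECT against the table returns only the rows
for which the WHERE expression evaluates to true for the current session.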
Data Analysis

This section contains information on updates to data analysis for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Analyzing Data.

Machine Learning for Predictive Analytics

Machine Learning for Predictive Analytics is an analytics package that allows you
to use machine learning algorithms on existing data in your Vertica database.
Machine Learning for Predictive Analytics is included with the Vertica server and
does not need to be downloaded separately. Its features include:

l In-database predictive modeling for regression problems, using linear regression
l In-database predictive modeling for classification problems, using logistic
  regression
l In-database data clustering, using k-means
l Evaluation functions to determine the accuracy of your predictive models
l Normalization functions to use for data preparation

For more information see Machine Learning for Predictive Analytics: An Overview.

Query Optimization

This section contains information on updates to optimization for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Optimizing Query Performance.

Controlling Inner and Outer Join Inputs

ALTER TABLE supports the FORCE OUTER option, which lets you control join inputs for
specific tables. You enable this option at the database and session scopes through
the configuration parameter EnableForceOuter. For details, see Controlling Join
Inputs.

DISTINCT Support in Multilevel Aggregation

Multilevel aggregation supports SELECT..DISTINCT and aggregate functions that
specify the DISTINCT option (AVG, COUNT, MIN, MAX, and SUM). For example, Vertica
supports statements such as the following:

SELECT DISTINCT catid, year FROM sales
   GROUP BY ROLLUP(catid, year) HAVING GROUPING(year) = 0;

SELECT catid, SUM(DISTINCT year), SUM(DISTINCT month), SUM(price) FROM sales
   GROUP BY ROLLUP(catid) ORDER BY catid;

Guaranteed Uniqueness Optimization

Vertica can identify in a query certain columns that guarantee unique values, and
use them to optimize various query operations. Any columns that are defined with
the following attributes contain unique values:

l Columns that are defined with AUTO_INCREMENT or IDENTITY constraints
l Primary key columns where key constraints are enforced
l Columns that are constrained to unique values, either individually or as a set
l A column that is output from one of the following SELECT statement options:
  GROUP BY, SELECT..DISTINCT, UNION, INTERSECT, and EXCEPT

Vertica can optimize queries when these columns are included in the following
operations:

l Left or right outer join
l GROUP BY clause
l ORDER BY clause

For example, a view might contain tables that are joined on columns where key
constraints are enforced, and the join is a left outer join. If you query this
view, and the query contains only columns from a subset of the joined tables,
Vertica can optimize view materialization by omitting the unused tables.

You enable guaranteed uniqueness optimization through the configuration parameter
EnableUniquenessOptimization. You can set this parameter at database and session
scopes. By default, this parameter is set to 1 (enabled).
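For example, to turn the optimization off at the database level, you might use the
standard SET_CONFIG_PARAMETER meta-function; a minimal sketch (the parameter name
comes from this section, the call pattern is the usual one for configuration
parameters):

=> SELECT SET_CONFIG_PARAMETER('EnableUniquenessOptimization', 0);

Setting the value back to 1 re-enables the default behavior.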
Tables

This section contains information on updates to tables for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Managing Tables.

Changing External Table Column Types

You can change the data types of columns in external tables without deleting and
re-creating the table. Because external tables do not store their data in Vertica
ROS files, the data does not need to be transformed. For details, see Changing
Column Data Type. You can use this feature when working with Hive tables accessed
through the HCatalog Connector, which are external tables. See Synchronizing an
HCatalog Schema With a Local Schema.
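As a sketch of what such a change can look like (the table and column names here
are hypothetical), the standard ALTER COLUMN syntax applies:

=> ALTER TABLE ext_sales ALTER COLUMN amount SET DATA TYPE NUMERIC(12,2);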
Loading Data

This section contains information about loading data in Vertica Analytics Platform
7.2.x.

More Details

For more information see:

l Vertica Library for Amazon Web Services
l Using Flex Tables
l COPY

Vertica Library for Amazon Web Services

The Vertica library for Amazon Web Services (AWS) is a set of functions and
configurable session parameters. These parameters allow you to directly exchange
data between Vertica and Amazon S3 storage without any third-party scripts or
programs. For more information, see the Vertica Library for Amazon Web Services
topic.

Performance Enhancements for favroparser

This release includes significant performance improvements when loading Avro files
into columnar tables with the favroparser. These improvements do not affect loading
Avro files into flex tables. For more information about using this parser, see
Loading Avro Data and FAVROPARSER in the Using Flex Tables guide.

Using COPY ERROR TOLERANCE to Treat Each Source Independently

Vertica treats each source independently when loading data using the ERROR
TOLERANCE parameter. Thus, if a source has multiple files and one of them is
invalid, the load continues, but the invalid file does not load. For more
information about this feature, see ERROR TOLERANCE in the topic Parameters.
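For instance, a load from several files might look like the following minimal
sketch (the file paths and table name are hypothetical); with ERROR TOLERANCE, a
bad second file would not abort the load of the first:

=> COPY sales FROM '/data/sales_jan.dat', '/data/sales_feb.dat' ERROR TOLERANCE;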
Management Console

This section contains information on updates to the Management Console for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Using Management Console.

Email Alerts in Management Console

Management Console can generate email alerts when high-priority thresholds are
exceeded. When you set a threshold to Priority 1, you can subscribe users to
receive alerts through email when that threshold is exceeded.

Use the Email Gateway tab in MC Settings to enable MC to send email alerts. See Set
Up Email. You can subscribe to email alerts using the Thresholds tab, which appears
on your database's Settings page. See Customizing Message Thresholds.

MC Message Center Notification Menu

In Management Console 7.2.2, you can view a preview of your most recent messages
without navigating away from your current page. Click the Message Center icon in
the top-right corner of Management Console to view the new notification menu. From
this menu, you can delete, archive, or mark your messages as read. To visit the
Message Center, click the Message Center link in the top-right corner of the
notification menu. For more information, see Monitoring Database Messages in MC.

Syntax Error Highlighting During Query Planning

Management Console can show you query plans in an easy-to-read format on the Query
Plan page. If the query you submit is invalid, Management Console highlights the
parts of your query that might have caused a syntax error. You can immediately
identify errors and correct an invalid query. For more information about query
plans in Management Console, see Managing Queries in MC.

MC Time Information

Using the MC REST API, you can retrieve the following information about the MC
server:

l The MC server's current time
l The timezone where the MC server is located

For more information, see GET mcTimeInfo.

Timestamp Range Parameters with MC Alerts

You can specify that calls to the MC API for MC alerts, their current status, and
database properties return only those alerts within specific timestamp ranges. For
more information, see GET alerts.

System Table Updates

This section contains information on updates to system tables for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica System Tables.

OS User Name Column

The following system tables have been updated to include the CLIENT_OS_USER_NAME
column:

l CURRENT_SESSION
l LOGIN_FAILURES
l SESSION_PROFILES
l SESSIONS
l SYSTEM_SESSIONS
l USER_SESSIONS

This column logs the operating system username of any user who logged into, or
attempted to log into, the database.
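For example, a quick check of the operating system user behind each current session
might look like this minimal sketch (column availability per the list above):

=> SELECT user_name, client_os_user_name FROM v_monitor.sessions;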
KEYWORDS System Table

The V_CATALOG schema includes a new system table, KEYWORDS. You can query this
table to obtain a list of Vertica reserved and non-reserved keywords.

REMOTE_REPLICATION_STATUS System Table

The V_MONITOR schema includes a new system table, REMOTE_REPLICATION_STATUS. You
can query this table to view the status of objects being replicated to an alternate
cluster.

TRUNCATED_SCHEMATA System Table

The V_MONITOR schema includes a new system table, TRUNCATED_SCHEMATA. You can query
this table to view the original names of restored schemas that were truncated due
to length.

SQL Functions and Statements

This section contains information on updates to SQL functions and statements for
Vertica Analytics Platform 7.2.x.

More Details

For more information see the SQL Reference Manual.

Regular Expression Functions

All of the Vertica regular expression functions support LONG VARCHAR strings. This
capability allows you to use the Vertica regex functions on __raw__ columns in flex
or columnar tables. Using any of the regex functions to pattern match in __raw__
columns requires casting. These are the current functions:

l REGEXP_COUNT
l REGEXP_ILIKE
l REGEXP_INSTR
l REGEXP_NOT_ILIKE
l REGEXP_NOT_LIKE
l REGEXP_LIKE
l REGEXP_REPLACE
l REGEXP_SUBSTR

This Vertica version adds these functions to the documentation:

l REGEXP_ILIKE
l REGEXP_NOT_ILIKE
l REGEXP_NOT_LIKE
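Because __raw__ is a binary column, pattern matching requires the cast noted above.
A minimal sketch against a hypothetical flex table:

=> SELECT REGEXP_COUNT(__raw__::LONG VARCHAR, 'error') FROM app_logs_flex;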
Backup, Restore, and Recovery

This section contains information on updates to backup and restore operations for
Vertica Analytics Platform 7.2.x.

More Details

For more information see Backing Up and Restoring the Database.

Backup Integrity Check

Vertica can confirm the integrity of your database backups. For more information
about this feature, see Checking Backup Integrity.

Backup Repair

Vertica can reconstruct backup manifests and remove unneeded backup objects. For
more information about this feature, see Repairing Backups.

Recover Tables in Parallel

During a recovery, Vertica recovers multiple tables in parallel. For more
information on this feature, see Recovery By Table.

Replicate Tables and Schemas to an Alternate Database

You can replicate Vertica tables and schemas from one database to alternate
clusters in your organization. Using this strategy helps you:

l Replicate objects to a secondary site.
l Move objects between test, staging, and production clusters.

For more information on this feature, see Replicating Tables and Schemas to an
Alternate Database.
Place

This section contains information on updates to Vertica Place for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica Place.

Additional WGS84 Supported Functions

WGS84 support has been added to the following functions:

l ST_Contains
l ST_Disjoint
l ST_Intersects
l ST_Touches
l ST_Within

For information about which data types and data type combinations are supported,
see each function's reference page.

Exporting Spatial Data to a Shapefile

Vertica supports exporting spatial data as a shapefile. For more information about
this feature, see Exporting Spatial Data from a Table.

Use STV_MemSize to Optimize Tables for Geospatial Queries

STV_MemSize returns the amount of memory used by a spatial object. When you are
optimizing your table for performance, you can use this function to get the optimal
column width for your data. For more information about this feature, see
STV_MemSize.
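For instance, to size a GEOMETRY column you might take the largest value the
function reports; a minimal sketch with hypothetical table and column names:

=> SELECT MAX(STV_MemSize(geom)) FROM regions;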
Kafka Integration

This section contains information on updates to Kafka integration for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Integrating with Apache Kafka.

Apache Kafka 0.9 Support

Vertica supports Apache Kafka 0.9 integration.

Multiple Kafka Cluster Support

Vertica can accept data from multiple Kafka clusters streaming into a single
Vertica instance. For more information, refer to Kafka Cluster Utility Options in
Kafka Utility Options.

Parse Custom Formats

Vertica supports the use of user-defined filters to manipulate data arriving from
Kafka. You can apply these filters to data before you parse it. For more
information about this feature, see Parsing Custom Formats.

Parse Kafka Messages Without Schema and Metadata

The KafkaAVROParser includes the parameter with_metadata. When set to TRUE, the
KafkaAVROParser parses messages without including object and schema metadata. For
more information about this feature, see Using COPY with Kafka.

kafka_clusters System Table

The kafka_config schema includes a new system table, kafka_clusters. You can query
this table to view your Kafka clusters and their constituent brokers.

New Features and Changes in Vertica 7.2.1

Read the topics in this section for information about new and changed functionality
in Vertica 7.2.1.

Supported Platforms

This section contains information on updates to supported platforms for Vertica
Analytics Platform 7.2.x.

More Details

For complete information on platform support see Vertica 7.2.x Supported Platforms.

Software Download

For more information on changes to the operating system in the Red Hat 7 release,
see the Red Hat Enterprise Linux 7 documentation.

Windows 10 Support for Client Drivers

Vertica has added Windows 10 support for the following client drivers:
l ADO.NET, both 32-bit and 64-bit
l JDBC
l ODBC, both 32-bit and 64-bit
l Vertica vsql

For more information see Vertica 7.2.x Client Drivers.

Security and Authentication

This section contains information on updates to security and authentication
features for Vertica Analytics Platform 7.2.x.

More Details

For more information see Security and Authentication.

Authentication Support for Chained Users and Roles

You can now enable authentication for a chain of users and roles, rather than
enabling authentication for each user and role separately. For more information see
Implementing Client Authentication.

Restrict System Tables

The new security parameter RestrictSystemTables prohibits users from accessing
sensitive information in some system tables. For more information see System Table
Restriction.

Query Optimization

This section contains information on updates to optimization for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Optimizing Query Performance.

Batch Export of Directed Queries

This release provides two new meta-functions that let you batch export directed
queries from one database to another. These tools are useful for saving query plans
before a scheduled version upgrade:

l EXPORT_DIRECTED_QUERIES batch exports query plans as directed queries to an
  external SQL file.
l IMPORT_DIRECTED_QUERIES lets you selectively import query plans that were
  exported by EXPORT_DIRECTED_QUERIES from another database.

For more information on using these tools, see Batch Query Plan Export.

UTYPE Hint

The UTYPE hint specifies how to combine UNION ALL input.

SQL Functions and Statements

This section contains information on updates to SQL functions and statements for
Vertica Analytics Platform 7.2.x.

More Details

For more information see the SQL Reference Manual.

THROW_ERROR Function

This release adds the function THROW_ERROR, which allows you to generate arbitrary
errors. For more information, see THROW_ERROR.
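A minimal sketch of raising a custom error, for example from validation logic (the
message text is arbitrary):

=> SELECT THROW_ERROR('Data validation failed');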
Management Console

This section contains information on updates to the Management Console for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Management Console.

Management Console Message Center Redesign

Management Console now brings focus to time-sensitive and high-priority messages
with a redesigned Message Center. The new Recent Messages inbox displays your
messages from the past week, while messages you have read are archived in the
Archived Messages inbox. You can also click Threshold Messages to view only alerts
about exceeded thresholds in your database.

To help you prioritize the messages you view, Message Center displays the number of
messages categorized as High Priority, Needs Attention, and Informational. You can
click any of these values to filter by that priority. For more information about
Message Center, see Monitoring Database Messages in MC.

Password Requirements in Management Console

Starting with Vertica 7.2.1, Management Console (MC) passwords must contain 5 - 25
characters. Passwords created in previous versions of MC continue to work
regardless of length. If you change a password after updating to 7.2.1, MC enforces
the new length requirement.

System Table Updates

This section contains information on updates to system tables for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica System Tables.

License Audits System Table

Vertica has changed the LICENSE_AUDITS system table column LICENSE_NAME to
AUDITED_DATA. Refer to the section LICENSE_AUDITS in the SQL Reference Manual for
more information.

Licenses System Table

Vertica has changed the LICENSES system table column IS_COMMUNITY_EDITION to
IS_SIZE_LIMIT_ENFORCED. Refer to the section LICENSES in the SQL Reference Manual
for more information.

SDK Updates

This section contains information on updates to the SDK for Vertica Analytics
Platform 7.2.x.

More Details

For more information see the Java SDK Documentation.
Enhancements for C++ User-Defined Function Parameters

This release adds new functionality for C++ user-defined function parameters. For
more information, see:

l Defining the Parameters Your UDx Accepts
l Specifying the Behavior of Passing Unregistered Parameters
l USER_FUNCTION_PARAMETERS

Management Console API Alert Filters

New filters are available when you use the Vertica REST API to retrieve information
on alerts configured in Management Console. For information on applying these
category and sub-category filters, see:

l Thresholds Category Filter
l Combining Sub-Category Filters with Category Filters
l Database Name Category Filter

Text Search Updates

This section contains information on updates to Text Search for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Using Text Search.

Text Search Features

This release adds the following functionality for text search:

l Tokenizers are now polymorphic and can accept any number and type of columns.
l Text indices can now contain multiple columns from their source table.

For more information see Using Text Search.

New Features and Changes in Vertica 7.2.0

Read the topics in this section for information about new and changed functionality
in Vertica 7.2.0.
Supported Platforms

This section contains information on updates to supported platforms for Vertica
Analytics Platform 7.2.x.

More Details

For complete information on platform support see Vertica 7.2.x Supported Platforms.

Software Download

For more information on changes to the operating system in the Red Hat 7 release,
see the Red Hat Enterprise Linux 7 documentation.

Support for Red Hat Enterprise Linux 7 and CentOS 7

This release adds support for Red Hat Enterprise Linux 7 and CentOS 7 on the HPE
Vertica Analytics Platform and Management Console. You cannot perform a direct
upgrade of Vertica Analytics Platform 7.2.x from Red Hat 6.x to 7 or from CentOS
6.x to 7. For information on how to upgrade to Red Hat 7 or CentOS 7, see Migration
Guide for Red Hat 7/CentOS 7.

Requirements Before Upgrading or Installing

This section contains updated information on the tasks you must complete before you
install or upgrade Vertica.

Dialog Package

Vertica now requires the dialog package to be installed on all nodes in your
cluster before installing or upgrading the database platform. See Package
Dependencies for more information.

Licensing and Auditing

Vertica 7.2.x has changed its licensing scheme and the way it audits licenses.

More Details

For complete license information see Managing Licenses.

Vertica Enterprise Edition License Is Now Premium Edition

Enterprise Edition is now Premium Edition.

l If you have a current Enterprise Edition license, it is still valid for use with
  Vertica 7.2.x.
l The new Premium Edition includes Flex Tables. You no longer need a separate
  license for Flex Zone. (The new Premium Edition does not include a Flex Table
  data limit; you can add Flex Table data up to your general license limit. Flex
  data counts as 1/10th the cost of regular data towards your license limits.)
l Premium Edition includes all Vertica functionality. Hadoop requires a separate
  license.

For more information see Understanding Vertica Licenses.

Vertica Database Audits

IMPORTANT: The changes in this section became effective with Vertica Release 7.1.2,
except where noted.

Vertica has made storage changes related to licensing. As a result, the Vertica
database audit size is calculated differently and is reduced from audit sizes
calculated prior to Vertica Version 7.1.2. Vertica now computes the effective size
of the database based on the export size of the data.

l Vertica no longer counts a 1-byte delimiter value in the effective size of the
  database. Instead, the Vertica audit license size is now based solely on the data
  width.
l Vertica no longer adds a 1-byte value to account for each delimiter. Under the
  new sizing rules, null values are free. Thus, the Vertica audit size may be
  greatly reduced from the previous version's audit size.
l As of Vertica Release 7.2.x, Flex data counts as only 1/10th the cost of non-Flex
  data.

As a result of these changes, compression ratios show less compression than in
previous versions. You can find detailed information on how Vertica calculates
database size in the Calculating the Database Size section of the Administrator's
Guide.

Client Connectivity

This section contains information on updates to connection information for Vertica
Analytics Platform 7.2.x.
More Details

For more information see Connecting to Vertica.

Vertica Client Drivers and Tools for Windows

This release adds a new installer, Vertica Client Drivers and Tools for Windows,
for connecting to Vertica. The installer is packaged as an .exe file. You can run
the installer as a regular Windows installer or silently. The installer is
compatible with both 32-bit and 64-bit machines.

The installer contains the following client drivers and tools:

l The ODBC Client Driver for Windows
l The OLE DB Client Driver for Windows
l The vsql Client for Windows
l The ADO.NET Driver for Windows
l The Microsoft Connectivity Pack for Windows
l The Visual Studio Plug-in for Windows

For more information on installing the Client Drivers and Tools for Windows, see
The Vertica Client Drivers and Tools for Windows. Download the latest Client
Drivers and Tools for Windows from my.vertica.com (logon required).

Multiple Active Result Sets (MARS) Support

Vertica now supports multiple active result sets when you use a JDBC connection.
For more information, see Multiple Active Result Sets (MARS).

VHash Class for JDBC

You can use the VHash class as an implementation of the Vertica built-in hash
function when connecting to your database with a JDBC connection. For more
information, see Pre-Segmenting Data Using VHash.

Binary Transfer with ADO.NET Connections

When connecting between your ADO.NET client application and your Vertica database,
you can now use binary transfer instead of string transfer. See ADO.NET Connection
Properties for more information.
Python Client

Vertica offers a Python client that allows you to interface with your database. For
more information, see Vertica Python Client.

Security and Authentication

This section contains information on updates to security and authentication
features for Vertica Analytics Platform 7.2.x.

More Details

For more information see Security and Authentication.

Admintools Remote Calls

Prior to Vertica Analytics Platform 7.2.x, you performed admintools remote calls by
using SSH to connect to a remote cluster and then running shell commands. This
configuration allowed system database administration users to perform any system
action without limitation, which prevented organizations from auditing the complete
set of actions that admintools can perform.

Vertica Analytics Platform 7.2.x addresses this with the following remote python
module invocation:

python -m <command>

Using this module allows organizations to limit the system database administration
user to executing python modules under the .../vertica/engine/api directory.

CFS Security

You can now use the Connector Framework Service (CFS) to ingest indexed HPE IDOL
data securely into the Vertica Analytics Platform. This option allows you to use
the Vertica Analytics Platform to perform analytics on data indexed by HPE IDOL.
The new security features control access to specific documents using:

l Access Control Lists (ACL) indicating which users and groups can access a
  document.
l Security Information Strings that associate a user/group with a specific ACL.

For more information see Connector Framework Service.

Inherited Privileges

The new inherited privileges feature allows you to grant privileges at the schema
level. This approach automatically grants privileges to a new or existing table in
the schema. By using inherited privileges, you can:

l Eliminate the need to apply the same privileges to each individual table in the
  schema.
l Quickly create new tables that have the necessary privileges for users to perform
  required tasks.

For more information see Inherited Privileges Overview in the Administrator's
Guide.
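As an illustration only (the exact enabling steps are described in Inherited
Privileges Overview; the schema and user names here are hypothetical), a
schema-level grant that tables can then inherit might look like:

=> GRANT USAGE ON SCHEMA reporting TO analyst;
=> GRANT SELECT ON SCHEMA reporting TO analyst;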
LDAP Link

The new LDAP Link service allows the Vertica server to tightly couple with an
existing directory service such as MS Active Directory or OpenLDAP. Using the LDAP
Link service, you can specify that the Vertica server synchronize:

l LDAP users to Vertica database users
l LDAP groups to Vertica groups
l LDAP user and group membership to Vertica user and role membership

Any changes in the linked directory service are reflected in the Vertica database
in near real time. For example, if you create a new user in LDAP, and LDAP Link is
active, that user identity is sent to the Vertica database upon the next
synchronization. For more information see LDAP Link Service.

SYSMONITOR Role

The new System Monitor (SYSMONITOR) role grants access to specific monitoring
utilities without granting full DBADMIN access. This role allows the DBADMIN user
to delegate administrative tasks without compromising security or exposing
sensitive information. A DBADMIN user has the System Monitor role by default. For
more information see SYSMONITOR Role in the Administrator's Guide.

Database Management

This section contains information on updates to database operations for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Managing the Database.

Configuration Parameter ARCCommitPercentage

This parameter sets a threshold percentage of WOS to ROS rows. The specified
percentage determines when to aggregate projection row counts and commit the result
to the Vertica catalog. The default value is 3 (percent). ARCCommitPercentage
provides better control over expensive commit operations and helps reduce extended
catalog locks. For more information see General Parameters in the Administrator's
Guide.

Admintools Debug Option

Vertica has added a --debug option to the admintools command to assist customers
and customer support. When you enable the debug option, Vertica adds additional
information to associated log files.

Note: Vertica often changes the format or content of log files in subsequent
releases to benefit both customers and customer support.

For information on admintools command options, refer to Writing Administration
Tools Scripts in the Administrator's Guide.

Automatic Eviction of Unresponsive Nodes

This release adds a new capability to detect and respond to unhealthy nodes in a
Vertica database cluster. See Automatic Eviction of Unhealthy Nodes to learn more.
Reduced Catalog Size

Vertica 7.2.0 reduces catalog size by consolidating statistics storage, removing
unused statistics, and storing unsegmented projection metadata once per database,
instead of once per node. More efficient catalog storage provides the following
benefits:

l Smaller catalog footprint
l Better scalability with large clusters, facilitating faster analytics, backup,
  and recovery
l Reduced overhead and fewer bottlenecks associated with catalog size and usage

For more information see Prepare Disk Storage Locations in Installing Vertica.

Unsegmented Projection Buddies Map to Single Name

Before Vertica 7.2.0, all instances (buddies) of unsegmented projections that were
created by CREATE PROJECTION UNSEGMENTED ALL NODES had unique identifiers:

unseg-proj-name_nodeID

In this identifier, nodeID indicated the node that hosted a given projection buddy.
With the redesign of the database catalog in Vertica 7.2.0, a single name now maps
to all buddies of an unsegmented projection.

Name Conversion Utility

When you upgrade your database to Vertica 7.2.0, all existing projection names
remain unchanged and Vertica continues to support them. You can use the function
merge_projections_with_same_basename() to consolidate unsegmented projection names
so they conform to the new naming convention. This function takes a single
argument, one of the following:

l An empty string specifies to consolidate all unsegmented projection buddy names
  under their respective projection base names. For example:

  => select merge_projections_with_same_basename('');
   merge_projections_with_same_basename
  --------------------------------------
   0
  (1 row)

l The base name of the projection whose buddy names you want to convert, as shown
  in the sketch after this list.
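Continuing the list above, converting only the buddies of a single projection might
look like this (the base name is hypothetical):

=> SELECT merge_projections_with_same_basename('clicks_super');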
Query Optimization

This section contains information on updates to optimization for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Optimizing Query Performance.

Database Designer

Database Designer performance and scalability on large (>100 node) clusters has
been significantly improved. For more information see About Database Designer in
the Administrator's Guide.

Optimizer Memory Usage

Memory usage by the database optimizer has been reduced. For more information on
the optimizer see Optimize Query Performance.

Directed Queries and New Query Hints

Directed queries encapsulate information that the optimizer can use to create a
query plan. Directed queries serve two goals:

l Preserve current query plans before a scheduled upgrade.
l Enable you to create query plans that improve optimizer performance.

For more information about directed queries, see Directed Queries in the
Administrator's Guide.

Directed queries rely on a set of new optimizer hints, which you can also use in
vsql queries. For information about these and other hints, see Hints in the SQL
Reference Manual.

JOIN Performance

Vertica 7.2.0 improves the performance of hash join queries through parallel
construction of the hash table. For more information see Joins in Analyzing Data.

Terrace Routing Reduces Buffering Requirements

Terrace routing is a feature that can reduce the buffer requirements of large
queries. Use terrace routing in situations where you have large queries and
clusters with a large number of nodes. Without terrace routing, these situations
would otherwise require excessive buffer space. For more information see Terrace
Routing in the Administrator's Guide.

Changed Behavior for Live Aggregate/Top-K Projections

Live aggregate and Top-K projections now conform to the behavior of other Vertica
projections, as follows:

l Vertica's database optimizer automatically directs queries that specify
  aggregation to the appropriate live aggregate or Top-K projection. In previous
  releases, you could access pre-aggregated data only by querying live
  aggregate/Top-K projections; now you can query the anchor tables.
l Live aggregate/Top-K projections no longer require anchor projections. Vertica
  now loads new data directly into live aggregate/Top-K projections.
l Live aggregate/Top-K projections must now explicitly specify a level of K-safety
  equal to or greater than that configured for the Vertica database. Otherwise,
  Vertica does not create buddy projections and cannot update projection data.

For more information see Creating Live Aggregate Projections in Analyzing Data.

Pre-Aggregation of UDx Function Results

You can create live aggregate projections that invoke user-defined transform
functions (UDTFs). Vertica processes these functions in the background and stores
their results on disk, to minimize overhead when you query those projections. For
more information, see Pre-Aggregating UDTF Results in Analyzing Data.

A projection can also specify a user-defined scalar function like any other
expression. When you load data into this projection, Vertica stores the function
result set for faster access. See Support for User-Defined Scalar Functions in
Analyzing Data.

Improved Queries in Flex Views

This release supports flex query rewriting more broadly: rewriting now occurs
anytime a __raw__ column is present. With this new functionality, a view is
considered a flex view if it includes a __raw__ column, and thus supports
rewriting. For more information see Querying Flex Views.
JIT Support for Regular Expression Matching

Vertica has upgraded the PCRE library to version 8.37. This upgrade includes Just
in Time (JIT) compilation for the functions used in regular expression matching in
SQL queries. For more information, see the parameter PatternMatchingUseJIT in
General Parameters.

Loading Data

This section contains information about loading data in flex tables, and using
Kafka integration for Vertica Analytics Platform 7.2.x. Vertica 7.2.0 includes two
new parsers for flex tables:

l Avro file parser, favroparser
l CSV (comma-separated values) parser, fcsvparser

This release also extends flex view and __raw__ column queries. Whenever you query
a flex table, Vertica calls maplookup() internally to include any virtual columns.
In this release, querying flex views, or any table with a __raw__ column, also
invokes maplookup() internally to improve query results.

More Details

For more information see Integrating with Apache Kafka and Understanding Flex
Tables.

Flex Parsers

Vertica 7.2.0 introduces two new parsers.

Avro Parser

Use favroparser to load Avro files into flex and columnar tables. For flex tables,
this parser supports files with primitive and complex data types, as described in
the Apache Avro 1.3 specification.
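A minimal load sketch, with hypothetical table and file names:

=> COPY events_flex FROM '/home/dbadmin/data/events.avro' PARSER favroparser();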
CSV Parser

Use the flex tables CSV parser, fcsvparser, to load standard and modified CSV files
into flex and columnar tables. For more information, see Flex Parsers Reference.

Vertica and Apache Kafka Integration

Vertica 7.2.0 introduces the ability to integrate with Apache Kafka. This feature
allows you to stream data from a Kafka message bus directly into your Vertica
database. It uses the job scheduler feature included in the Vertica rpm package.
For more information, see the Kafka Integration Guide.

Management Console

This section contains information on updates to the Management Console for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Management Console.

Connect to HPE IDOL Dashboard from Management Console

Vertica 7.2.0 Management Console introduces the ability to create a mutual link
between your Vertica Management Console and HPE IDOL dashboards. The new HPE IDOL
button in MC displays the number of alerts you have in HPE IDOL. The button
provides a clickable shortcut to your HPE IDOL dashboard. Point MC to your HPE IDOL
dashboard in MC Settings. For more information see Connecting to HPE IDOL Dashboard
in Using Management Console.

External Data Sources for Management Console Monitoring

You can now use Management Console (MC) to monitor Data Collector information
copied into Vertica tables, locally or remotely. In the MC Settings page, provide
mappings to local schemas or to an external database that contains the
corresponding DC data. MC can then render its charts and graphs from the new
repository instead of from local DC tables. This offers the benefit of loading
larger sets of data faster in MC, and of retaining historical data long term.
Administrators can configure MC to monitor an external data source using the new
Data Source tab on the MC Settings page.
See Monitoring External Data Sources in Management Console.

Configure Resource Pools Using Management Console

In this release, Management Console introduces more ways to configure your resource
pools. With Management Console you can now create and remove resource pools, assign
resource pool users, and assign cascading pools. Database administrators can make
changes to a database's resource pools in the Resource Pools Configuration page,
accessible through the database's Settings page. See Configuring Resource Pools in
Management Console.

Threshold Monitoring Enhancements in Management Console

Vertica 7.2.0 Management Console introduces more detailed, configurable
notifications about the health of monitored databases. Prioritize and customize
your notifications and alert thresholds on the new Thresholds tab, which appears on
the Database Settings page. The new Threshold Settings widget on the Overview page
now displays your prioritized alerts. For more information, see Monitoring Database
Messages in MC and Customizing Message Thresholds.

Management Console Message Center Enhancements

In this release, Management Console introduces filtering and performance
enhancements to the Message Center that allow you to:

l View up to 10,000 messages by default
l Retrieve additional alerts from the past
l Use console.properties to increase the number of messages you can view in
  Message Center
l Delete all your alerts at once

In addition, improvements to filtering now allow you to sort messages by severity,
database name, message description, and date. For more information about messages
in Management Console, see Monitoring Database Messages in MC.

Management Console REST API

Vertica now provides API calls to interact with Management Console:

l GET alerts
l GET alertSummary
System Table Updates

This section contains information on updates to system tables for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica System Tables.

System Tables for Constraint Enforcement

These system tables, under the V_CATALOG schema, include the new column IS_ENABLED:

l CONSTRAINT_COLUMNS
l TABLE_CONSTRAINTS
l PRIMARY_KEYS

Under the V_CATALOG schema, the PROJECTIONS system table includes the new column
IS_KEY_CONSTRAINT_PROJECTION.

RESOURCE_POOL_MOVE System Table

Vertica 7.2.x includes the following changes to the RESOURCE_POOL_MOVE system
table:

l The table includes the MOVE_CAUSE column. This column displays the reason why the
  query attempted to move.
l The CAP_EXCEEDED column was removed.
l The REASON column is now called RESULT_REASON.

For more information see RESOURCE_POOL_MOVE in the SQL Reference Manual.

LICENSES System Table

The system table LICENSES, under the V_CATALOG schema, includes new columns:

l LICENSETYPE
l PARENT
l CONFIGURED_ID

TABLE_RECOVERIES System Table

Vertica 7.2.0 now includes the TABLE_RECOVERIES system table. You can query this
table to view detailed progress on specific tables during a Recovery By Table.

TABLE_RECOVERY_STATUS System Table

Vertica 7.2.0 now includes the TABLE_RECOVERY_STATUS system table. You can query
this table to view the progress of a Recovery By Table.

SQL Functions and Statements

This section contains information on updates to SQL functions and statements for
Vertica Analytics Platform 7.2.x.

More Details

For more information see the SQL Reference Manual.

Analytic Functions

Vertica now includes the NTH_VALUE analytic function, which returns the value
evaluated at the row that is the nth row of the window (counting from 1). For more
information see Analytic Functions in the SQL Reference Manual.

Math Functions

Vertica now includes the following mathematical functions:

l COSH—Calculates the hyperbolic cosine.
l LOG10—Calculates the base 10 logarithm.
l SINH—Calculates the hyperbolic sine.
l TANH—Calculates the hyperbolic tangent.
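For example, the results of the following follow directly from the mathematical
definitions:

=> SELECT LOG10(1000), COSH(0), SINH(0), TANH(0);  -- returns 3, 1, 0, 0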
Options for Routing Queries

Vertica 7.2.0 introduces new functionality for moving queries to different resource
pools. Now, as the database administrator, you can use the
MOVE_STATEMENT_TO_RESOURCE_POOL meta-function to specify that queries move to
different resource pools mid-execution. For more information see Manually Moving
Queries to Different Resource Pools.

Session Resource Functions

Vertica now includes new resource management functions:

l RESERVE_SESSION_RESOURCE
l RELEASE_SESSION_RESOURCE

Automatic Enforcement of Primary and Unique Key Constraints

Vertica can now automatically enforce primary and unique key constraints.
Additionally, you can enable individual constraints using CREATE TABLE or ALTER
TABLE. You also have the option of setting parameters so that new constraints you
create are, by default, disabled or enabled when you create them. If you have not
specifically enabled or disabled constraints using CREATE TABLE or ALTER TABLE, the
parameter default settings apply.

For information on automatic enforcement of PRIMARY and UNIQUE key constraints,
refer to Enforcing Primary and Unique Key Constraints Automatically in the
Administrator's Guide.

When you upgrade to Vertica 7.2.0, the primary and unique key constraints in any
tables you carry over are disabled. Existing constraints are not automatically
enforced. To enable existing constraints and make them automatically enforceable,
manually enable each constraint using the ALTER TABLE ALTER CONSTRAINT statement.
This statement triggers constraint enforcement for the existing table contents.
Statements roll back if one or more violations occur.
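A minimal sketch of enabling an existing primary key after an upgrade (the table
name is hypothetical; check the actual constraint name in TABLE_CONSTRAINTS first):

=> ALTER TABLE public.orders ALTER CONSTRAINT C_PRIMARY ENABLED;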
Enabling and Disabling Individual Constraints

Two new modifiers allow you to set enforcement of individual constraints: ENABLED
and DISABLED. To enable or disable individual constraints, use the CREATE TABLE or
ALTER TABLE statements. These syntaxes now include ENABLED and DISABLED options for
PRIMARY and UNIQUE keys:

l Column-Constraint (as part of the CREATE TABLE statement)
l Table-Constraint (as part of CREATE TABLE or ALTER TABLE statements)

The ALTER TABLE statement also includes an ALTER CONSTRAINT option for enabling or
disabling existing constraints.

Choosing Default Enforcement for Newly Declared or Modified Constraints

Two new parameters allow you to set the default for enabling or disabling newly
created constraints. You set these parameters using the ALTER DATABASE statement.
Setting a constraint as enabled or disabled when you create or alter it using
CREATE TABLE or ALTER TABLE overrides the parameter setting. The default value for
both of these new parameters is false (disabled).

l EnableNewPrimaryKeysByDefault lets you enable or disable constraints for primary
  keys.
l EnableNewUniqueKeysByDefault lets you enable or disable constraints for unique
  keys.

For general information about configuration parameters, refer to Configuration
Parameters in the Administrator's Guide. For information about these new parameters
and how to set them, refer to Constraint Enforcement Parameters in the
Administrator's Guide and ALTER DATABASE in the SQL Reference Manual.
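For example, to enforce all newly declared primary keys by default, a sketch using
the ALTER DATABASE statement named above (the database name is hypothetical):

=> ALTER DATABASE mydb SET EnableNewPrimaryKeysByDefault = 1;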
Behavior Not Changed from Previous Releases

NOT NULL constraints are always automatically enforced for primary keys. When you
create a primary key, Vertica implicitly creates a NOT NULL constraint on the key
set. It does so regardless of whether you enable or disable the key.

You can manually validate constraints using the ANALYZE_CONSTRAINTS meta-function.
ANALYZE_CONSTRAINTS does not depend upon, nor does it consider, the automatic
enforcement settings of primary or unique keys. Thus, you can run
ANALYZE_CONSTRAINTS on a table or schema that includes:

l Disabled key constraints
l A mixture of enabled and disabled key constraints

Recover by Table Functions

Vertica now includes new functions to configure and perform recovery on a per-table
basis:

l SET_RECOVER_BY_TABLE
l SET_TABLE_RECOVER_PRIORITY

Backup, Restore, and Recovery

This section contains information on updates to backup and restore operations for
Vertica Analytics Platform 7.2.x.

More Details

For more information see Backing Up and Restoring the Database.

Backup to Local Host

Vertica does not support the Linux variable localhost. However, it does allow you
to direct backups to the local host without using an IP address. Direct a backup to
a location on the local host by including square brackets and a path in the
following form:

[Mapping]
NodeName = []:/backup/path

This example shows typical localhost mapping:

[Mapping]
v_node0001 = []:/scratch_drive/archive/backupdir
v_node0002 = []:/scratch_drive/archive/backupdir
v_node0003 = []:/scratch_drive/archive/backupdir

For more information see Types of Backups.
Restoring Individual Objects from a Full or Object-Level Backup

You can now restore individual tables or schemas from any backup that contains
those objects without restoring the entire backup. This option is useful if you
only need to restore a few objects and want to avoid the overhead of a larger-scale
restore. Your database must be running and your nodes must be UP to restore
individual objects. For more information see Restoring Individual Objects from a
Full or Object-Level Backup.

Lightweight Partition Copy

Vertica now includes the COPY_PARTITIONS_TO_TABLE function. Lightweight partition
copy increases performance by sharing the same storage between two tables. The
storage footprint does not increase as a result of shared storage. After the
partition copy is complete, the tables are independent of each other. Users can
perform operations on each table without impacting the other. As the tables
diverge, the storage footprint may increase as a result of operations performed on
these tables.
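A minimal sketch of copying one partition range between tables (the table names and
partition key values are hypothetical):

=> SELECT COPY_PARTITIONS_TO_TABLE('sales', '2015', '2015', 'sales_2015_copy');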
Object Restore Mode

You can now specify how Vertica should handle restored objects. When you configure
vbr, it prompts:

Object restore mode (coexist, createOrReplace or create) (createOrReplace):

Vertica supports the following object restore modes:

l createOrReplace (default) — Vertica creates any objects that do not exist. If the
  object does exist, vbr overwrites it with the version from the archive.
l create — Vertica creates any objects that do not exist. If an object being
  restored does exist, Vertica displays an error message and skips that object.
l coexist — Vertica creates all restored objects with the form
  <backup>_<timestamp>_<object_name>. This approach allows existing and restored
  objects to exist simultaneously.

In all modes, Vertica restores data with the current epoch. Object restore mode
settings do not apply to backups and full restores. For more information see
Restoring Object-Level Backups.

Recovery by Table

Vertica now supports node recovery on a per-table basis. Unlike a node-based
recovery, recovering by table makes tables available as they recover, before the
node itself is completely restored. You can prioritize your most important tables
so that they become available as soon as possible. Recovered tables support all DDL
and DML operations. After a node fully recovers, it enables full Vertica
functionality. Recovery by table is enabled by default. For more information see
Recovery By Table and Prioritizing Table Recovery.

Hadoop Integration

This section contains information on updates to Hadoop integration for Vertica
Analytics Platform 7.2.x.

More Details

For more information see Integrating with Hadoop.

Hadoop HDFS Connector

The HDFS Connector is now installed with Vertica; you no longer need to download
and install it separately. If you have previously downloaded and installed this
connector, uninstall it before you upgrade to this release of Vertica to get the
newest version. For more information see Using the HDFS Connector in Integrating
with Hadoop.

Place

This section contains information on updates to Vertica Place for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica Place.

WGS84 Support

WGS84 support has been added to the following functions:

l STV_Intersect Transform Function
l STV_Intersect Scalar Function
l ST_Distance

Vertica Place Functions

The following new functions have been added to Vertica Place:

l STV_AsGeoJSON
l STV_ForceLHR
l STV_Reverse

STV_Refresh_Index Removes Polygons From Spatial Indexes

STV_Refresh_Index can now remove deleted polygons from spatial indexes. For more
information, see STV_Refresh_Index.

Vertica Pulse

This section contains information on updates to Pulse for Vertica Analytics
Platform 7.2.x.

More Details

For more information see Vertica Pulse.

Action Patterns

Vertica Pulse now supports the use of action patterns in white_list dictionaries.
An action pattern enables Pulse to recognize phrases that denote action, intention,
or interest, such as going to buy, waiting to see, and so on. Action patterns can
identify behaviors associated with your sentiment analysis terms. Action patterns
can:

l Connect Word Forms to a Root Word — Vertica Pulse lemmatizes all words.
  Lemmatization recognizes different word forms and maps them to the root word. For
  example, Pulse would map bought and buying to buy. This ability extends to
  misspellings. For example, tryiiiing and seeeeeing taaablets would map to trying
  and seeing tablets.
l Create Object-Specific Queries — To identify only the attributes that are objects
  of action patterns, create a whitelist dictionary that contains only action
  patterns of interest. In your sentiment analysis query, set the actionPattern and
  whiteListOnly parameters to true.
Concurrent User-Defined Dictionaries

In Vertica 7.2.0 and later, users can apply dictionaries on a per-user basis. Any
number of Pulse users can concurrently apply different sets of dictionaries without
conflicts and without disrupting the sessions of other users. Each user can have
one dictionary of each type loaded at any given time. If a user does not specify a
dictionary of a given type, Pulse uses the default dictionary for that type. For
more information see Dictionary and Mapping Labels in Vertica Pulse.

Case-Sensitive Sentiment Analysis

By default, Pulse is case insensitive: ERROR produces the same results as error.
You can now specify a case setting for a single word using the $Case parameter. For
example, to identify Apple, rather than apple, you would add the following:

=> INSERT INTO pulse.white_list_en VALUES('$Case(Apple)');
=> COMMIT;

For more information see Sentiment Analysis Levels in Vertica Pulse.

Dictionary and Mapping Labels

You can apply a label to any user-defined dictionary or mapping when you load that
object. Labels enable you to perform sentiment analysis against a predetermined set
of dictionaries and mappings without having to specify a list of dictionaries. For
example, you might have a set of dictionaries labeled "music" and a set labeled
"movies." The default user dictionaries automatically have a label of "default."

A single dictionary or mapping can have multiple labels. For example, you might
label a white list of artists as both "painters" and "renaissance." You could load
the dictionary by loading either label. A label can only apply to one dictionary of
each type. For example, you cannot have two dictionaries of stop words that share
the same label. If you apply a label to multiple dictionaries of the same type,
Pulse uses the most recently applied label.

You can view the labels associated with your current dictionaries using the
GetAllLoadedDictionaries() function. You can also view the label associated with
your current mapping using the GetLoadedMapping() function. For more information,
see Dictionaries and Mappings in Vertica Pulse.

Pulse Functions

Vertica 7.2.0 includes the following new functions:
Pulse Functions

Vertica 7.0.x includes the following new functions:

- UnloadLabeledDictionary()
- UnloadLabeledDictionarySet()
- UnloadLabeledMapping()

SDK Updates

This section contains information on updates to the SDK for Vertica Analytics Platform 7.0.x.

More Details

For more information, see the Java SDK Documentation.

SDK Enhancements

Vertica 7.0.x introduces these enhancements to the Java SDK:

- A new class named StringUtils helps you manipulate string data. See The StringUtils Class for more information.
- The new PartitionWriter.setStringBytes method lets you set the value of BINARY, VARBINARY, and LONG VARBINARY columns using a ByteBuffer object. See the Java UDx API documentation for more details.
- The PartitionWriter class has a new set of methods for writing output, including setLongValue, setStringValue, and setBooleanValue. These methods set the output column value to NULL when passed a Java null reference. When you pass these methods a value, they save the value in the column. These methods save you the steps of checking for null references and calling separate methods to store nulls or values in the columns. For more information, see the entry for PartitionWriter in the Java API documentation.
- The StreamWriter class used with a User-Defined Parser has a new method, setRowFromMap. You can use this method to write a map of column-name/value pairs as a single operation with automatic type coercion. The JSON Parser example demonstrates this method. For more information, see UDParser and ParserFactory Java Interface, particularly the section titled "Writing Data".
- User-Defined Analytic Functions can now be written in Java in addition to C++. See Developing a User-Defined Analytic Function in Java.
- Multi-phase transform functions can now be written in Java in addition to C++. See Creating Multi-Phase UDTFs.

For more information, see the Java SDK Documentation.

UDx Wildcards

Vertica now supports wildcard * characters in place of column names in user-defined functions. You can use wildcards when:

- Your query contains a table in the FROM clause
- You are using a Vertica-supported development language
- Your UDx is running in fenced or unfenced mode

For more information, see Using User-Defined Extensions.

User-Defined Session Parameters

This release adds support for user-defined session parameters. Vertica now supports passing session parameters to a Java or C++ UDx at construction time. For more information, see User-Defined Session Parameters and User-Defined Session Parameters in Java.

Documentation Updates

This section contains information on updates to the product documentation for Vertica Analytics Platform 7.0.x.

More Details

For complete product documentation, see the Vertica Documentation.

Documentation Changes

The following changes are effective for Vertica 7.0.x.
Document Additions and Revisions

The following additions and revisions have been made to the Vertica 7.0.x product documentation.

- Extending Vertica has been reorganized to reduce redundancy and make information easier to find. Specifically:
  - The documentation for each type of User-Defined Extension first presents information that is true for all implementation languages, then presents language-specific information for C++ and Java.
  - All information about developing UDxs, whether general (such as packaging libraries) or about specific APIs, is now found under Developing User-Defined Extensions (UDxs).
  - The documentation for each type of UDx follows the same structure: requirements, class overview, any implementation topics related specifically to that type of UDx, deploying, and examples.
  - Developing UDSFs and UDTFs in R remains a separate section.
- Vertica Concepts has been moved in the Table of Contents. It now appears after New Features and before Installing Vertica. It has also been reorganized and updated to reflect improvements to Vertica.
- The contents of the book Vertica for SQL on Hadoop have been folded into Integrating with Hadoop, which now explains both co-located and separate clusters. The book has been reorganized to make information easier to find and to eliminate redundancies. The ORC Reader has been made more prominent. For more information, see Integrating with Hadoop, in particular Cluster Layout and Choosing Which Hadoop Interface to Use.
- A new Security and Authentication guide consolidates all client/server authentication and security topics. Authentication information was removed from the Administrator's Guide and placed in the new document.
- A standalone Management Console guide has been created. Management Console topics have been removed from the Administrator's Guide. Management Console topics
remain in Installing Vertica (Installing and Configuring Management Console) and Getting Started (Using Management Console).
- A new Hints section in the SQL Reference Manual describes all supported query hints.
- A new Best Practices for DC Table Queries section in Machine Learning for Predictive Analytics describes how to optimize query performance when querying Data Collector tables.
- A new Integrating with Apache Kafka document describes how to integrate Apache Kafka with Vertica.
- Documentation for the Microsoft Connectivity Pack now resides in the Connecting to Vertica document. See The Microsoft Connectivity Pack for Windows.

Removed from Documentation

The following documentation elements were removed from the Vertica 7.0.x product documentation.

- Documentation on partially sorted GROUP BY has been removed.
- The vsql Environment Variables page listed VSQL_DATABASE and SHELL. These environment variables are no longer in use and have been removed from the documentation.
- The previously deprecated function MERGE_PARTITIONS was removed from the SQL Reference Manual.

Deprecated and Retired Functionality

This section describes the two phases HPE follows to retire Vertica functionality:

- Deprecated. Vertica announces deprecated features and functionality in a major or minor release. Deprecated features remain in the product and are functional. Documentation is included in the published release documentation. Accessing the feature can result in informational messages noting that the feature will be removed in
the following major or minor release. Vertica identifies deprecated features in this document.
- Removed. HPE removes a feature in the major or minor release immediately following the deprecation announcement. Users can no longer access the functionality. Vertica announces all feature removals in this document. Documentation describing the retired functionality is removed, but remains in previous documentation versions.

Deprecated Functionality in This Release

In version 7.2, the following Vertica functionality has been deprecated:

- The system-level parameter ConstraintsEnforcedExternally and the related SQL statement SET SESSION CONSTRAINTS_ENFORCED_EXTERNALLY
- The backup and restore configuration parameter overwrite, which is replaced by the objectRestoreMode setting
- Support for running the Vertica Analytics Platform on the ext3 file system
- Prejoin projections
- Buddy projections with different sort orders
- The --compat21 option of the admintools command

See Also

For a description of how Vertica deprecates features and functionality, see Deprecated and Retired Functionality.
Retired Functionality History

The following functionality has been deprecated or removed in the indicated versions:

Functionality | Component | Deprecated Version | Removed Version
Version 6.0 vbr configuration mapping | Server | 7.2 |
Backup and restore overwrite configuration parameter | Server | 7.2 |
Prejoin projections | Server | 7.2 |
Buddy projections with different sort order | Server | 7.2 |
verticaConfig vbr configuration option | Server | 7.1 |
JavaClassPathForUDx configuration parameter | Server | 7.1 |
ADD_LOCATION() | Server | 7.1 |
bwlimit | Server | 7.1 |
Geospatial Package SQL functions: BB_WITHIN, BEARING, CHORD_TO_ARC, DWITHIN, ECEF_CHORD, ECEF_x, ECEF_y, ECEF_z, ISLEFT, KM2MILES, LAT_WITHIN, LL_WITHIN, LLD_WITHIN, LON_WITHIN, MILES2KM, RADIUS_LON, RADIUS_M, RADIUS_N, RADIUS_R, RADIUS_Ra, RADIUS_Rc, RADIUS_Rv, RADIUS_SI, RAYCROSSING, WGS84_a, WGS84_b, WGS84_e2, WGS84_f, WGS84_if, WGS84_r1 | Server | 7.1 | 7.2
EXECUTION_ENGINE_PROFILES counters file handles, memory allocated, and memory reserved | Server | 7.0 |
MERGE_PARTITIONS() | Server | 7.0 |
Administration Tools option check_spread | Server, clients | 7.0 |
krb5 client authentication method | All clients | 7.0 |
Pload Library | Server | 7.0 |
USE SINGLE TARGET | Server | 7.0 | 7.1
scope parameter of CLEAR_PROFILING | Server | 6.1 |
IMPLEMENT_TEMP_DESIGN() | Server, clients | 6.1 |
USER_TRANSFORMS user table | Server | 6.0 |
UPDATE privileges on sequences | Server | 6.0 |
Query Repository, which includes the SYS_DBA.QUERY_REPO table; the functions CLEAR_QUERY_REPOSITORY() and SAVE_QUERY_REPOSITORY(); and the configuration parameters CleanQueryRepoInterval, QueryRepoMemoryLimit, QueryRepoRetentionTime, QueryRepositoryEnabled, SaveQueryRepoInterval, QueryRepoSchemaName, and QueryRepoTableName (see Notes below the table) | Server | 6.0 |
RESOURCE_ACQUISITIONS_HISTORY system table | Server | 6.0 |
Volatility and NULL behavior parameters of CREATE FUNCTION | Server | 6.1 |
Ganglia on Red Hat 4 | Server | 6.0 |
copy_vertica_database.sh | Server | |
restore.sh | Server | |
backup.sh | Server | |
LCOPY (see Notes below the table) | Server, clients | 4.1 (Client) 5.1 (Server) | 5.1 (Client)
MergeOutPolicySizeList | Server | 4.1 | 5.0
EnableStrataBasedMrgOutPolicy | Server | 4.1 | 5.0
ReportParamSuccess | All clients | 4.1 | 5.0
BatchAutoComplete | All clients | 4.1 | 5.0
use35CopyParameters | ODBC, JDBC clients | 4.1 | 5.0
getNumAcceptedRows, getNumRejectedRows | ODBC, JDBC clients | 5.0 |
MANAGED load (server keyword and related client parameter) | Server, clients | 5.0 |
EpochAdvancementMode | Server | 4.1 | 5.0
VT_ tables | Server | 4.1 | 5.0
RefreshHistoryDuration | Server | 4.1 | 5.0

Notes

- While the Vertica Geospatial package has been deprecated, it has been replaced by Vertica Place. This analytics package is available on my.vertica.com/downloads.
- LCOPY: Supported by the 5.1 server to maintain backwards compatibility with the 4.1 client drivers.
- Query Repository: You can still monitor query workloads with the following system tables:
  - QUERY_PROFILES
  - SESSION_PROFILES
  - EXECUTION_ENGINE_PROFILES
  In addition, Vertica Version 6.0 introduced new, robust workload-related system tables:
  - QUERY_REQUESTS
  - QUERY_EVENTS
  - RESOURCE_ACQUISITIONS
  The RESOURCE_ACQUISITIONS system table captures historical information.
- Use the Kerberos gss method for client authentication instead of krb5. See Configuring Kerberos Authentication.
Installing Vertica
Installation Overview and Checklist

This page provides an overview of installation tasks. Carefully review and follow the instructions in all sections of this topic.

Important Notes

- Vertica supports only one running database per cluster.
- Vertica supports installation on one, two, or more nodes. The steps for installing Vertica are the same regardless of how many nodes are in the cluster.
- The prerequisites listed in Before You Install Vertica are required for all Vertica configurations.
- Only one instance of Vertica can be running on a host at any time.
- To run the install_vertica script, as well as to add, update, or delete nodes, you must be logged in as root, or use sudo as a user with all privileges. You must run the script for all installations, including upgrades and single-node installations.

Installation Scenarios

The four main scenarios for installing Vertica on hosts are:

- A single-node install, where Vertica is installed on a single host as a localhost process. This form of install cannot be expanded to more hosts later on and is typically used for development or evaluation purposes.
- Installing to a cluster of physical host hardware. This is the most common scenario when deploying Vertica in a testing or production environment.
- Installing on Amazon Web Services (AWS). When you choose the recommended Amazon Machine Image (AMI), Vertica is installed when you create your instances. For the AWS-specific installation procedure, see Installing and Running Vertica on AWS: The Detailed Procedure rather than the steps for installation and upgrade that appear in this guide.
- Installing to a local cluster of virtual host hardware. This is similar to installing on physical hosts, but with network configuration differences.
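For reference, a cluster install is typically launched as root from one host with the install_vertica script. The following is a sketch only; the host names and RPM file name are placeholders, and the full option list is described in Installing Vertica with the install_vertica Script:

# Run as root on one host; host list and RPM path are placeholders.
/opt/vertica/sbin/install_vertica \
    --hosts node01,node02,node03 \
    --rpm /tmp/vertica-7.2.x.x86_64.RHEL6.rpm \
    --dba-user dbadmin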
Before You Install

Before You Install Vertica describes how to construct a hardware platform and prepare Linux for the Vertica installation. These preliminary steps are broken into two categories:

- Configuring Hardware and Installing Linux
- Configuring the Network

Install or Upgrade Vertica

Once you have completed the steps in the Before You Install Vertica section, you are ready to run the install script. Installing Vertica describes how to:

- Back up any existing databases.
- Download and install the Vertica RPM package.
- Install a cluster using the install_vertica script.
- [Optional] Create a properties file that lets you install Vertica silently.

Note: This guide provides additional manual procedures in case you encounter installation problems.

Upgrading Vertica to a New Version describes the steps for upgrading to a more recent version of the software.

Note: If you are upgrading your Vertica license, refer to Managing Licenses in the Administrator's Guide.

Post-Installation Tasks

After You Install Vertica describes subsequent steps to take after you have run the installation script. Some of the steps can be skipped based on your needs:

- Install the license key.
- Verify that kernel and user parameters are correctly set.
- Install the vsql client application on non-cluster hosts.
- Resolve any SLES 11.3 issues during spread configuration.
- Use the Vertica documentation online, or download and install the Vertica documentation. Find the online documentation and documentation packages to download at http://my.vertica.com/docs.
- Install client drivers.
- Extend your installation with Vertica packages.
- Install or upgrade the Management Console.

Get started!

- Read the Concepts Guide for a high-level overview of the HPE Vertica Analytics Platform.
- Proceed to Installing and Connecting to the VMart Example Database in Getting Started, where you will be guided through setting up a database, loading sample data, and running sample queries.
About Linux Users Created by Vertica and Their Privileges

This topic describes the Linux accounts that the installer creates and configures so that Vertica can run. When you install Vertica, the installation script optionally creates the following Linux user and group:

- dbadmin: the administrative user
- verticadba: the group for DBA users

dbadmin and verticadba are the default names. If you want to change what these Linux accounts are called, you can do so using the installation script. See Installing Vertica with the install_vertica Script for details.

Before You Install Vertica

See the following topics for more information:

- Installation Overview and Checklist
- General Hardware and OS Requirements and Recommendations

When You Install Vertica

The Linux dbadmin user owns the database catalog and data storage on disk. When you run the install script, Vertica creates this user on each node in the database cluster. It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the account as follows:

- Configures and authorizes dbadmin for passwordless SSH between all cluster nodes. SSH must be installed and configured to allow passwordless logins. See Enable Secure Shell (SSH) Logins.
- Sets the dbadmin user's shell to /bin/bash, which is required to run scripts such as install_vertica and the Administration Tools.
- Provides read-write-execute permissions on the following directories:
  - /opt/vertica/*
  - /home/dbadmin: the default directory for database data and catalog files (configurable through the install script)

Note: The Vertica installation script also creates a Vertica database superuser named dbadmin. The two share the same name, but they are not the same: one is a Linux user and the other is a Vertica user. See Database Administration User in the Administrator's Guide for information about the database superuser.

After You Install Vertica

Root or sudo privileges are not required to start or run Vertica after the installation process completes. The dbadmin user can log in and perform Vertica tasks, such as creating a database, installing or changing the license key, or installing drivers. If dbadmin wants database directories in a location that differs from the default, the root user (or a user with sudo privileges) must create the requested directories and change their ownership to the dbadmin user.

Vertica prevents administration by users other than the dbadmin user (or the user name you specified during the installation process, if not dbadmin). Only this user can run the Administration Tools.
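For example, to place database files under a non-default location, root would create the directories and re-own them to dbadmin on every node before creating the database. A minimal sketch, using a hypothetical /vertica mount point:

# Run as root on each node; /vertica/catalog and /vertica/data are
# hypothetical locations chosen for this example.
mkdir -p /vertica/catalog /vertica/data
chown -R dbadmin:verticadba /vertica/catalog /vertica/data
chmod 755 /vertica/catalog /vertica/data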
See Also

- Installation Overview and Checklist
- Before You Install Vertica
- Platform Requirements and Recommendations
- Enable Secure Shell (SSH) Logins

Before You Install Vertica

Complete all of the tasks in this section before you install Vertica. When you have completed this section, proceed to Installing Vertica.
Platform Requirements and Recommendations

You must verify that your servers meet the platform requirements described in Supported Platforms. The Supported Platforms topics detail supported versions for the following:

- OS for Server and Management Console (MC)
- Supported browsers for MC
- Vertica driver compatibility
- R
- Hadoop
- Various plug-ins

BASH Shell

All shell scripts included with Vertica must run under the BASH shell. On a Debian system, the default shell can be DASH. DASH is not supported. Change the shell for root and for the dbadmin user to BASH with the chsh command. For example:

# getent passwd | grep root
root:x:0:0:root:/root:/bin/dash
# chsh
Changing shell for root.
New shell [/bin/dash]: /bin/bash
Shell changed.

Then, as root, change the symbolic link for /bin/sh from /bin/dash to /bin/bash:

# rm /bin/sh
# ln -s /bin/bash /bin/sh

Log out and back in for the change to take effect.

Install the Latest Vendor-Specific System Software

Install the latest vendor drivers for your hardware. For HPE servers, update to the latest versions of:
- HP ProLiant Smart Array Controller Driver (cciss)
- Smart Array Controller Firmware
- HP Array Configuration Utility (HP ACU CLI)

Data Storage Recommendations

- All internal drives should connect to a single RAID controller.
- The RAID array should form one hardware RAID device as a contiguous /data volume.

Validation Utilities

Vertica provides several validation utilities that validate the performance of prospective hosts. The utilities are installed when you install the Vertica RPM, and you can use them before you run the install_vertica script. See Validation Scripts for more details on running the utilities and verifying that your hosts meet the recommended requirements.
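For example, you can exercise the I/O and CPU utilities against a prospective host before running install_vertica. This is a sketch; the utilities shipped and their options can vary by release (see Validation Scripts for the authoritative usage), and the directory is a placeholder:

# Measure disk I/O against the proposed data directory (placeholder path).
/opt/vertica/bin/vioperf /home/dbadmin

# Measure CPU performance on this host.
/opt/vertica/bin/vcpuperf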
General Hardware and OS Requirements and Recommendations

Hardware Recommendations

The HPE Vertica Analytics Platform is based on a massively parallel processing (MPP), shared-nothing architecture, in which the query processing workload is divided among all nodes of the Vertica database. HPE highly recommends using a homogeneous hardware configuration for your Vertica cluster; that is, each node of the cluster should be similar in CPU, clock speed, number of cores, memory, and operating system version.

Note that HPE has not tested Vertica on clusters made up of nodes with disparate hardware specifications. While a Vertica database is expected to work functionally in a mixed hardware configuration, performance will almost certainly be limited to that of the slowest node in the cluster.

Detailed hardware recommendations are available in the Vertica Hardware Planning Guide.

Platform OS Requirements

Important: Deploy Vertica as the only active process on each host, other than Linux processes or software explicitly approved by Vertica. Vertica cannot be colocated with other software. Remove or disable all non-essential applications from cluster hosts.

You must verify that your servers meet the platform requirements described in Vertica Server and Vertica Management Console.

Verify Sudo

Vertica uses the sudo command during installation and some administrative tasks. Ensure that sudo is available on all hosts with the following command:

# which sudo
/usr/bin/sudo

If sudo is not installed, browse to the Sudo Main Page and install sudo on all hosts.

When you use sudo to install Vertica, the user who performs the installation must have privileges on all nodes in the cluster. Configuring sudo with privileges for the individual commands can be a tedious and error-prone process; thus, the Vertica documentation does not include every possible sudo command that you can include in the sudoers file. Instead, HPE recommends that you temporarily elevate the sudo user to have all privileges for the duration of the install.

Note: See the sudoers and visudo man pages for details on how to write or modify a sudoers file.

To allow root sudo access on all commands as any user on any machine, use visudo as root to edit the /etc/sudoers file and add this line:

## Allow root to run any commands anywhere
root ALL=(ALL) ALL

After the installation completes, remove (or reset) sudo privileges to the pre-installation settings.
Prepare Disk Storage Locations

You must create and specify directories in which to store your catalog and data files (the physical schema). You can specify these locations when you install or configure the database, or later during database operations. Both the catalog and data directories must be owned by the database administrator.

The directory you specify for database catalog files (the catalog path) is used across all nodes in the cluster. For example, if you specify /home/catalog as the catalog directory, Vertica uses that catalog path on all nodes. The catalog directory should always be separate from any data file directories.

Note: Do not use a shared directory for more than one node. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory.

The data path you designate is also used across all nodes in the cluster. For example, if you specify that data should be stored in /home/data, Vertica uses this path on all database nodes. Do not use a single directory to contain both catalog and data files.

You can store the catalog and data directories on different drives, which can be either drives local to the host (recommended for the catalog directory) or a shared storage location, such as an external disk enclosure or a SAN.

Before you specify a catalog or data path, be sure the parent directory exists on all nodes of your database. Creating a database in admintools also creates the catalog and data directories, but the parent directory must exist on each node.

You do not need to specify a disk storage location during installation. However, you can do so by using the --data-dir parameter to the install_vertica script. See Specifying Disk Storage Location During Installation.
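A quick way to confirm that the parent directories exist with the correct ownership on every node is to loop over the hosts with ssh. A sketch, with placeholder host names and paths:

# Placeholder host names and paths; adjust for your cluster.
for h in node01 node02 node03; do
    ssh "$h" 'stat -c "%n %U:%G" /home/catalog /home/data'
done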
See Also

- Specifying Disk Storage Location on MC
- Specifying Disk Storage Location During Database Creation
- Configuring Disk Usage to Optimize Performance
- Using Shared Storage With Vertica

Disk Space Requirements for Vertica

In addition to the actual data stored in the database, Vertica requires disk space for several data reorganization operations, such as mergeout and managing nodes in the cluster. For best results, HPE recommends that disk utilization per node be no more than sixty percent (60%) for a K-Safe=1 database, to allow such operations to proceed.

In addition, disk space is temporarily required by certain query execution operators, such as hash joins and sorts, when they cannot be completed in memory (RAM). Such operators might be encountered during queries, recovery, refreshing projections, and so on. The amount of disk space needed (known as temp space) depends on the nature of the queries, the amount of data on the node, and the number of concurrent users on the system. By default, any unused disk space on the data disk can be used as temp space. However, HPE recommends provisioning temp space separate from data disk space. See Configuring Disk Usage to Optimize Performance.
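To keep an eye on the 60% guideline, you can check utilization of the data volume on each node. A sketch, assuming the data volume is mounted at the placeholder path /data:

# Print percent-used for the data volume (placeholder mount point).
df -h /data | awk 'NR==2 {print $5, "used on", $6}'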
Configuring the Network

This group of steps involves configuring the network. The steps differ depending on your installation scenario. A single-node installation requires little network configuration, because the single instance of the Vertica server does not need to communicate with other nodes in a cluster. For cluster and cloud install scenarios, you must make several decisions regarding your configuration.

Vertica supports server configuration with multiple network interfaces. For example, you might want to use one as a private network interface for internal communication among cluster hosts (the ones supplied via the --hosts option to install_vertica) and a separate one for client connections.

Important: Vertica performs best when all nodes are on the same subnet and have the same broadcast address for one or more interfaces. A cluster that has nodes on more than one subnet can experience lower performance due to the network latency associated with a multi-subnet system at high network utilization levels.

Important Notes

- Network configuration is exactly the same for single nodes as for multi-node clusters, with one special exception. If you install Vertica on a single host machine that is to remain a permanent single-node configuration (such as for development or a proof of concept), you can install Vertica using localhost or the loopback IP (typically 127.0.0.1) as the value for --hosts. Do not use the hostname localhost in a node definition if you are likely to add nodes to the configuration later.
- If you are using a host with multiple network interfaces, configure Vertica to use the address that is assigned to the NIC connected to the other cluster hosts.
- Use a dedicated gigabit switch. If you do not, performance could be severely affected.
- Do not use DHCP dynamically-assigned IP addresses for the private network. Use only static addresses or permanently-leased DHCP addresses.

Optionally Run Spread on a Separate Control Network

If your query workloads are network intensive, you can use the --control-network parameter with the install_vertica script (see Installing Vertica with the install_vertica Script) to allow spread communications to be configured on a subnet that is different from other Vertica data communications.
The --control-network parameter accepts either the default value or a broadcast network IP address (for example, 192.168.10.255).

Configure SSH

- Verify that root can use Secure Shell (SSH) to log in to all hosts that are included in the cluster. SSH is a program for logging in to a remote machine and for running commands on a remote machine.
- If you do not already have SSH installed on all hosts, log in as root on each host and install it before installing Vertica. You can download a free version of the SSH connectivity tools from OpenSSH.
- Make sure that /dev/pts is mounted. Installing Vertica on a host that is missing the mount point /dev/pts could result in the following error when you create a database:

TIMEOUT ERROR: Could not login with SSH. Here is what SSH said:
Last login: Sat Dec 15 18:05:35 2007 from node01

Allow Passwordless SSH Access for the dbadmin User

The dbadmin user must be authorized for passwordless SSH. In typical installs, you won't need to change anything; however, if you set up your system to disallow passwordless login, you'll need to enable it for the dbadmin user. See Enable Secure Shell (SSH) Logins.

Ensure Ports Are Available

Verify that the ports required by Vertica are not in use by running the following command as the root user and comparing its output with the ports listed in Firewall Considerations below:

netstat -atupn

If you are using a Red Hat 7/CentOS 7 system, use the following command instead:

ss -atupn

Firewall Considerations

Vertica requires several ports to be open on the local network. Vertica does not recommend placing a firewall between nodes (all nodes should be behind a firewall), but if you must use a firewall between nodes, ensure the following ports are available:
Port | Protocol | Service | Notes
22 | TCP | sshd | Required by Administration Tools and the Management Console Cluster Installation wizard.
5433 | TCP | Vertica | Vertica client (vsql, ODBC, JDBC, etc.) port.
5434 | TCP | Vertica | Intra- and inter-cluster communication. Vertica opens the Vertica client port +1 (5434 by default) for intra-cluster communication, such as during a plan. If the port +1 from the default client port is not available, then Vertica opens a random port for intra-cluster communication.
5433 | UDP | Vertica | Vertica spread monitoring.
5444 | TCP | Vertica Management Console | MC-to-node and node-to-node (agent) communications port. See Changing MC or Agent Ports.
5450 | TCP | Vertica Management Console | Port used to connect to MC from a web browser; allows communication from nodes to the MC application/web server. See Connecting to Management Console.
4803 | TCP | Spread | Client connections.
4803 | UDP | Spread | Daemon-to-daemon connections.
4804 | UDP | Spread | Daemon-to-daemon connections.
6543 | UDP | Spread | Monitor-to-daemon connection.
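As a quick pre-flight check, the loop below flags any required port that another service has already claimed. A sketch; substitute netstat -atupn for ss on systems without ss:

# Run as root; reports any required port already in use.
for p in 22 5433 5434 5444 5450 4803 4804 6543; do
    ss -atupn | grep -q ":$p " && echo "port $p is in use"
done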
Operating System Configuration Task Overview

This topic provides a high-level overview of the OS settings required for Vertica. Each item provides a link to additional details about the setting and detailed steps for making the configuration change. The installer tests for all of these settings and provides hints, warnings, and failures if the current configuration does not meet Vertica requirements.

Before You Install the Operating System

Configuration | Description
Supported Platforms | Verify that your servers meet the platform requirements described in Vertica 7.2 Supported Platforms. Unsupported operating systems are detected by the installer.
LVM | Linux Logical Volume Manager (LVM) is not supported on partitions that contain Vertica files.
Filesystem | The filesystem for the Vertica data and catalog directories must be formatted as ext3 or ext4.
Swap Space | A 2 GB swap partition is required. Partition the remaining disk space in a single partition under "/".
Disk Block Size | The disk block size for the Vertica data and catalog directories should be 4096 bytes (the default for ext3 and ext4 filesystems).
Memory | For more information on sizing your hardware, see the Vertica Hardware Planning Guide.

Firewall Considerations

Configuration | Description
Firewall/Ports | Firewalls, if present, must be configured so as not to interfere with Vertica.

General Operating System Configuration - Automatically Configured by Installer

These general OS settings are automatically made by the installer if they do not meet Vertica requirements. You can prevent the installer from automatically making these
configuration changes by using the --no-system-configuration parameter for the install_vertica script.

Configuration | Description
Nice Limits | The database administration user must be able to nice processes back to the default level of 0.
min_free_kbytes | The vm.min_free_kbytes setting in /etc/sysctl.conf must be configured sufficiently high. The specific value depends on your hardware configuration.
User Open Files Limit | The open file limit for the dbadmin user should be at least 1 file open per MB of RAM, 65536, or the amount of RAM in MB, whichever is greater.
System Open File Limits | The maximum number of files open on the system must be at least the amount of memory in MB, but not less than 65536.
Pam Limits | /etc/pam.d/su must contain the line: session required pam_limits.so. This allows the conveying of limits to commands run with the su - command.
Address Space Limits | The address space limits (as setting) defined in /etc/security/limits.conf must be unlimited for the database administrator.
File Size Limits | The file size limits (fsize setting) defined in /etc/security/limits.conf must be unlimited for the database administrator.
User Process Limits | The nproc setting defined in /etc/security/limits.conf must be 1024 or the amount of memory in MB, whichever is greater.
Maximum Memory Maps | The vm.max_map_count setting in /etc/sysctl.conf must be 65536 or the amount of memory in KB / 16, whichever is greater.

General Operating System Configuration - Manual Configuration

The following general OS settings must be configured manually.
Configuration | Description
Disk Readahead | The disk readahead must be at least 2048. The specific value depends on your hardware configuration.
NTP Services | The NTP daemon must be enabled and running.
chrony | For Red Hat 7 and CentOS 7 systems, chrony must be enabled and running.
SELinux | SELinux must be disabled or run in permissive mode.
CPU Frequency Scaling | Vertica recommends that you disable CPU frequency scaling. Important: Your systems may use significantly more energy when CPU frequency scaling is disabled.
Transparent Hugepages | For Red Hat 7 and CentOS 7 systems, Transparent Hugepages must be set to always. For all other operating systems, Transparent Hugepages must be disabled or set to madvise.
I/O Scheduler | The I/O scheduler for disks used by Vertica must be set to deadline or noop.
Support Tools | Several optional packages can be installed to assist Vertica support when troubleshooting your system.

System User Requirements

The following tasks pertain to the configuration of the system user required by Vertica.

Configuration | Required Setting(s)
System User Requirements | The installer automatically creates a user with the correct settings. If you specify a user with --dba-user, then the user must conform to the requirements for the Vertica system user.
LANG Environment Settings | The LANG environment variable must be set and valid for the database administration user.
TZ Environment Settings | The TZ environment variable must be set and valid for the database administration user.

Before You Install the Operating System

The topics in this section detail system settings that must be configured when you install the operating system. These settings cannot be easily changed after the operating system is installed.

Supported Platforms

The Vertica installer checks the type of operating system that is installed. If the operating system is not one of the supported operating systems (see Vertica Server and Vertica Management Console), or the operating system cannot be determined, the installer halts. The installer generates one of the following issue identifiers if it detects an unsupported operating system:

- [S0320] - Fedora OS is not supported.
- [S0321] - The version of Red Hat/CentOS is not supported.
- [S0322] - The version of Ubuntu/Debian is not supported.
- [S0323] - The operating system could not be determined and therefore is not supported.

LVM

Warning: Vertica does not support LVM (Logical Volume Manager) on any drive where database (catalog and data) files are stored. The installer reports this issue with the identifier S0170. On Red Hat 7/CentOS 7 systems, you are given four partitioning scheme options: Standard Partition, BTRFS, LVM, and LVM Thin Provisioning. Set the partitioning scheme to Standard Partition.

Filesystem Requirement

Vertica requires that your Linux filesystem be either ext3 or ext4. All other filesystem types are unsupported. The installer reports this issue with the identifier S0160.
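To confirm the filesystem type of a data or catalog volume before installing, you can query the mount table. A sketch with a placeholder path:

# Show the filesystem type of the data volume (placeholder path).
df -T /data
# Alternatively:
findmnt -no FSTYPE /data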
Swap Space Requirements

Vertica requires at least a 2 GB swap partition, regardless of the amount of RAM installed on your system. The installer reports this issue with the identifier S0180. For typical installations, Vertica recommends that you partition your system with a 2 GB primary partition for swap, regardless of the amount of installed RAM. Larger swap space is acceptable, but unnecessary.

Note: Do not place a swap file on a disk containing the Vertica data files. If a host has only two disks (boot and data), put the swap file on the boot disk.

If you do not have at least a 2 GB swap partition, you may experience performance issues when running Vertica. You typically define the swap partition when you install Linux. See your platform's documentation for details on configuring the swap partition.

Disk Block Size Requirements

Vertica recommends that the disk block size be 4096 bytes, which is generally the default on ext3 and ext4 filesystems. The installer reports this issue with the identifier S0165. The disk block size is set when you format your file system. Changing the block size requires a reformat.

Memory Requirements

Vertica requires, at a minimum, 1 GB of RAM per logical processor. The installer reports this issue with the identifier S0190. However, for performance reasons, you typically require more RAM than the minimum. For more information on sizing your hardware, see the Vertica Hardware Planning Guide.
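A quick way to confirm the swap and block-size requirements above is shown in the following sketch; run it as root, and note that the device name is a placeholder:

# Total swap should be at least 2 GB.
swapon --show
# Block size of an ext3/ext4 data volume (placeholder device name).
tune2fs -l /dev/sda1 | grep "Block size"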
Firewall Considerations

Vertica requires multiple ports to be open between nodes. You may use a firewall (iptables) on Red Hat/CentOS and Ubuntu/Debian based systems. Note that firewall use is not supported on SuSE systems; SuSE systems must disable the firewall. The installer reports issues found with your iptables configuration with the identifiers N0010 (for systems that use iptables) and N0011 (for SuSE systems).

The installer checks the iptables configuration and issues a warning if there are any configured rules or chains. The installer does not detect whether the configuration may conflict with Vertica. It is your responsibility to verify that your firewall allows traffic for Vertica as described in Ensure Ports Are Available.

Note: The installer does not check NAT entries in iptables.

You can modify your firewall to allow Vertica network traffic, or you can disable the firewall if your network is secure. Note that firewalls are not supported for Vertica systems running on SuSE.

Red Hat 6 and CentOS 6 Systems

For details on how to configure iptables and allow specific ports to be open, see the platform-specific documentation for your platform:

- Red Hat: https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Security_Guide/sect-Security_Guide-IPTables.html
- CentOS: http://wiki.centos.org/HowTos/Network/IPTables

To disable iptables, run the following commands as root or sudo:

# service iptables save
# service iptables stop
# chkconfig iptables off

To disable iptables if you are using the IPv6 version of iptables, run the following commands as root or sudo:

# service ip6tables save
# service ip6tables stop
# chkconfig ip6tables off

Red Hat 7 and CentOS 7 Systems

To disable the system firewall, run the following commands as root or sudo:

# systemctl mask firewalld
# systemctl disable firewalld
# systemctl stop firewalld

Ubuntu and Debian Systems

For details on how to configure iptables and allow specific ports to be open, see the platform-specific documentation for your platform:

- Debian: https://wiki.debian.org/iptables
- Ubuntu: https://help.ubuntu.com/12.04/serverguide/firewall.html
Note: Ubuntu uses the ufw program to manage iptables.

To disable iptables on Debian, run the following commands as root or sudo:

/etc/init.d/iptables stop
update-rc.d -f iptables remove

To disable iptables on Ubuntu, run the following command:

sudo ufw disable

SuSE Systems

The firewall must be disabled on SuSE systems. To disable the firewall on SuSE systems, run the following command:

/sbin/SuSEfirewall2 off

Port Availability

The install_vertica script checks that required ports are open and available to Vertica. The installer reports any issues with the identifier N0020.

Port Requirements

The following table lists the ports required by Vertica.

Port | Protocol | Service | Notes
22 | TCP | sshd | Required by Administration Tools and the Management Console Cluster Installation wizard.
5433 | TCP | Vertica | Vertica client (vsql, ODBC, JDBC, etc.) port.
5434 | TCP | Vertica | Intra- and inter-cluster communication. Vertica opens the Vertica client port +1 (5434 by default) for intra-cluster communication, such as during a plan. If the port +1 from the default client port is not available, then Vertica opens a random port for intra-cluster communication.
5433 | UDP | Vertica | Vertica spread monitoring.
5444 | TCP | Vertica Management Console | MC-to-node and node-to-node (agent) communications port. See Changing MC or Agent Ports.
5450 | TCP | Vertica Management Console | Port used to connect to MC from a web browser; allows communication from nodes to the MC application/web server. See Connecting to Management Console.
4803 | TCP | Spread | Client connections.
4803 | UDP | Spread | Daemon-to-daemon connections.
4804 | UDP | Spread | Daemon-to-daemon connections.
6543 | UDP | Spread | Monitor-to-daemon connection.

General Operating System Configuration - Automatically Configured by the Installer

These general operating system settings are automatically made by the installer if they do not meet Vertica requirements. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.

sysctl

During installation, Vertica attempts to automatically change various OS-level settings. The installer may not change values on your system if they already exceed the threshold required by the installer. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
To permanently edit certain settings and prevent them from reverting on reboot, use sysctl. The sysctl settings relevant to the installation of Vertica include:

- vm.min_free_kbytes
- fs.file-max
- vm.max_map_count

Permanently Changing Settings with sysctl:

1. As the root user, open the /etc/sysctl.conf file:

   # vi /etc/sysctl.conf

2. Enter a parameter and value:

   parameter = value

   For example, to set the parameter and value for fs.file-max to meet Vertica requirements, enter:

   fs.file-max = 65536

3. Save your changes, and close the /etc/sysctl.conf file.

4. As the root user, reload the config file:

   # sysctl -p

Identifying Settings Added by the Installer

You can see whether the installer has added a setting by opening the /etc/sysctl.conf file:

# vi /etc/sysctl.conf

If the installer has added a setting, a line like the following appears:

# The following 1 line added by Vertica tools. 2015-02-23 13:20:29
parameter = value
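Equivalently, the settings can be appended non-interactively and reloaded in one pass. A sketch only; the values shown are the documented minimums, so substitute the values appropriate for your hardware, and compute vm.min_free_kbytes as described later in this section:

# Run as root; the values below are documented minimums, not tuned values.
cat >> /etc/sysctl.conf <<'EOF'
fs.file-max = 65536
vm.max_map_count = 65536
EOF
sysctl -p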
Nice Limits Configuration

The Vertica system user (dbadmin by default) must be able to raise and lower the priority of Vertica processes. To do this, the nice option in the /etc/security/limits.conf file must include an entry for the dbadmin user. The installer reports this issue with the identifier S0010.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

Note: Vertica never raises priority above the default level of 0. However, Vertica does lower the priority of certain Vertica threads and needs to be able to raise the priority of these threads back up to the default level. This setting allows Vertica to raise the priorities back to the default level.

All Systems

To set the nice limit configuration for the dbadmin user, edit /etc/security/limits.conf and add the following line. Replace dbadmin with the name of your system user.

dbadmin - nice 0

min_free_kbytes Setting

This topic details how to update the min_free_kbytes setting so that it is within the range supported by Vertica. The installer reports this issue with the identifier S0050 if the setting is too low, or S0051 if the setting is too high.

The vm.min_free_kbytes setting configures the page reclaim thresholds. When this number is increased, the system starts reclaiming memory earlier; when it is lowered, the system starts reclaiming memory later. The default min_free_kbytes is calculated at boot time based on the number of pages of physical RAM available on the system. The setting must be the greatest of:

- the default value configured by the system,
- 4096, or
- the value determined by running the command shown in step 2 below.
The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

All Systems

To manually set min_free_kbytes:

1. Determine the current/default setting with the following command:

   /sbin/sysctl vm.min_free_kbytes

2. If the result of the previous command is "No such file or directory" or the default value is less than 4096, then run the commands below:

   memtot=`grep MemTotal /proc/meminfo | awk '{printf "%.0f",$2}'`
   echo "scale=0;sqrt ($memtot*16)" | bc

3. Edit or add the vm.min_free_kbytes entry in /etc/sysctl.conf with the value from the output of the previous command.

   # The min_free_kbytes setting
   vm.min_free_kbytes=5572

4. Run sysctl -p to apply the changes in sysctl.conf immediately.

Note: These steps must be repeated on each node in the cluster.

User Max Open Files Limit

This topic details how to change the user max open files limit setting to meet Vertica requirements. The installer reports this issue with the identifier S0060.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

Vertica requires that the dbadmin user not be limited when opening files. The open file limit should be at least 1 file open per MB of RAM, 65536, or the amount of RAM in MB, whichever is greater. Vertica sets this to the minimum recommended value of 65536 or the amount of RAM in MB.
All Systems

The open file limit can be determined by running ulimit -n as the dbadmin user. For example:

dbadmin@localhost:$ ulimit -n
65536

To manually set the limit, edit /etc/security/limits.conf and edit or add the line for the nofile setting for the user you configured as the database admin (dbadmin by default). The setting must be at least 65536.

dbadmin - nofile 65536

Note: There is also an open file limit on the system. See System Max Open Files Limit.

System Max Open Files Limit

This topic details how to modify the limit for the number of open files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier S0120.

Vertica opens many files. Some platforms have global limits on the number of open files. The open file limit must be set sufficiently high so as not to interfere with database operations. The recommended value is at least the amount of memory in MB, but not less than 65536.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

All Systems

To manually set the open file limit:

1. Run /sbin/sysctl fs.file-max to determine the current limit.

2. If the limit is not 65536 or the amount of system memory in MB (whichever is higher), then edit or add fs.file-max=<max number of files> in /etc/sysctl.conf.
   # Controls the maximum number of open files
   fs.file-max=65536

3. Run sysctl -p to apply the changes in sysctl.conf immediately.

Note: These steps must be repeated on each node in the cluster.

Pam Limits

This topic details how to enable the "su" pam_limits.so module required by Vertica. The installer reports issues with this setting with the identifier S0070.

On some systems, the PAM module pam_limits.so is not set in the file /etc/pam.d/su. When it is not set, it prevents the conveying of limits (such as open file descriptors) to any command started with su -. In particular, the Vertica init script would fail to start Vertica because it calls the Administration Tools to start a database with the su - command. This problem was first noticed on Debian systems, but the configuration could be missing on other Linux distributions. See the pam_limits man page for more details.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

All Systems

To manually configure this setting, append the following line to the /etc/pam.d/su file:

session required pam_limits.so

See the pam_limits man page for more details: man pam_limits.

pid_max Setting

This topic details how to change pid_max to a supported value. Vertica requires that pid_max be at least 524288. The installer reports this issue with the identifier S0111.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All Systems

To change the pid_max value:

# sysctl -w kernel.pid_max=524288

User Address Space Limits

This topic details how to modify the Linux address space limit for the dbadmin user so that it meets Vertica requirements. The address space setting controls the maximum number of threads and processes for each user. If this setting does not meet the requirements, the installer reports the issue with the identifier S0090.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

The address space available to the dbadmin user must not be reduced via user limits and must be set to unlimited.

All Systems

To manually set the address space limit:

1. Run ulimit -v as the dbadmin user to determine the current limit.

2. If the limit is not unlimited, then add the following line to /etc/security/limits.conf. Replace dbadmin with your database admin user.

   dbadmin - as unlimited

User File Size Limit

This topic details how to modify the file size limit for files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier S0100.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

The file size limit for the dbadmin user must not be reduced via user limits and must be set to unlimited.

All Systems

To manually set the file size limit:
1. Run ulimit -f as the dbadmin user to determine the current limit.

2. If the limit is not unlimited, then edit or add the following line in /etc/security/limits.conf. Replace dbadmin with your database admin user.

   dbadmin - fsize unlimited

User Process Limit

This topic details how to change the user process limit so that it meets Vertica requirements. The installer reports this issue with the identifier S0110.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

The user process limit must be high enough to allow for the many threads opened by Vertica. The recommended limit is the amount of RAM in MB, and must be at least 1024.

All Systems

To manually set the user process limit:

1. Run ulimit -u as the dbadmin user to determine the current limit.

2. If the limit is not the amount of memory in MB on the server, then edit or add the following line in /etc/security/limits.conf. Replace 4096 with the amount of system memory, in MB, on the server.

   dbadmin - nproc 4096

Maximum Memory Maps Configuration

This topic details how to modify the limit for the number of memory maps a process can have on your system so that it meets Vertica requirements. The installer reports this issue with the identifier S0130.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

Vertica uses a lot of memory while processing and can approach the default limit for memory maps per process.
Maximum Memory Maps Configuration

This topic details how to modify the limit on the number of memory maps a process can have so that it meets Vertica requirements. The installer reports this issue with the identifier S0130.

The installer automatically configures the correct setting if the default value does not meet system requirements. If there is an issue setting this value, or if you have used the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.

Vertica uses a large amount of memory while processing and can approach the default limit for memory maps per process. The recommended value is at least the amount of memory on the system in KB divided by 16, but not less than 65536.

All Systems

To manually set the memory map limit:

1. Run /sbin/sysctl vm.max_map_count to determine the current limit.
2. If the limit is not 65536 or the amount of system memory in KB divided by 16 (whichever is higher), edit or add the following line to /etc/sysctl.conf. Replace 65536 with the value for your system:

   # The following 1 line added by Vertica tools. 2014-03-07 13:20:31
   vm.max_map_count=65536

3. Run sysctl -p to apply the changes in sysctl.conf immediately.

Note: These steps must be repeated on each node in the cluster.
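The "memory in KB divided by 16, floored at 65536" rule is simple to automate. A sketch that computes the value, applies it immediately, and persists it; run as root on every node:

   #!/bin/bash
   # Compute max(MemTotal_KB / 16, 65536) and apply it now and at boot.
   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
   map_count=$(( mem_kb / 16 ))
   [ "$map_count" -lt 65536 ] && map_count=65536
   echo "vm.max_map_count=$map_count" >> /etc/sysctl.conf
   /sbin/sysctl -w vm.max_map_count=$map_count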
General Operating System Configuration - Manual Configuration

The following general operating system settings must be configured manually.

Manually Configuring Operating System Settings

Vertica requires that you manually configure some general operating system settings. Hewlett Packard Enterprise recommends that you configure these settings in the /etc/rc.local script to prevent them from reverting on reboot. The /etc/rc.local startup script contains scripts and commands that run each time the system is booted.

Note: SUSE systems use the /etc/init.d/after.local file rather than the /etc/rc.local file. For the purposes of using Vertica, the functionality of both files is the same.

Settings to Configure Manually

The /etc/rc.local settings relevant to the installation of Vertica include:

• Disk Readahead
• I/O Scheduling
• Enabling or Disabling Transparent Hugepages

Permanently Changing Settings with /etc/rc.local

1. As the root user, open the /etc/rc.local file:

   # vi /etc/rc.local

2. Enter a script or command. For example, to set the transparent hugepages setting to meet Vertica requirements, enter:

   echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

   Important: On some Ubuntu/Debian systems, the last line in /etc/rc.local must be "exit 0". Any additions to /etc/rc.local must come before "exit 0".

3. Save your changes, and close the /etc/rc.local file.

On the next reboot, the command runs during startup. You can also run the command manually, as the root user, if you want it to take effect immediately.

Disk Readahead

This topic details how to change the disk readahead to a supported value. Vertica requires that disk readahead be set to at least 2048. The installer reports this issue with the identifier S0020.

Note:
• These commands must be executed with root privileges and assume that the blockdev program is in /sbin.
• The blockdev program operates on whole devices, not individual partitions. You cannot set different readahead values for partitions on the same device. If you run blockdev against a partition, for example /dev/sda1, the setting is still applied to the entire /dev/sda device. For instance, running /sbin/blockdev --setra 2048 /dev/sda1 also causes /dev/sda2 through /dev/sdaN to use a readahead value of 2048.

RedHat and SuSE Based Systems

For each drive in the Vertica system, Vertica recommends that you set the readahead value to at least 2048 for most deployments.
Some deployments may require a higher value; the setting can be raised as high as 8192 under the guidance of support.

The first command below immediately changes the readahead value for the specified disk. The second command appends the setting to /etc/rc.local so that it is applied each time the system is booted.

Note: For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

   /sbin/blockdev --setra 2048 /dev/sda
   echo '/sbin/blockdev --setra 2048 /dev/sda' >> /etc/rc.local

Ubuntu and Debian Systems

For each drive in the Vertica system, set the readahead value to 2048. Run the command once in your shell, then add the command to /etc/rc.local so that the setting is applied each time the system is booted. Note that on Ubuntu systems the last line in rc.local must be "exit 0", so you must manually add the following line to /etc/rc.local before the line with exit 0.

Note: For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

   /sbin/blockdev --setra 2048 /dev/sda
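Because the readahead setting is per device, clusters with several data drives usually loop over them. A sketch, assuming your data drives follow the common /dev/sd* naming (RAID devices such as cciss!c0d1 need different handling) and that your system uses /etc/rc.local; run as root:

   #!/bin/bash
   # Apply a 2048-sector readahead to each whole disk and persist it in rc.local.
   # Note: re-running this script appends duplicate lines; edit rc.local manually if needed.
   for dev in /sys/block/sd*; do
       disk="/dev/$(basename "$dev")"
       /sbin/blockdev --setra 2048 "$disk"
       echo "/sbin/blockdev --setra 2048 $disk" >> /etc/rc.local
   done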
Enabling Network Time Protocol (NTP)

Before you can install Vertica, you must enable Network Time Protocol (NTP) on your system for clock synchronization. NTP must be both enabled and active at the time of installation. If it is not, the installer reports this issue with the identifier S0030.

On Red Hat 7 and CentOS 7, ntpd has been deprecated in favor of chrony. To see how to enable chrony, see Enabling chrony for Red Hat 7/CentOS 7 Systems.

Verify That NTP Is Running

The network time protocol (NTP) daemon must be running on all of the hosts in the cluster so that their clocks are synchronized. The spread daemon relies on all of the nodes having synchronized clocks for timing purposes. If your nodes do not have NTP running, the installation can fail with a spread configuration error or other errors.

Note: Different Linux distributions refer to the NTP daemon in different ways. For example, SUSE and Debian/Ubuntu refer to it as ntp, while CentOS and Red Hat refer to it as ntpd. If the following commands produce errors, try using ntp in place of ntpd.

To verify that your hosts are configured to run the NTP daemon on startup, enter the following command:

   $ chkconfig --list ntpd

Debian and Ubuntu do not support chkconfig, but they do offer an optional package that provides equivalent functionality. You can install it with the command sudo apt-get install sysv-rc-conf. To verify that your hosts are configured to run the NTP daemon on startup with the sysv-rc-conf utility, enter the following command:

   $ sysv-rc-conf --list ntpd

The chkconfig command can produce an error similar to ntpd: unknown service. If you get this error, verify that your Linux distribution refers to the NTP daemon as ntpd rather than ntp. If it does not, you need to install the NTP daemon package before you can configure it. Consult your Linux documentation for instructions on how to locate and install packages.

If the NTP daemon is installed, your output should resemble the following:

   ntp 0:off 1:off 2:on 3:on 4:off 5:on 6:off

The output indicates the runlevels where the daemon runs. Verify that the current runlevel of the system (usually 3 or 5) has the NTP daemon set to on. If you do not know the current runlevel, you can find it using the runlevel command:

   $ runlevel
   N 3

Configure NTP for Red Hat 6/CentOS 6 and SUSE

If your system is based on Red Hat 6/CentOS 6 or SUSE, use the service and chkconfig utilities to start NTP and have it start at startup:

   /sbin/service ntpd restart
   /sbin/chkconfig ntpd on

• Red Hat 6/CentOS 6—NTP uses the default time servers at ntp.org. You can change the default NTP servers by editing /etc/ntpd.conf.
• SUSE—By default, no time servers are configured. You must edit /etc/ntpd.conf after the install completes and add time servers.

Configure NTP for Ubuntu and Debian

By default, the NTP daemon is not installed on some Ubuntu and Debian systems. First install NTP, and then start the NTP process. You can change the default NTP servers by editing /etc/ntpd.conf as shown:

   sudo apt-get install ntp
   sudo /etc/init.d/ntp reload

Verify That NTP Is Operating Correctly

To verify that the Network Time Protocol daemon (ntpd) is operating correctly, issue the following command on all nodes in the cluster.

For Red Hat 6/CentOS 6 and SUSE:

   /usr/sbin/ntpq -c rv | grep stratum

For Ubuntu and Debian:

   ntpq -c rv | grep stratum

A stratum level of 16 indicates that NTP is not synchronizing correctly. If a stratum level of 16 is detected, wait 15 minutes and issue the command again; it may take this long for the NTP server to stabilize. If NTP continues to report a stratum level of 16, verify that the NTP port (UDP port 123) is open on all firewalls between the cluster and the remote machine you are attempting to synchronize to.

Red Hat Documentation Related to NTP

The following links were current as of the last publication of the Vertica documentation and could change between releases.

• http://kbase.redhat.com/faq/docs/DOC-6731
• http://kbase.redhat.com/faq/docs/DOC-6902
• http://kbase.redhat.com/faq/docs/DOC-6991
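To spot-check the stratum on every node at once, you can wrap the ntpq command described above in a loop. A sketch, assuming passwordless SSH between nodes and a hypothetical file /tmp/cluster_hosts.txt listing one host per line:

   #!/bin/bash
   # Report the NTP stratum for each node; stratum 16 means not synchronized.
   while read -r host; do
       stratum=$(ssh "$host" '/usr/sbin/ntpq -c rv' | grep -o 'stratum=[0-9]*')
       echo "$host: ${stratum:-no answer}"
   done < /tmp/cluster_hosts.txt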
Enabling chrony for Red Hat 7/CentOS 7 Systems

Before you can install Vertica, you must enable chrony on your system for clock synchronization. This implementation of the Network Time Protocol (NTP) must be both enabled and active at the time of installation. If chrony is not enabled and active at the time of installation, the installer reports this issue with the identifier S0030.

The chrony suite consists of two parts: the daemon, chronyd, which performs clock synchronization, and the command-line utility, chronyc, which configures the settings in chronyd.

Install chrony

chrony is installed by default on some versions of Red Hat/CentOS 7. However, if chrony is not installed on your system, you must download it. To download chrony, run the following command as sudo or root:

   # yum install chrony

Verify That chrony Is Running

To view the status of the chronyd daemon, run the following command:

   $ systemctl status chronyd

If chrony is running, output similar to the following appears:

   chronyd.service - NTP client/server
      Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled)
      Active: active (running) since Mon 2015-07-06 16:29:54 EDT; 15s ago
    Main PID: 2530 (chronyd)
      CGroup: /system.slice/chronyd.service
              └─2530 /usr/sbin/chronyd -u chrony

If chrony is not running, execute the following command as sudo or root. This command also causes chrony to run at boot time:

   # systemctl enable chronyd

Verify That chrony Is Operating Correctly

To verify that the chrony daemon is operating correctly, issue the following command on all nodes in the cluster:

   $ chronyc tracking

Output similar to the following appears:

   Reference ID    : 198.247.63.98 (time01.website.org)
   Stratum         : 3
   Ref time (UTC)  : Thu Jul  9 14:58:01 2015
   System time     : 0.000035685 seconds slow of NTP time
   Last offset     : -0.000151098 seconds
   RMS offset      : 0.000279871 seconds
   Frequency       : 2.085 ppm slow
   Residual freq   : -0.013 ppm
   Skew            : 0.185 ppm
   Root delay      : 0.042370 seconds
   Root dispersion : 0.022658 seconds
   Update interval : 1031.0 seconds
   Leap status     : Normal

A stratum level of 16 indicates that chrony is not synchronizing correctly. If chrony continues to report a stratum level of 16, verify that UDP port 323 is open. This port must be open on all firewalls between the cluster and the remote machine to which you are attempting to synchronize.

Red Hat Documentation Related to chrony

These links to Red Hat documentation were current as of the last publication of the Vertica documentation. Be aware that they could change between releases:

• Configuring NTP Using the chrony Suite
• Using chrony
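The Stratum line of chronyc tracking is easy to test in a script. A minimal sketch that flags an unsynchronized node:

   #!/bin/bash
   # Extract the stratum from chronyc tracking; 16 indicates no synchronization.
   stratum=$(chronyc tracking | awk '/^Stratum/ {print $3}')
   if [ -z "$stratum" ] || [ "$stratum" -ge 16 ]; then
       echo "chrony is not synchronized (stratum ${stratum:-unknown}); check UDP port 323"
   fi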
SELinux Configuration

Vertica does not support SELinux except when SELinux is running in permissive mode. If the installer detects that SELinux is installed and the mode cannot be determined, it reports this issue with the identifier S0080. If the mode can be determined and the mode is not permissive, the issue is reported with the identifier S0081.

Red Hat and SUSE Systems

You can either disable SELinux or change it to use permissive mode.

To disable SELinux:

1. Edit /etc/selinux/config and change the SELINUX setting to disabled (SELINUX=disabled). This disables SELinux at boot time.
2. As root/sudo, type setenforce 0 to disable SELinux immediately.

To change SELinux to use permissive mode:

1. Edit /etc/selinux/config and change the SELINUX setting to permissive (SELINUX=permissive).
2. As root/sudo, type setenforce Permissive to switch to permissive mode immediately.

Ubuntu and Debian Systems

You can either disable SELinux or change it to use permissive mode.

To disable SELinux:

1. Edit /selinux/config and change the SELINUX setting to disabled (SELINUX=disabled). This disables SELinux at boot time.
2. As root/sudo, type setenforce 0 to disable SELinux immediately.

To change SELinux to use permissive mode:

1. Edit /selinux/config and change the SELINUX setting to permissive (SELINUX=permissive).
2. As root/sudo, type setenforce Permissive to switch to permissive mode immediately.
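If you script your pre-install checks, the same change can be made conditionally. A sketch that switches an enforcing system to permissive mode, run as root; the sed pattern assumes the stock SELINUX=enforcing line in the config file:

   #!/bin/bash
   # Switch SELinux to permissive mode immediately and at boot.
   if command -v getenforce >/dev/null 2>&1 && [ "$(getenforce)" = "Enforcing" ]; then
       setenforce 0
       sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
   fi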
CPU Frequency Scaling

This topic details the CPU frequency scaling methods supported by Vertica. In general, if you do not require CPU frequency scaling, disable it so that it does not impact system performance.

Important: Your systems may use significantly more energy when frequency scaling is disabled.

The installer allows CPU frequency scaling to be enabled when the cpufreq scaling governor is set to performance. If the CPU scaling governor is set to ondemand and ignore_nice_load is 1 (true), the installer fails with the error S0140. If the CPU scaling governor is set to ondemand and ignore_nice_load is 0 (false), the installer warns with the identifier S0141.

CPU frequency scaling is a hardware and software feature that helps computers conserve energy by slowing the processor when the system load is low and speeding it up again when the system load increases. This feature can impact system performance, because raising the CPU frequency in response to higher system load does not occur instantly. Always disable this feature on the Vertica database hosts to prevent it from interfering with performance.

You disable CPU frequency scaling in your host's system BIOS. There may be multiple settings in the BIOS that you need to adjust in order to completely disable CPU frequency scaling. Consult your host hardware's documentation for details on entering the system BIOS and disabling CPU frequency scaling.

If you cannot disable CPU scaling through the system BIOS, you can limit its impact by disabling the scaling through the Linux kernel or by setting the CPU frequency governor to always run the CPU at full speed.

Caution: This method is not reliable, as some hardware platforms may ignore the kernel settings. The only reliable method is to disable CPU scaling in the BIOS.

The method you use to disable frequency scaling depends on the CPU scaling method being used in the Linux kernel. See your Linux distribution's documentation for instructions on disabling scaling in the kernel or changing the CPU governor.
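When the BIOS route is unavailable, the governor can be set from the shell through the standard cpufreq sysfs interface. A sketch, run as root; the path exists only on systems where frequency scaling is active:

   #!/bin/bash
   # Force every CPU to the performance governor so the clock stays at full speed.
   for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
       [ -f "$gov" ] && echo performance > "$gov"
   done

Remember the caution above: some platforms ignore these kernel settings, so verify the result with your hardware tools.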
Enabling or Disabling Transparent Hugepages

You can modify transparent hugepages so that the configuration meets Vertica requirements.

• For Red Hat 7/CentOS 7 systems, you must enable transparent hugepages. The installer reports this issue with the identifier S0312.
• For all other systems, you must disable transparent hugepages or set them to madvise. The installer reports this issue with the identifier S0310.

Disable Transparent Hugepages on Red Hat 6/CentOS 6 Systems

Important: If you are using Red Hat 7/CentOS 7, you must enable, rather than disable, transparent hugepages. See Enable Transparent Hugepages on Red Hat 7/CentOS 7 Systems.

Determine whether transparent hugepages is enabled by running the following command:

   cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
   [always] madvise never

The setting returned in brackets is your current setting. If you are not using madvise or never as your transparent hugepage setting, you can disable transparent hugepages in one of two ways:

• Edit your boot loader configuration (for example, /etc/grub.conf). Typically, you add the following to the end of the kernel line. However, consult the documentation for your system before editing your boot loader configuration.

   transparent_hugepage=never

• Edit /etc/rc.local and add the following script:

   if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
      echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
   fi

  For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

Regardless of which approach you choose, you must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:

   echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

Enable Transparent Hugepages on Red Hat 7/CentOS 7 Systems

Determine whether transparent hugepages is enabled by running the following command:

   cat /sys/kernel/mm/transparent_hugepage/enabled
   [always] madvise never

The setting returned in brackets is your current setting.

For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

You can enable transparent hugepages by editing /etc/rc.local and adding the following script:

   if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
      echo always > /sys/kernel/mm/transparent_hugepage/enabled
   fi

You must reboot your system for the setting to take effect, or, as root, run the following echo line to proceed with the install without rebooting:
   # echo always > /sys/kernel/mm/transparent_hugepage/enabled

Disable Transparent Hugepages on Other Systems

Note: SuSE did not offer transparent hugepage support in its initial 11.0 release. However, subsequent SuSE service packs do include support for transparent hugepages.

To determine whether transparent hugepages is enabled, run the following command:

   cat /sys/kernel/mm/transparent_hugepage/enabled
   [always] madvise never

The setting returned in brackets is your current setting. Depending on your platform OS, the madvise setting may not be displayed.

You can disable transparent hugepages in one of two ways:

• Edit your boot loader configuration (for example, /etc/grub.conf). Typically, you add the following to the end of the kernel line. However, consult the documentation for your system before editing your boot loader configuration.

   transparent_hugepage=never

• Edit /etc/rc.local (on systems that support rc.local) and add the following script:

   if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
      echo never > /sys/kernel/mm/transparent_hugepage/enabled
   fi

  For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

Regardless of which approach you choose, you must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:

   echo never > /sys/kernel/mm/transparent_hugepage/enabled
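Because the sysfs path and the required value both differ by platform, a single script can select the right combination. A sketch, run as root, grounded in the rules above (never on Red Hat 6-style systems and other platforms, always on Red Hat 7/CentOS 7):

   #!/bin/bash
   # Pick the THP control file and the value Vertica requires for this platform.
   if [ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ]; then
       echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled    # RH6/CentOS6
   elif grep -qs 'release 7' /etc/redhat-release; then
       echo always > /sys/kernel/mm/transparent_hugepage/enabled          # RH7/CentOS7
   else
       echo never > /sys/kernel/mm/transparent_hugepage/enabled           # other systems
   fi

Add the same logic to /etc/rc.local (or after.local on SuSE) so it survives reboots.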
Disabling Defrag for Red Hat and CentOS Systems

On all Red Hat and CentOS systems, you must disable the defrag utility to meet Vertica configuration requirements. The steps necessary to disable defrag on Red Hat 6/CentOS 6 systems differ from those used on Red Hat 7/CentOS 7 systems.

Disable Defrag on Red Hat 6/CentOS 6 Systems

1. Determine if defrag is enabled by running the following command:

   cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
   [always] madvise never

   The setting returned in brackets is your current setting. If you are not using madvise or never as your defrag setting, you must disable defrag.

2. Edit /etc/rc.local and add the following script:

   if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
      echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
   fi

   You must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:

   # echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

Disable Defrag on Red Hat 7/CentOS 7 Systems

1. Determine if defrag is enabled by running the following command:

   cat /sys/kernel/mm/transparent_hugepage/defrag
   [always] madvise never

   The setting returned in brackets is your current setting. If you are not using madvise or never as your defrag setting, you must disable defrag.

2. Edit /etc/rc.local and add the following script:

   if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
      echo never > /sys/kernel/mm/transparent_hugepage/defrag
   fi

   You must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:

   # echo never > /sys/kernel/mm/transparent_hugepage/defrag
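Both defrag control files can be handled with one loop, since only one of the two paths exists on any given system. A sketch, run as root:

   #!/bin/bash
   # Disable THP defrag on whichever path this platform provides.
   for f in /sys/kernel/mm/redhat_transparent_hugepage/defrag \
            /sys/kernel/mm/transparent_hugepage/defrag; do
       [ -f "$f" ] && echo never > "$f"
   done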
I/O Scheduling

This topic details how to change I/O scheduling to a supported scheduler. Vertica requires that I/O scheduling be set to deadline or noop. If the installer detects that the system is using an unsupported scheduler, it reports this issue with the identifier S0150. If the installer cannot detect the type of scheduler that the system uses (typically when the system uses a RAID array), it reports the issue with the identifier S0151.

If your system is not using a RAID array, complete the steps below to change your I/O scheduler to a supported scheduler. If you are using a RAID array, consult your RAID vendor's documentation for the best performing scheduler for your hardware.

Configure the I/O Scheduler

The Linux kernel can use several different I/O schedulers to prioritize disk input and output. Most Linux distributions use the Completely Fair Queuing (CFQ) scheme by default, which gives input and output requests equal priority. This scheduler is efficient on systems running multiple tasks that need equal access to I/O resources. However, it can create a bottleneck when used on the Vertica drives containing the catalog and data directories, because it gives write requests equal priority to read requests, and its per-process I/O queues can penalize processes making more requests than other processes.

Instead of the CFQ scheduler, configure your hosts to use either the Deadline or NOOP I/O scheduler for the drives containing the catalog and data directories:

• The Deadline scheduler gives priority to read requests over write requests. It also imposes a deadline on all requests; after reaching the deadline, such requests gain priority over all other requests. This scheduling method helps prevent processes from becoming starved for I/O access. The Deadline scheduler is best used on physical media drives (disks using spinning platters), because it attempts to group requests for adjacent sectors on a disk, lowering the time the drive spends seeking.

• The NOOP scheduler uses a simple FIFO approach, placing all input and output requests into a single queue. This scheduler is best used on solid state drives (SSDs). Because SSDs do not have a physical read head, no performance penalty exists when accessing non-adjacent sectors.

Failure to use one of these schedulers for the Vertica drives containing the catalog and data directories can result in slower database performance.
Other drives on the system (such as the drive containing swap space, log files, or the Linux system files) can still use the default CFQ scheduler, although you should always use the NOOP scheduler for SSDs.

There are two ways to set the scheduler used by your disk devices:

1. Write the name of the scheduler to a file in the /sys directory.
2. Use a kernel boot parameter.

Configure the I/O Scheduler - Changing the Scheduler Through the /sys Directory

You can view and change the scheduler Linux uses for I/O requests to a single drive using a virtual file under the /sys directory. The name of the file that controls the scheduler a block device uses is:

   /sys/block/deviceName/queue/scheduler

where deviceName is the name of the disk device, such as sda or cciss!c0d1 (the first disk on an HPE RAID array). Viewing the contents of this file shows all of the possible settings for the scheduler, with the currently selected scheduler surrounded by square brackets:

   # cat /sys/block/sda/queue/scheduler
   noop deadline [cfq]

To change the scheduler, write the name of the scheduler you want the device to use to its scheduler file. You must have root privileges to write to this file. For example, to set the sda drive to use the deadline scheduler, run the following command as root:

   # echo deadline > /sys/block/sda/queue/scheduler
   # cat /sys/block/sda/queue/scheduler
   noop [deadline] cfq

Changing the scheduler immediately affects the I/O requests for the device. The Linux kernel starts using the new scheduler for all of the drive's input and output requests.

Note: While tests have shown no problems are caused by changing the scheduler settings while Vertica is running, you should strongly consider shutting down any running Vertica database before changing the I/O scheduler or making any other changes to the system configuration.
Changes to the I/O scheduler made through the /sys directory last only until the system is rebooted, so you need to add the commands that change the I/O scheduler to a startup script (such as those stored in /etc/init.d, or through a command in /etc/rc.local). You also need a separate command for each drive on the system whose scheduler you want to change. For example, the following commands make the configuration take effect immediately and add it to rc.local so it is used on subsequent reboots:

   echo deadline > /sys/block/sda/queue/scheduler
   echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local

Note: For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example, SuSE uses /etc/init.d/after.local.

Note: On some Ubuntu/Debian systems, the last line in rc.local must be "exit 0", so you must manually add the preceding line to /etc/rc.local before the line with exit 0.

You may prefer this method of setting the I/O scheduler over using a boot parameter if your system has a mix of solid-state and physical media drives, or has many drives that do not store Vertica catalog and data directories.

Configure the I/O Scheduler - Changing the Scheduler with a Boot Parameter

Use the elevator kernel boot parameter to change the default scheduler used by all disks on your system. This is the best method to use if most or all of the drives on your hosts are of the same type (physical media or SSD) and will contain catalog or data files. You can also use the boot parameter to change the default to the scheduler that the majority of the drives on the system need, then use the /sys files to change individual drives to another I/O scheduler. The format of the elevator boot parameter is:

   elevator=schedulerName

where schedulerName is deadline, noop, or cfq. You set the boot parameter using your boot loader (grub or grub2 on most recent Linux distributions). See your distribution's documentation for details on how to add a kernel boot parameter.
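Combining the two rules above (deadline for spinning disks, noop for SSDs), you can choose a scheduler per drive automatically. A sketch, run as root; it relies on the sysfs rotational flag, which most kernels expose but which may be wrong for some RAID controllers:

   #!/bin/bash
   # Set deadline on rotational drives and noop on SSDs for every sd* device.
   for dev in /sys/block/sd*; do
       if [ "$(cat "$dev/queue/rotational")" = "0" ]; then
           echo noop > "$dev/queue/scheduler"        # SSD
       else
           echo deadline > "$dev/queue/scheduler"    # spinning platter
       fi
   done

As noted above, these settings do not survive a reboot, so also record your choices in /etc/rc.local or as a boot parameter.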
Support Tools

Vertica suggests that the following tools be installed so that support can assist in troubleshooting your system if any issues arise:

• pstack (or gstack) package. Identified by issue S0040 when not installed.
  On Red Hat 7 and CentOS 7 systems, the pstack package is installed as part of the gdb package.
• mcelog package. Identified by issue S0041 when not installed.
• sysstat package. Identified by issue S0045 when not installed.

Red Hat 6 and CentOS 6 Systems

To install the required tools on Red Hat 6 and CentOS 6 systems, run the following commands as sudo or root:

   yum install pstack
   yum install mcelog
   yum install sysstat

Red Hat 7 and CentOS 7 Systems

To install the required tools on Red Hat 7/CentOS 7 systems, run the following commands as sudo or root:

   yum install gdb
   yum install mcelog
   yum install sysstat

Ubuntu and Debian Systems

To install the required tools on Ubuntu and Debian systems, run the following commands as sudo or root:

   apt-get install pstack
   apt-get install mcelog
   apt-get install sysstat

SuSE Systems

To install the required tools on SuSE systems, run the following commands as sudo or root:

   zypper install sysstat
   zypper install mcelog

There is no individual SuSE package for pstack/gstack. However, the gdb package contains gstack, so you can optionally install gdb instead, or build pstack/gstack from source. To install the gdb package:

   zypper install gdb
System User Configuration

The following tasks pertain to the configuration of the system user required by Vertica.

System User Requirements

Vertica has specific requirements for the system user that runs and manages Vertica. If you specify a user during install but the user does not exist, the installer reports this issue with the identifier S0200.

System User Requirement Details

Vertica requires a system user to own database files and run database processes and administration scripts. By default, the install script automatically configures and creates this user for you with the username dbadmin. See About Linux Users Created by Vertica and Their Privileges for details on the default user created by the install script. If you decide to manually create your own system user, you must create the user before you run the install script. If you manually create the user:

Note: Instances of dbadmin and verticadba are placeholders for the names you choose if you do not use the default values.

• The user must have the same username and password on all nodes.
• The user must use the BASH shell as the user's default shell. If not, the installer reports this issue with the identifier [S0240].
• The user must be in the verticadba group (for example: usermod -a -G verticadba userNameHere). If not, the installer reports this issue with the identifier [S0220].

  Note: You must create a verticadba group on all nodes. If you do not, the installer reports the issue with the identifier [S0210].

• The user's login group must be either verticadba or a group with the same name as the user (for example, the home group for dbadmin is dbadmin). You can check the groups for a user with the id command (for example: id dbadmin); the "gid" group is the user's primary group. If this is not configured correctly, the installer reports this issue with the identifier [S0230]. Vertica recommends that you use verticadba as the user's primary login group (for example: usermod -g verticadba userNameHere). If the user's primary group is not verticadba as suggested, the installer reports this with HINT [S0231].
• The user must have a home directory. If not, the installer reports this issue with the identifier [S0260].
• The user's home directory must be owned by the user. If not, the installer reports the issue with the identifier [S0270].
• The system must be aware of the user's home directory (you can set it with the usermod command: usermod -m -d /path/to/new/home/dir userNameHere). If this is not configured correctly, the installer reports the issue with [S0250].
• The user's home directory must be owned by the dbadmin's primary group (use the chown and chgrp commands if necessary). If this is not configured correctly, the installer reports the issue with the identifier [S0280].
• The user's home directory should have secure permissions. Specifically, it should not be writable by others or by the group. Ideally the group permissions, when viewed with ls, should be "---" (nothing) or "r-x" (read and execute). If this is not configured as suggested, the installer reports this with HINT [S0290].
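Tying these requirements together, the following sketch creates a user that satisfies the checks above. The names dbadmin and verticadba are the defaults; run as root on every node, and use the same password on each:

   #!/bin/bash
   # Create the DBA group and user with a bash shell, a home directory,
   # verticadba as the primary group, and non-writable group permissions.
   groupadd verticadba
   useradd -m -s /bin/bash -g verticadba dbadmin
   chown dbadmin:verticadba /home/dbadmin
   chmod 750 /home/dbadmin    # group gets r-x, others get nothing
   passwd dbadmin             # prompts for the password interactively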
TZ Environment Variable

This topic details how to set or change the TZ environment variable and update your tzdata package. If this variable is not set, the installer reports this issue with the identifier S0305.

Before installing Vertica, update the tzdata package for your system and set the default time zone for your database administrator account by specifying the TZ environment variable. If your database administrator account is created by the install_vertica script, set the TZ variable after you have installed Vertica.

Update Tzdata Package

The tzdata package is a public-domain time zone database that is pre-installed on most Linux systems. The tzdata package is updated periodically for time zone changes across the world. HPE recommends that you update to the latest tzdata package before installing or updating Vertica.

Update your tzdata package with the following command:

• For Red Hat based systems:

   yum update tzdata

• For Debian and Ubuntu systems:

   apt-get install tzdata

Setting the Default Time Zone

When a client receives the result set of a SQL query, all rows contain data adjusted, if necessary, to the same time zone. That time zone is the default time zone of the initiator node unless the client explicitly overrides it using the SQL SET TIME ZONE command described in the SQL Reference Manual. The default time zone of any node is controlled by the TZ environment variable. If TZ is undefined, the operating system time zone is used.

Important: The TZ variable must be set to the same value on all nodes in the cluster.

If your operating system time zone is not set to the desired time zone of the database, make sure that the Linux environment variable TZ is set to the desired value on all cluster hosts. The installer returns a warning if the TZ variable is not set. If your operating system time zone is appropriate for your database, the operating system time zone is used and the warning can be safely ignored.

Setting the Time Zone on a Host

Important: If you explicitly set the TZ environment variable at a command line before you start the Administration Tools, the current setting will not take effect. The Administration Tools uses SSH to start copies on the other nodes, so each time SSH is used, the TZ variable for the startup command is reset. TZ must be set in the .profile or .bashrc files on all nodes in the cluster to take effect properly.

You can set the time zone several different ways, depending on the Linux distribution or the system administrator's preferences.

• To set the system time zone on Red Hat and SUSE Linux systems, edit:

   /etc/sysconfig/clock

• To set the TZ variable, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile and add the following line (for example, for the US Eastern Time Zone):
   export TZ="America/New_York"

For details on which time zone names are recognized by Vertica, see the appendix Using Time Zones With Vertica.

LANG Environment Variable Settings

This topic details how to set or change the LANG environment variable. The LANG environment variable controls the locale of the host. If this variable is not set, the installer reports this issue with the identifier S0300. If this variable is not set to a valid value, the installer reports this issue with the identifier S0301.

Set the Host Locale

Each host has a system setting for the Linux environment variable LANG. LANG determines the locale category for the native language, local customs, and coded character set in the absence of the LC_ALL and other LC_ environment variables. LANG can be used by applications to determine which language to use for error messages and instructions, collating sequences, date formats, and so forth.

To change the LANG setting for the database administrator, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile on all cluster hosts and set the environment variable; for example:

   export LANG=en_US.UTF-8

The LANG setting controls the following in Vertica:

• OS-level errors and warnings, for example, "file not found" during COPY operations.
• Some formatting functions, such as TO_CHAR and TO_NUMBER. See also Template Patterns for Numeric Formatting.

The LANG setting does not control the following:

• Vertica-specific error and warning messages. These are always in English at this time.
• Collation of results returned by SQL issued to Vertica. This must be done using a database parameter instead. See the Implement Locales for International Data Sets section in the Administrator's Guide for details.

Note: If the LC_ALL environment variable is set, it supersedes the setting of LANG.
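Since both TZ and LANG must persist across SSH sessions, it is common to append them to the administrator's .bashrc in one step. A sketch, assuming the default /home/dbadmin home directory and example values; substitute your own time zone and locale:

   #!/bin/bash
   # Persist TZ and LANG for the dbadmin account (run once per node).
   echo 'export TZ="America/New_York"' >> /home/dbadmin/.bashrc
   echo 'export LANG=en_US.UTF-8'      >> /home/dbadmin/.bashrc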
Package Dependencies

For a successful Vertica installation, you must first install three packages on all nodes in your cluster before installing the database platform. The required packages are:

• openssh—Required for Administration Tools connectivity between nodes.
• which—Required for Vertica operating system integration and for validating installations.
• dialog—Required for interactivity with Administration Tools.

Installing the Required Packages

The procedure you follow to install the required packages depends on the operating system on which your node or cluster is running. See your operating system's documentation for detailed information on installing packages.

• For CentOS/Red Hat Systems—Typically, you manage packages on Red Hat and CentOS systems using the yum utility. Run the following yum commands to install each of the package dependencies. The yum utility guides you through the installation:

   # yum install openssh
   # yum install which
   # yum install dialog

• For Debian/Ubuntu Systems—Typically, you use the apt-get utility to manage packages on Debian and Ubuntu systems. Run the following apt-get commands to install each of the package dependencies. The apt-get utility guides you through the installation:

   # apt-get install openssh
   # apt-get install which
   # apt-get install dialog
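If you manage mixed clusters, a small wrapper can pick the right package manager. A sketch; the package names follow the documentation above, and the -y flag simply pre-confirms the prompts:

   #!/bin/bash
   # Install the three required packages with whichever manager is present.
   if command -v yum >/dev/null 2>&1; then
       yum install -y openssh which dialog
   elif command -v apt-get >/dev/null 2>&1; then
       apt-get install -y openssh which dialog
   fi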
Installing Vertica

There are different paths you can take when installing Vertica. You can:

• Install Vertica on one or more hosts using the command line, without using the Management Console.
• Install the Management Console, and from the Management Console install Vertica on one or more hosts by using the Management Console cluster creation wizard.
• Install Vertica on one or more hosts using the command line, then install the Management Console and import the cluster to be managed.

Installing Using the Command Line

Although HPE supports installation on one node, two nodes, and multiple nodes, this section describes how to install the Vertica software on a cluster of nodes. It assumes that you have already performed the tasks in Before You Install Vertica and that you have a Vertica license key.

To install Vertica, complete the following tasks:

1. Download and install the Vertica server package.
2. Installing Vertica with the install_vertica Script.

Special Notes

• Downgrade installations are not supported.
• Be sure that you download the RPM for the correct operating system and architecture.
• Vertica supports two-node clusters with zero fault tolerance (K=0 safety). This means that you can add a node to a single-node cluster, as long as the installation node (the node upon which you build) is not the loopback node (localhost/127.0.0.1).
• The Version 7.0 installer introduced platform verification tests that prevent the install from continuing if the platform requirements are not met by your system. Manually verify that your system meets the requirements in Before You Install Vertica.
These tests ensure that your platform meets the hardware and software requirements for Vertica. Previous versions documented these requirements, but the installer did not verify all of the settings. If this is a fresh install, you can simply run the installer and view the list of failures and warnings to determine which configuration changes you must make.

Back Up Existing Databases

If you are doing an upgrade installation, back up the following for all existing databases:

• The catalog and data directories, using the Vertica backup utility. See Backing Up and Restoring the Database in the Administrator's Guide.
• /opt/vertica/, using manual methods. For example:

  a. Enter the command:

     tar -czvf /tmp/vertica.tgz /opt/vertica

  b. Copy the tar file to a backup location.

Backing Up MC

Before you upgrade MC, HPE recommends that you back up your MC metadata (configuration and user settings). Use a storage location external to the server on which you installed MC.

1. On the target server (where you want to store MC metadata), log in as root or a user with sudo privileges.
2. Create a backup directory, as in the following example:

   # mkdir /backups/mc/mc-backup-20130425

3. Copy the /opt/vconsole directory to the new backup folder:

   # cp -r /opt/vconsole /backups/mc/mc-backup-20130425

After you have completed the backup tasks, proceed to Upgrading Vertica to a New Version.
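The two backup tasks can be combined into a dated snapshot. A sketch, run as root; /backups/mc is the example location used above, and the tar file should be copied off the node afterward:

   #!/bin/bash
   # Snapshot /opt/vertica and the MC metadata with a date-stamped name.
   stamp=$(date +%Y%m%d)
   tar -czvf "/tmp/vertica-$stamp.tgz" /opt/vertica
   mkdir -p "/backups/mc/mc-backup-$stamp"
   cp -r /opt/vconsole "/backups/mc/mc-backup-$stamp"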
Download and Install the Vertica Server Package

To download and install the Vertica server package:

1. Use a Web browser to log in to the myVertica portal.
2. Click the Download tab and download the Vertica server package to the Administration Host. Be sure the package you download matches the operating system and the machine architecture on which you intend to install it. In the event of a node failure, you can use any other node to run the Administration Tools later.
3. If you installed a previous version of Vertica on any of the hosts in the cluster, use the Administration Tools to shut down any running database. The database must stop normally; you cannot upgrade a database that requires recovery.
4. If you are using sudo, skip to the next step. If you are root, log in to the Administration Host as root (or log in as another user and switch to root):

   $ su - root
   password: root-password
   #

   Caution: When installing Vertica using an existing user as the dba, you must exit all UNIX terminal sessions for that user after setup completes and log in again to ensure that group privileges are applied correctly. After Vertica is installed, you no longer need root privileges. To verify sudo, see General Hardware and OS Requirements and Recommendations.

5. Use one of the following commands to run the RPM package installer:

   • If you are root and installing an RPM:

      # rpm -Uvh pathname

   • If you are using sudo and installing an RPM:
      $ sudo rpm -Uvh pathname

   • If you are using Debian, replace rpm -Uvh with dpkg -i.

   where pathname is the Vertica package file you downloaded.

   Note: If the package installer reports multiple dependency problems, or you receive the error "ERROR: You're attempting to install the wrong RPM for this operating system", then you are trying to install the wrong Vertica server package. Make sure that the machine architecture (32-bit or 64-bit) of the package you downloaded matches the operating system.

Installing Vertica with the install_vertica Script

About the Installation Script

You run the install script after you have installed the Vertica package. The install script runs on a single node, in a Bash shell, and copies the Vertica package to all other hosts (identified by the --hosts argument) in your planned cluster.

The install script runs several tests on each of the target hosts to verify that the hosts meet the system and performance requirements for a Vertica node. The install script modifies some operating system configuration settings to meet these requirements. Other settings cannot be modified by the install script and must be manually reconfigured.

The installation script takes the following basic parameters:

• A list of hosts on which to install.
• Optionally, the Vertica RPM/DEB path and package file name, if you have not pre-installed the server package on the other potential hosts in the cluster.
• Optionally, a system user name. If you do not provide a user name, the install script creates a new system user named dbadmin. If you do provide a user name and the user does not exist on the system, the install script creates that user.

For example:

   # /opt/vertica/sbin/install_vertica --hosts node01,node02,node03 \
     --rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm --dba-user mydba
Note: The install script sets up passwordless SSH for the administrator user across all the hosts. If passwordless SSH is already set up, the install script verifies that it is functioning correctly.

To Perform a Basic Install of Vertica:

1. As root (or using sudo), run the install script. The script must be run in a BASH shell as root or as a user with sudo privileges. There are many options you can configure when running the install script; see install_vertica Options below for the complete list. If the installer fails because a requirement is not met, you can correct the issue and then re-run the installer with the same command line options.

   To perform a basic install as root:

      # /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name --dba-user dba_username

   Using sudo:

      $ sudo /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name --dba-user dba_username

Basic Installation Parameters

--hosts host_list
   A comma-separated list of IP addresses to include in the cluster; do not include space characters in the list.
   Examples:
      --hosts 127.0.0.1
      --hosts 192.168.233.101,192.168.233.102,192.168.233.103
   Note: Vertica stores only IP addresses in its configuration files. You can provide a hostname to the --hosts parameter, but it is immediately converted to an IP address when the script is run.
--rpm package_name
--deb package_name
   The path and name of the Vertica RPM package.
   Example:
      --rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm
   For Debian and Ubuntu installs, provide the name of the Debian package, for example:
      --deb /tmp/vertica_7.2.x86.deb

--dba-user dba_username
   The name of the Database Administrator system account to create. Only this account can run the Administration Tools. If you omit the --dba-user parameter, the default database administrator account name is dbadmin.
   This parameter is optional for new installations done as root but must be specified when upgrading or when installing using sudo. If upgrading, use the -u parameter to specify the same DBA account name that you used previously. If installing using sudo, the user must already exist.
   Note: If you manually create the user, modify the user's .bashrc file to include the line PATH=/opt/vertica/bin:$PATH so that Vertica tools such as vsql and admintools can be easily started by the dbadmin user.

2. When prompted for a password to log in to the other nodes, provide the requested password. This allows the installation of the package and system configuration on the other cluster nodes. If you are root, this is the root password. If you are using sudo, this is the sudo user password. The password does not echo on the command line.
   For example:

      Vertica Database 7.0 Installation Tool
      Please enter password for root@host01:password

3. If the dbadmin user, or the user specified in the --dba-user argument, does not exist, the install script prompts for that user's password. Provide the password. For example:

      Enter password for new UNIX user dbadmin:password
      Retype new UNIX password for user dbadmin:password

4. Carefully examine any warnings or failures returned by install_vertica and correct the problems. For example, insufficient RAM, insufficient network throughput, and too-high readahead settings on the file system could cause performance problems later on. Additionally, LANG warnings, if not resolved, can cause database startup to fail and issues with vsql; the system LANG attributes must be UTF-8 compatible. Once you fix the problems, re-run the install script.

5. Once installation is successful, disconnect from the Administration Host, as instructed by the script; then complete the required post-installation steps. At this point, root privileges are no longer needed and the database administrator can perform any remaining steps.

To Complete Required Post-install Steps:

1. Log in to the Database Administrator account on the administration host.
2. Install the License Key.
3. Accept the EULA.
4. If you have not already done so, proceed to Getting Started. Otherwise, proceed to Configuring the Database in the Administrator's Guide.
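Putting the pieces together, a hypothetical end-to-end invocation might combine several of the options documented below; the host addresses, package path, and data directory here are placeholders:

   $ sudo /opt/vertica/sbin/install_vertica \
       --hosts 192.168.233.101,192.168.233.102,192.168.233.103 \
       --rpm /tmp/vertica_7.2.x.x86_64.RHEL6.rpm \
       --dba-user dbadmin \
       --data-dir /data/vertica \
       --license CE --accept-eula

This installs the Community Edition license on all three hosts and accepts the EULA non-interactively, as described under the --license option.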
install_vertica Options

The following list details all of the options available to the install_vertica script. Most options have a long and a short form; for example, --hosts is interchangeable with -s. The only required options are --hosts/-s and --rpm/--deb/-r.

--help
   Display help for this script.

--hosts host_list, -s host_list
   A comma-separated list of host names or IP addresses to include in the cluster. Do not include spaces in the list. The IP addresses or host names must be for unique hosts; you cannot list the same host using multiple IP addresses or host names.
   Examples:
      --hosts host01,host02,host03
      -s 192.168.233.101,192.168.233.102,192.168.233.103
   Note: If you are upgrading an existing installation of Vertica, be sure to use the same host names that you used previously.

--rpm package_name, --deb package_name, -r package_name
   The name of the RPM or Debian package. The install package must be provided if you are installing or upgrading multiple nodes and the nodes do not have the latest server package installed, or if you are adding a new node. The install_vertica and update_vertica scripts serially copy the server package to the other nodes and install the package. If you are installing or upgrading a large number of nodes, consider manually installing the package on all nodes before running the upgrade script; the script runs faster if it does not need to serially upload and install the package on each node.
   Example:
      --rpm vertica_7.2.x.x86_64.RHEL6.rpm

--data-dir data_directory, -d data_directory
   The default directory for database data and catalog files. The default is /home/dbadmin.
   Note: Do not use a directory shared across more than one host for this setting. Data and catalog directories must be distinct for each node; multiple nodes must not be allowed to write to the same data or catalog directory.

--temp-dir directory
   The temporary directory used for administrative purposes. If it is a directory within /opt/vertica, it is created by the installer. Otherwise, the directory should already exist on all nodes in the cluster. The location should allow dbadmin write privileges. The default is /tmp.
   Note: This is not a temporary data location for the database.

--dba-user dba_username, -u dba_username
   The name of the Database Administrator system account to create. Only this account can run the Administration Tools. If you omit the --dba-user parameter, the default database administrator account name is dbadmin.
   This parameter is optional for new installations done as root but must be specified when upgrading or when installing using sudo. If upgrading, use the -u parameter to specify the same DBA account name that you used previously. If installing using sudo, the user must already exist.
   Note: If you manually create the user, modify the user's .bashrc file to include the line PATH=/opt/vertica/bin:$PATH so that Vertica tools such as vsql and admintools can be easily started by the dbadmin user.

--dba-group GROUP, -g GROUP
   The UNIX group for DBA users. The default is verticadba.

--dba-user-home dba_home_directory, -l dba_home_directory
   The home directory for the database administrator. The default is /home/dbadmin.

--dba-user-password dba_password, -p dba_password
   The password for the database administrator account. If not supplied, the script prompts for a password and does not echo the input.

--dba-user-password-disabled
   Disable the password for the --dba-user. This argument stops the installer from prompting for a password for the --dba-user. You can assign a password later using standard user management tools such as passwd.

--spread-logging, -w
   Configures spread to write logging output to /opt/vertica/log/spread_<hostname>.log. Does not apply to upgrades.
   Note: Do not enable this logging unless directed to by Vertica Analytics Platform Technical Support.
--ssh-password password, -P password
   The password to use by default for each cluster host. If not supplied, and the -i option is not used, the script prompts for the password if and when necessary and does not echo the input. Do not use with the -i option.
   Special note about this password: If you run the install_vertica script as root, specify the root password with the -P parameter:
      # /opt/vertica/sbin/install_vertica -P <root_passwd>
   If, however, you run the install_vertica script with the sudo command, the password for the -P parameter should be the password of the user who runs install_vertica, not the root password. For example, if user dbadmin runs install_vertica with sudo and has a password with the value dbapasswd, the value for -P should be dbapasswd:
      $ sudo /opt/vertica/sbin/install_vertica -P dbapasswd

--ssh-identity file, -i file
   The root private-key file to use if passwordless SSH has already been configured between the hosts. Verify that normal SSH works without a password before using this option. The file can be a private key file (for example, id_rsa) or a PEM file. Do not use with the --ssh-password/-P option.
   Vertica accepts the following:
   • An SSH private key that is not password protected. You cannot run the install_vertica script with the sudo command when using this method.
   • A password-protected private key used with an SSH agent. Note that sudo typically resets environment variables when it is invoked. Specifically, the SSH_AUTH_SOCK variable required by the SSH agent may be reset. Therefore, configure your system to maintain SSH_AUTH_SOCK, or invoke the install_vertica command using a method similar to the following:
      sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK /opt/vertica/sbin/install_vertica ...

--config-file file, -z file
   Accepts an existing properties file created by --record-config file_name. This properties file contains key/value parameters that map to values in the install_vertica script, many with Boolean arguments that default to false.

--add-hosts host_list, -A host_list
   A comma-separated list of hosts to add to an existing Vertica cluster. --add-hosts modifies an existing installation of Vertica by adding a host to the database cluster and then reconfiguring spread. This is useful for increasing system performance or for setting K-safety to one (1) or two (2).
   Examples:
      --add-hosts host01
      --add-hosts 192.168.233.101
   Notes:
   • If you have used the -T parameter to configure spread to use direct point-to-point communication within the existing cluster, you must use the -T parameter when you add a new host; otherwise, the new host automatically uses UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly.
  • 132. Option (long form, short form) Description resulting in cluster communication problems that prevent Vertica from running properly. Examples: --add-hosts host01 --add-hosts 192.168.233.101 l The update_vertica script described in Adding Nodes calls the install_vertica script to update the installation. You can use either the install_vertica or update_ vertica script with the --add-hosts parameter. --record-config file_name, -B file_name Accepts a file name, which when used in conjunction with command line options, creates a properties file that can be used with the --config-file parameter. This parameter creates the properties file and exits; it has no impact on installation. --clean Forcibly cleans previously stored configuration files. Use this parameter if you need to change the hosts that are included in your cluster. Only use this parameter when no database is defined. Cannot be used with update_ vertica. --license { license_file | CE }, -L { license_file | CE } Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the –-license option also applies the license to all nodes declared in the --hosts host_list. To activate your license, use the –-license option with the –- Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 132 of 5309
  • 133. Option (long form, short form) Description accept-eula option. If you do not use the –- accept-eula option, you are asked to accept the EULA when you connect to your database. Once you accept the EULA, your license is activated. If specified with CE, automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file. Examples: --license CE --license /tmp/vlicense.dat --remove-hosts host_list, -R host_list A comma-separated list of hosts to remove from an existing Vertica cluster. --remove-hosts modifies an existing installation of Vertica by removing a host from the database cluster and then reconfiguring the spread. This is useful for removing an obsolete or over-provisioned system. For example: ---remove-hosts host01 -R 192.168.233.101 Notes: l If you used the -T parameter to configure spread to use direct point-to-point communication within the existing cluster, you must use -T when you remove a host; otherwise, the hosts automatically use UDP broadcast traffic, resulting in cluster communication problems that prevents Vertica from running properly. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 133 of 5309
  - The update_vertica script described in Removing Nodes in the Administrator's Guide calls the install_vertica script to perform the update to the installation. You can use either the install_vertica or update_vertica script with the -R parameter.

--control-network { BCAST_ADDR | default }, -S { BCAST_ADDR | default }
  Takes either the value 'default' or a broadcast network IP address (BCAST_ADDR), which allows spread communications to be configured on a subnet that is different from other Vertica data communications. --control-network is also used to force a cluster-wide spread reconfiguration when changing spread-related options.
  Note: The --control-network value must match the subnet of at least some of the nodes in the database. If the provided address does not match the subnet of any node in the database, the installer displays an error and stops. If the provided address matches some, but not all, of the nodes' subnets, a warning is displayed, but the installation continues. Ideally, the value for --control-network should match all node subnets.
  Examples:
  --control-network default
  --control-network 10.20.100.255

--point-to-point, -T
  Configures spread to use direct point-to-point communication between all Vertica nodes. Use this option if your nodes are not located on the same subnet. Also use this option for all virtual environment installations, whether or not the virtual servers are on the same subnet. The maximum number of spread daemons supported in point-to-point communication in Vertica 7.1 is 80. It is possible to have more than 80 nodes by using large cluster mode, which does not install a spread daemon on each node. Cannot be used with --broadcast, as the setting must be either --broadcast or --point-to-point.
  Important: When changing the configuration from --broadcast (the default) to --point-to-point, or from --point-to-point to --broadcast, the --control-network parameter must also be used.
  Note: Spread always runs over UDP. -T does not denote TCP.

--broadcast, -U
  Configures spread to use UDP broadcast traffic between nodes on the subnet. This is the default. No more than 80 spread daemons are supported with broadcast traffic. It is possible to have more than 80 nodes by using large cluster mode, which does not install a spread daemon on each node. Cannot be used with --point-to-point, as the setting must be either --broadcast or --point-to-point.
  Important: When changing the configuration from --broadcast (the default) to --point-to-point, or from --point-to-point to --broadcast, the --control-network parameter must also be used.
  Note: Spread always runs over UDP. -U does not mean use UDP instead of TCP.

--accept-eula, -Y
  Silently accepts the EULA. On multi-node installations, the --accept-eula value is propagated throughout the cluster at the end of the installation, at the same time as the Administration Tools metadata. Use the --accept-eula option with the --license option to activate your license.

--no-system-configuration
  By default, the installer makes system configuration changes to meet server requirements. If you do not want the installer to change any system properties, use --no-system-configuration. The installer then presents warnings or failures for configuration settings that do not meet requirements it normally would have configured automatically.
  Note: The system user account is still created or updated when you use this parameter.

--failure-threshold
  Stops the installation when the specified failure threshold is encountered.
  The threshold can be one of:
  - HINT: Stop the installation if a HINT or greater issue is encountered during the installation tests. HINT configurations are settings you should make, but the database runs with no significant negative consequences if you omit them.
  - WARN (default): Stop the installation if a WARN or greater issue is encountered. WARN issues may affect the performance of the database. However, for basic testing purposes or Community Edition users, WARN issues can be ignored if extreme performance is not required.
  - FAIL: Stop the installation if a FAIL or greater issue is encountered. FAIL issues can have severely negative performance consequences, and possible later processing issues, if not addressed. However, Vertica can start even if FAIL issues are ignored.
  - HALT: Stop the installation if a HALT or greater issue is encountered. The database may not be able to start if you choose this option. Not supported in production environments.
  - NONE: Do not stop the installation. The database may not start. Not supported in production environments.
--large-cluster, -2 [ <integer> | DEFAULT ]
  Enables a large cluster layout, in which control message responsibilities are delegated to a subset of Vertica Analytics Platform nodes (called control nodes) to improve control message performance in large clusters. Consider using this parameter with more than 50 nodes. The value can be one of:
  - <integer>: The number of control nodes you want in the cluster. Valid values are 1 to 120 for all new databases.
  - DEFAULT: Vertica Analytics Platform chooses the number of control nodes based on the total number of cluster nodes in the --hosts argument.
  For more information, see Large Cluster in the Administrator's Guide.

Installing Vertica Silently

This section describes how to create a properties file that lets you install and deploy Vertica-based applications quickly and without much manual intervention.

Note: This procedure assumes that you have already performed the tasks in Before You Install Vertica.

To install using a properties file:

1. Download and install the Vertica install package, as described in Installing Vertica.

2. Create the properties file that enables non-interactive setup by supplying the parameters you want Vertica to use. For example:
   The following command assumes a multi-node setup:

   # /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
     --dba-user-password password --ssh-password password --hosts host_list --rpm package_name

   The following command assumes a single-node setup:

   # /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
     --dba-user-password password

--record-config file_name
  [Required] Accepts a file name which, when used in conjunction with command-line options, creates a properties file that can be used with the --config-file option during setup. This flag creates the properties file and exits; it has no impact on installation.

--license { license_file | CE }
  Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list. If specified with CE, automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file.

--accept-eula
  Silently accepts the EULA during setup.

--dba-user-password password
  The password for the database administrator account; if not supplied, the script prompts for the password and does not echo the input.

--ssh-password password
  The root password to use by default for each cluster host; if not supplied, the script prompts for the password if and when necessary and does not echo the input.

--hosts host_list
  A comma-separated list of host names or IP addresses to include in the cluster; do not include space characters in the list.
  Examples:
  --hosts host01,host02,host03
  --hosts 192.168.233.101,192.168.233.102,192.168.233.103

--rpm package_name
--deb package_name
  The name of the RPM or Debian package that contained this script.
  Example:
  --rpm vertica_7.2.x.x86_64.RHEL6.rpm
  This parameter is required on multi-node installations if the RPM or DEB package is not already installed on the other hosts.

See Installing Vertica with the install_vertica Script for the complete set of installation parameters.

Tip: You only need to supply the parameters to the properties file once. You can then install Vertica using just the --config-file parameter, as described below.

3. Use one of the following commands to run the installation script.

   If you are root:
   /opt/vertica/sbin/install_vertica --config-file file_name

   If you are using sudo:
   $ sudo /opt/vertica/sbin/install_vertica --config-file file_name
   --config-file file_name accepts an existing properties file created by --record-config file_name. This properties file contains key/value parameters that map to values in the install_vertica script, many with Boolean arguments that default to false.

   The command for a single-node install might look like this:

   # /opt/vertica/sbin/install_vertica --config-file /tmp/vertica-inst.prp

4. If you did not supply the --ssh-password password parameter in the properties file, you are prompted for the password that allows installation of the RPM/DEB and system configuration on the other cluster nodes. If you are root, this is the root password. If you are using sudo, this is the sudo user's password. The password does not echo on the command line.

   Note: If you are root on a single-node installation, you are not prompted for a password.

5. If you did not supply the --dba-user-password password parameter in the properties file, you are prompted for the database administrator account password. The installation script creates a new Linux user account (dbadmin by default) with the password that you provide.

6. Carefully examine any warnings produced by install_vertica and correct the problems where possible. For example, insufficient RAM, insufficient network throughput, and too-high readahead settings on the file system can cause performance problems later on.

   Note: You can redirect warning output to a separate file, instead of having it display on the terminal, using your platform's standard redirection mechanisms. For example: install_vertica [options] > /tmp/file 2>&1.

7. Optionally perform the following steps:
   - Install the ODBC and JDBC drivers.
   - Install the vsql client application on non-cluster hosts.

8. Disconnect from the administration host as instructed by the script. This is required to:
   - Set certain system parameters correctly.
   - Function as the Vertica database administrator.

   At this point, Linux root privileges are no longer needed. The database administrator can perform the remaining steps.

   Note: When creating a new database, the database administrator might want to use different data or catalog locations than those created by the installation script. In that case, a Linux administrator might need to create those directories and change their ownership to the database administrator.

If you supplied the --license and --accept-eula parameters in the properties file, proceed to Getting Started and then see Configuring the Database in the Administrator's Guide. Otherwise:

1. Log in to the database administrator account on the administration host.
2. Accept the End User License Agreement and install the license key you downloaded previously, as described in Install the License Key.
3. Proceed to Getting Started and then see Configuring the Database in the Administrator's Guide.

Notes

- Downgrade installations are not supported.
- The following is an example of the contents of the configuration properties file:

  accept_eula = True
  license_file = /tmp/license.txt
  record_to = file_name
  root_password = password
  vertica_dba_group = verticadba
  vertica_dba_user = dbadmin
  vertica_dba_user_password = password

Installing Vertica on Amazon Web Services (AWS)

Beginning with Vertica 6.1.x, you can use Vertica on AWS by using a pre-configured Amazon Machine Image (AMI). For details on installing and configuring a cluster on AWS, refer to About Using Vertica on Amazon Web Services (AWS).

Installing and Configuring Management Console

This section describes how to install, configure, and upgrade Management Console (MC). If you need to back up your instance of MC, see Backing Up MC in the Administrator's Guide.

You can install MC before or after you install Vertica; however, consider installing Vertica and creating a database before you install MC. After you finish configuring MC, it automatically discovers your running database cluster, saving you the task of importing it manually.

Before You Install MC

Each version of Vertica Management Console (MC) is compatible only with the matching version of the Vertica server. For example, the Vertica 7.1.0 server is supported with the Vertica 7.1.0 MC only.

Read the following documents for more information:

- The Supported Platforms document, at http://my.vertica.com/docs. The Supported Platforms document also lists supported browsers for MC.
- Installation Overview and Checklist. Make sure you have everything ready for your Vertica configuration.
- Before You Install Vertica. Read for required prerequisites for all Vertica configurations, including Management Console.

Driver Requirements for Linux SuSE Distributions

The MC (vertica-console) package contains the Oracle implementation of the Java 6 JRE and requires that you install the unixODBC driver manager on SuSE Linux platforms. unixODBC provides the needed libraries libodbc and libodbcinst.

Port Requirements

When you use MC to create a Vertica cluster, the Create Cluster Wizard uses SSH on its default port (22).
Port 5444 is the default agent port and must be available for MC-to-node and node-to-node communications.

Port 5450 is the default MC port and must be available for node-to-MC communications.

See Ensure Ports Are Available for more information about port and firewall considerations.

Firewall Considerations

Make sure that a firewall or iptables rules are not blocking communications between the cluster's database, Management Console, and MC's agents on each cluster node.

IP Address Requirements

If you install MC on a server outside the Vertica cluster it will be monitoring, that server must be able to reach at least the public network interfaces on the cluster.

Disk Space Requirements

You can install MC on any node in the cluster, so there are no special disk requirements for MC other than the disk space you would normally allocate for your database cluster. See Disk Space Requirements for Vertica.

Time Synchronization and MC's Self-Signed Certificate

When you connect to MC through a client browser, Vertica assigns each HTTPS request a self-signed certificate, which includes a timestamp. To increase security and protect against password replay attacks, the timestamp is valid for several seconds only, after which it expires. To avoid being locked out of MC, synchronize time on the hosts in your Vertica cluster, and on the MC host if it resides on a dedicated server (a verification sketch appears below). To recover from loss or lack of synchronization, resync the system time and the Network Time Protocol. See Set Up Time Synchronization in Installing Vertica.

SSL Requirements

The openssl package must be installed in your Linux environment so that SSL can be set up during the MC configuration process. See SSL Overview in the Administrator's Guide.

File Permission Requirements

On your local workstation, you must have at least read/write privileges on any files you plan to upload to MC through the Cluster Installation Wizard. These files include the Vertica server package, the license key (if needed), the private key file, and an optional CSV file of IP addresses.
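The time-synchronization requirement above is easy to verify before you configure MC. A minimal sketch for RHEL 6-style hosts, assuming the stock ntpd service (adapt the commands to your distribution's service manager):

# Run on each cluster host, and on the MC host if it is a separate server.
$ sudo service ntpd status     # is the NTP daemon running?
$ sudo chkconfig ntpd on       # start NTP automatically at boot
$ sudo service ntpd start
$ ntpq -p                      # confirm peers are being polled and offsets are small

If ntpq reports no reachable peers, fix time synchronization before proceeding; otherwise MC's short-lived certificate timestamps can reject otherwise valid requests.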
Monitor Resolution

Management Console requires a minimum resolution of 1024 x 768, but HPE recommends higher resolutions for optimal viewing.

Installing Management Console

You can install Management Console on any node you plan to include in the Vertica database cluster, or on its own dedicated server outside the cluster.

Install Management Console on the MC Server

1. Download the MC package (vertica-console-<current-version>.<Linux-distro>) from the myVertica portal and save it to a location on the target server, such as /tmp.

2. On the target server, log in as root or as a user with sudo privileges.

3. Change directory to the location where you saved the MC package.

4. Install MC using your local Linux distribution's package management system (for example, rpm, yum, zypper, apt, dpkg).

   The following command is a generic example for Red Hat 6:
   # rpm -Uvh vertica-console-<current-version>.x86_64.RHEL6.rpm

   The following command is a generic example for Debian and Ubuntu:
   # dpkg -i vertica-console-<current-version>.deb

5. If you stopped the database before upgrading MC, start the database again. As the root user, use the following command:
   /etc/init.d/verticad start

6. Open a browser and enter the IP address or host name of the server on which you installed MC, along with the default MC port 5450. For example, enter one of:
   https://xx.xx.xx.xx:5450/
   https://hostname:5450/

7. When the Configuration Wizard dialog box appears, proceed to Configuring MC.

See Also
- Upgrading Management Console

Configuring MC

After you install MC, you need to configure it through a client browser connection. An MC configuration wizard walks you through creating the Linux MC super administrator account, storage locations, and other settings that MC needs to run. Information you provide during the configuration process is stored in the /opt/vconsole/config/console.properties file. If you need to change settings after the configuration wizard ends, such as port assignments, you can do so later through the Home > MC Settings page.

How to Configure MC

1. Open a browser session.

2. Enter the IP address or host name of the server on which you installed MC (or any cluster node's IP address or host name if you already installed Vertica), and include the default MC port 5450. For example, enter one of:
   https://xx.xx.xx.xx:5450/
   https://hostname:5450/

3. Follow the configuration wizard.

About Authentication for the MC Super Administrator

In the final step of the configuration process, you choose an authentication method for the MC super administrator. You can have MC authenticate the MC super administrator (in which case the process is complete), or you can choose LDAP. If you choose LDAP, provide the following information for the newly created MC super administrator:

- Corporate LDAP service host (IP address or host name)
- LDAP server port (default 389)
- LDAP DN (distinguished name) for base search/lookup/authentication criteria. At a minimum, specify the dc (domain component) fields. For example, dc=vertica, dc=com generates a unique identifier of the organization, like the corporate web URL vertica.com.
- Default search path for the organizational unit (ou). For example: ou=sales, ou=engineering
- Search attribute for the user name (uid), common name (cn), and so on. For example: uid=jdoe, cn=Jane Doe
- Binding DN and password for the MC super administrator. In most cases, you provide the "Bind as administrator" fields: information used to establish the LDAP service connection for all LDAP operations, such as search. Instead of using the administrator user name and password, the MC administrator can use his or her own LDAP credentials, as long as that user has search privileges.

If You Choose Bind Anonymously

Unless you specifically configure the LDAP server to deny anonymous binds, the underlying LDAP protocol will not cause MC's Configure Authentication process to fail if you choose "Bind anonymously" for the MC administrator. Before you use anonymous bindings for LDAP authentication on MC, be sure that your LDAP server is configured to explicitly disable or enable this option. For more information, see the article on Infusion Technology Solutions and the OpenLDAP documentation on access control.

What Happens Next

Shortly after you click Finish, you should see a status in the browser; however, for several seconds you might see only an empty page. During this brief period, MC runs as the local user 'root' just long enough to bind to port 5450. Then MC switches to the MC super administrator account that you just created, restarts MC, and displays the MC login page.

Where to Go Next

If you are a new MC user and this is your first MC installation, you might want to familiarize yourself with MC design. See Management Console in Vertica Concepts.
If you'd rather use MC now, the following topics in the Administrator's Guide should help get you started:

- To use the MC interface to install Vertica on a cluster of hosts, see Creating a Cluster Using MC.
- To create a new, empty Vertica database, or to import an existing Vertica database cluster into the MC interface, see Managing Database Clusters.
- To create new MC users and map them to one or more Vertica databases that you manage through the MC interface, see Managing Users and Privileges (About MC Users and About MC Privileges and Roles).
- To monitor MC and one or more MC-managed Vertica databases, see Monitoring Vertica Using Management Console.
- To change default port assignments or upload a new Vertica license or SSL certificate, see Managing MC Settings.
- To compare MC functionality with the functionality that the Administration Tools provides, see Administration Tools and Management Console.
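One practical note on the LDAP option described above: before relying on the wizard's settings, you can verify the bind DN, base DN, and search attribute from any host with the OpenLDAP client tools installed. A hedged sketch; the host name, DNs, and uid below are placeholders rather than values from this guide:

$ ldapsearch -H ldap://ldap.example.com:389 \
    -D "cn=mcadmin,dc=example,dc=com" -W \
    -b "dc=example,dc=com" "(uid=jdoe)"

If the search returns the expected user entry, the same host, port, base DN, and search attribute should work in MC's Configure Authentication step.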
Creating a Cluster Using MC

You can use Management Console to install a Vertica cluster on hosts where the Vertica software has not been installed. The Cluster Installation Wizard lets you specify the hosts you want to include in your Vertica cluster, loads the Vertica software onto the hosts, validates the hosts, and assembles the nodes into a cluster.

Management Console must be installed and configured before you can create a cluster on targeted hosts. See Installing and Configuring the MC for details.

Steps required to install a Vertica cluster using MC:

- Install and configure MC
- Prepare the hosts
- Create the private key file and copy it to your local machine
- Run the Cluster Installation Wizard
- Validate the hosts and create the cluster
- Create a new database on the cluster

Prepare the Hosts

Before you can install a Vertica cluster using MC, you must prepare each host that will become a node in the cluster. The cluster creation process runs validation tests against each host before it attempts to install the Vertica software. These tests ensure that the host is correctly configured to run Vertica.

Install Perl

The MC cluster installer uses Perl to perform the installation. Install Perl 5 on the target hosts before performing the cluster installation. Perl is available for download from www.perl.org.

Validate the Hosts

The validation tests provide:

- Warnings and error messages when they detect a configuration setting that conflicts with Vertica requirements or a performance issue
- Suggestions for configuration changes when they detect an issue
Note: The validation tests do not automatically fix all problems they encounter.

All hosts must pass validation before the cluster can be created.

If you accepted the default configuration options when installing the OS on your hosts, the validation tests will likely return errors, because some of the default options used on Linux systems conflict with Vertica requirements. See Installing Vertica for details on OS settings. To speed up the validation process, you can perform the following steps on the prospective hosts before you attempt to validate them. These steps are based on Red Hat Enterprise Linux and CentOS systems, but other supported platforms have similar settings.

On each host you want to include in the Vertica cluster, stage the host according to Before You Install Vertica.

Create a Private Key File

Before you can install a cluster, Management Console must be able to access the hosts on which you plan to install Vertica. MC uses passwordless SSH to connect to the hosts and install the Vertica software using a private key file. If you already have a private key file that allows access to all hosts in the potential cluster, you can use it in the cluster creation wizard.

Note: The private key file is required to complete the MC cluster installation wizard.

To create a private key file:

1. Log in to the server as root or as a user with sudo privileges.

2. Change to your home directory.
   $ cd ~

3. If a .ssh directory does not exist, create one.
   $ mkdir .ssh

4. Generate a passwordless private key/public key pair.
   $ ssh-keygen -q -t rsa -f ~/.ssh/vid_rsa -N ''
   This command creates two files: vid_rsa and vid_rsa.pub. The vid_rsa file is the private key file that you upload to MC so that it can access nodes on the cluster and install Vertica. The vid_rsa.pub file is copied to all other hosts so that they can be accessed by clients using the vid_rsa file.

5. Make your .ssh directory readable and writable only by yourself.
   $ chmod 700 /root/.ssh

6. Change to the .ssh directory.
   $ cd ~/.ssh

7. Concatenate the public key into the file vauthorized_keys2.
   $ cat vid_rsa.pub >> vauthorized_keys2

8. If the host from which you are creating the public key will also be in the cluster, copy the public key into the local host's authorized key file:
   $ cat vid_rsa.pub >> authorized_keys2

9. Make the files in your .ssh directory readable and writable only by yourself.
   $ chmod 600 ~/.ssh/*

10. Create the .ssh directory on the other nodes.
    $ ssh <host> "mkdir /root/.ssh"

11. Copy the vauthorized key file to the other nodes.
    $ scp -r /root/.ssh/vauthorized_keys2 <host>:/root/.ssh/.

12. On each node, concatenate the vauthorized_keys2 public key to the authorized_keys2 file and make the file readable and writable only by the owner.
    $ ssh <host> "cd /root/.ssh/; cat vauthorized_keys2 >> authorized_keys2; chmod 600 /root/.ssh/authorized_keys2"

13. On each node, remove the vauthorized_keys2 file.
    $ ssh -i /root/.ssh/vid_rsa <host> "rm /root/.ssh/vauthorized_keys2"

14. Copy the vid_rsa file to the workstation from which you will access the MC cluster installation wizard. This file is required to install a cluster from MC.

A complete example of the commands for creating the public key and allowing access to three hosts from the key follows. The commands are initiated from the docg01 host, and all hosts will be included in the cluster (docg01 - docg03):

ssh docg01
cd ~/.ssh
ssh-keygen -q -t rsa -f ~/.ssh/vid_rsa -N ''
cat vid_rsa.pub > vauthorized_keys2
cat vid_rsa.pub >> authorized_keys2
chmod 600 ~/.ssh/*
scp -r /root/.ssh/vauthorized_keys2 docg02:/root/.ssh/.
scp -r /root/.ssh/vauthorized_keys2 docg03:/root/.ssh/.
ssh docg02 "cd /root/.ssh/; cat vauthorized_keys2 >> authorized_keys2; chmod 600 /root/.ssh/authorized_keys2"
ssh docg03 "cd /root/.ssh/; cat vauthorized_keys2 >> authorized_keys2; chmod 600 /root/.ssh/authorized_keys2"
ssh -i /root/.ssh/vid_rsa docg02 "rm /root/.ssh/vauthorized_keys2"
ssh -i /root/.ssh/vid_rsa docg03 "rm /root/.ssh/vauthorized_keys2"
rm ~/.ssh/vauthorized_keys2

Use the MC Cluster Installation Wizard

The Cluster Installation Wizard guides you through the steps required to install a Vertica cluster on hosts that do not already have the Vertica software installed.

Note: If you are using MC with the Vertica AMI on Amazon Web Services, the Create Cluster and Import Cluster options are not supported.

Prerequisites

Before you proceed, make sure you:

- Installed and configured MC.
- Prepared the hosts that you will include in the Vertica database cluster.
- Created the private key (pem) file and copied it to your local machine.
- Obtained a copy of your Vertica license if you are installing the Premium Edition. If you are using the Community Edition, a license key is not required.
- Downloaded the Vertica server RPM (or DEB file).
- Have read/copy permissions on files stored on the local browser host that you will transfer to the host on which MC is installed.

Permissions on Files to Transfer to MC

On your local workstation, you must have at least read/write privileges on files you'll upload to MC through the Cluster Installation Wizard. These files include the Vertica server package, the license key (if needed), the private key file, and an optional CSV file of IP addresses.

Create a New Vertica Cluster Using MC

1. Connect to Management Console and log in as an MC administrator.

2. On MC's Home page, click the Provision Databases task. The Provisioning dialog appears.

3. Click Create a new Cluster.

4. The Create Cluster wizard opens. Provide the following information:

   a. Cluster name: a label for the cluster.
   b. Vertica Admin User: the user that is created on each of the nodes when they are installed, typically 'dbadmin'. This user has access to Vertica and is also an OS user on the host.
   c. Password for the Vertica Admin User: the password you enter (required) is set for each node when MC installs Vertica.
      Note: MC does not support an empty password for the administrative user.
   d. Vertica Admin Path: the storage location for catalog files, which defaults to /home/dbadmin unless you specified a different path during MC configuration (or later on MC's Settings page).
      Important: The Vertica Admin Path must be the same as the Linux database administrator's home directory. If you specify a path that is not the Linux dbadmin's home directory, MC returns an error.

5. Click Next and specify the private key file and host information:

   a. Click Browse and navigate to the private key file (vid_rsa) that you created earlier.
      Note: You can change the private key file at the beginning of the validation stage by clicking the name of the private key file in the bottom-left corner of the page. However, you cannot change the private key file after validation has begun, unless the first host fails validation due to an SSH login error.
   b. Include the host IP addresses. You have three options:
      - Specify later (but include the number of nodes). This option allows you to specify the number of nodes, but not the specific IPs. You can specify the specific IPs before you validate the hosts.
      - Import IP addresses from a local file. You can specify the hosts in a CSV file, using either IP addresses or host names.
      - Enter a range of IP addresses. You can specify a range of IPs to use for new nodes, for example 192.168.1.10 to 192.168.1.30. The range of IPs must be on the same or contiguous subnets.

6. Click Next and select the software and license:

   a. Vertica Software. If one or more Vertica packages have been uploaded, you can select one from the list. Otherwise, select Upload a new local vertica binary file and browse to a Vertica server file on your local system.
   b. Vertica License. Click Browse and navigate to a local copy of your Vertica license if you are installing the Premium Edition. Community Edition versions require no license key.
7. Click Next. The Create cluster page opens. If you did not specify the IP addresses, select each host icon and provide an IP address: enter the IP in the box and click Apply for each host you add.

The hosts are now ready for host validation and cluster creation.

Validate Hosts and Create the Cluster

Host validation is the process in which MC runs tests against each host in a proposed cluster. You can validate hosts only after you have completed the Cluster Installation Wizard. You must validate hosts before MC can install Vertica on each host.

At any time during the validation process, but before you create the cluster, you can add and remove hosts by clicking the appropriate button in the upper-left corner of the page on MC. A Create Cluster button appears when all hosts that appear in the node list are validated.

How to Validate Hosts

To validate one or more hosts:

1. Connect to Management Console and log in as an MC administrator.

2. On the MC Home page, click the Databases and Clusters task.

3. In the list of databases and clusters, select the cluster on which you have recently run the Cluster Installation Wizard (Creating... appears under the cluster) and click View.

4. Validate one or several hosts:
   - To validate a single host, click the host icon, then click Validate Host.
   - To validate all hosts at the same time, click All in the Node List, then click Validate Host.
   - To validate more than one host, but not all of them, Ctrl+click the host numbers in the node list, then click Validate Host.

5. Wait while validation proceeds.
The validation step takes several minutes to complete. The tests run in parallel on each host, so validating several hosts at the same time does not necessarily take longer than validating one.

Host validation results in one of three possible states:

- Green check mark: the host is valid and can be included in the cluster.
- Orange triangle: the host can be added to the cluster, but warnings were generated. Click the tests in the host validation window to see details about the warnings.
- Red X: the host is not valid. Click the tests in the host validation window that have red X's to see details about the errors. You must correct the errors and re-validate, or remove the host, before MC can create the cluster.

To remove an invalid host: highlight the host icon or the IP address in the Node List and click Remove Host.

All hosts must be valid before you can create the cluster. Once all hosts are valid, a Create Cluster button appears near the top-right corner of the page.

How to Create the Cluster

1. Click Create Cluster to install Vertica on each host and assemble the nodes into a cluster. The process, done in parallel, takes a few minutes as the software is copied to each host and installed.

2. Wait for the process to complete. When the Success dialog opens, you can do one of the following:
   - Optionally create a database on the new cluster at this time by clicking Create Database.
   - Click Done to create the database at a later time.

See Creating a Database on a Cluster for details on creating a database on the new cluster.
Create a Database on a Cluster

After you use the MC Cluster Installation Wizard to create a Vertica cluster, you can create a database on that cluster through the MC interface. You can create the database on all cluster nodes or on a subset of nodes.

If a database was created using the Administration Tools on any of the nodes, MC detects (autodiscovers) that database and displays it on the Manage (Cluster Administration) page so you can import it into the MC interface and begin monitoring it.

MC allows only one database to run on a cluster at a time, so you might need to stop a running database before you can create a new one.

The following procedure describes how to create a database on a cluster that you created using the MC Cluster Installation Wizard. To create a database on a cluster that you created by running the install_vertica script, see Creating an Empty Database.

To create a new, empty database on a new cluster:

1. If you are already on the Databases and Clusters page, skip to the next step. Otherwise:
   a. Connect to MC and sign in as an MC administrator.
   b. On the Home page, click Existing Infrastructure.

2. If no databases exist on the cluster, continue to the next step. Otherwise:
   a. If a database is running on the cluster on which you want to add a new database, select the database and click Stop.
   b. Wait for the running database to reach a status of Stopped.

3. Click the cluster on which you want to create the new database and click Create Database.

4. The Create Database wizard opens. Provide the following information:
   - Database name and password. See Creating a Database Name and Password for rules.
   - Optionally click Advanced to open the advanced settings and change the port and the catalog, data, and temporary data paths. By default, the MC application/web server port is 5450, and the paths are /home/dbadmin, or whatever you defined for the paths when you ran the cluster creation wizard. Do not use the default agent port, 5444, as a new setting for the MC application/web server port. See MC Settings > Configuration for port values.

5. Click Continue.

6. Select the nodes to include in the database. The Database Configuration window opens with the options you provided, and a graphical representation of the nodes appears on the page. By default, all nodes are selected to be part of this database (denoted by a green check mark). You can optionally click each node and clear Include host in new database to exclude that node from the database. Excluded nodes are gray. If you change your mind, click the node and select the Include check box.

7. Click Create in the Database Configuration window to create the database on the nodes. The creation process takes a few moments, after which the database starts and a Success message appears.

8. Click OK to close the success message. The Database Manager page opens and displays the database nodes. Nodes not included in the database are gray.
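Once the Success message appears, it is worth confirming from the command line that the new database accepts connections. A minimal sketch; the host, database name, and password below are placeholders for whatever you entered in the wizard:

$ /opt/vertica/bin/vsql -h host01 -d mydb -U dbadmin -w mypassword -c "SELECT version();"

If vsql prints the server version string, the database is up and reachable on its client port.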
After You Install Vertica

The tasks described in this section are optional and are provided for your convenience. When you have completed this section, proceed to one of the following:

- Using This Guide in Getting Started
- Configuring the Database in the Administrator's Guide

Install the License Key

If you did not supply the -L parameter during setup, or if you did not bypass the -L parameter for a silent install, then the first time you log in as the database administrator and run the Vertica Administration Tools or Management Console, Vertica requires you to install a license key. Follow the instructions in Managing Licenses in the Administrator's Guide.

Optionally Install the vsql Client Application on Non-Cluster Hosts

You can use the Vertica vsql executable image on a non-cluster Linux host to connect to a Vertica database.

- On Red Hat, CentOS, and SUSE systems, you can install the client driver RPM, which includes the vsql executable. See Installing the Client RPM on Red Hat and SUSE for details.

- If the non-cluster host is running the same version of Linux as the cluster, copy the image file to the remote system. For example:
  $ scp host01:/opt/vertica/bin/vsql .
  $ ./vsql

- If the non-cluster host is running a different version of Linux than your cluster hosts, and that operating system is not Red Hat version 5 64-bit or SUSE 10/11 64-bit, you must install the Vertica server RPM in order to get vsql. Download the appropriate RPM package from the Download tab of the myVertica portal, then log in to the non-cluster host as root and install the RPM package using the command:
  # rpm -Uvh filename
In the above command, filename is the package you downloaded. Note that you do not have to run the install_vertica script on the non-cluster host in order to use vsql.

Notes

- Use the same Command-Line Options that you would on a cluster host.
- You cannot run vsql in a Cygwin bash shell (Windows). Use ssh to connect to a cluster host, then run vsql.

vsql is also available for additional platforms. See Installing the vsql Client.

Install Vertica Documentation

The latest documentation for your Vertica release is available at http://my.vertica.com/docs. After you install Vertica, you can optionally install the documentation on your database server and client systems.

To install a local copy of the documentation:

1. Open a web browser and go to http://my.vertica.com/docs.
2. Scroll down to Install documentation locally and save the Vertica documentation package (.tar.gz or .zip) to your system; for example, to /tmp.
3. Extract the contents using your preferred unzipping application.
4. The home page for the HTML documentation is located at /HTML/index.htm in the extracted folder.

Installing Client Drivers

After you install Vertica, install drivers on the client systems from which you plan to access your databases. HPE supplies drivers for ADO.NET, JDBC, ODBC, OLE DB, Perl, and Python. For instructions on installing these drivers, see Client Drivers in Connecting to Vertica.

Creating a Database

To get started using Vertica immediately after installation, create a database. You can use either the Administration Tools or the Management Console.
Creating a Database Using the Administration Tools

Follow these steps to create a database using the Administration Tools:

1. Log in as the database administrator and type admintools to bring up the Administration Tools.

2. When the EULA (end-user license agreement) window opens, type accept to proceed. A window displays, requesting the location of the license key file you downloaded from the HPE web site. The default path is /tmp/vlicense.dat.
   - If you are using the Vertica Community Edition, click OK without entering a license key.
   - If you are using the Vertica Premium Edition, type the absolute path to your license key (for example, /tmp/vlicense.dat) and click OK.

3. From the Administration Tools Main Menu, click Configuration Menu, and then click OK.

4. Click Create Database, and click OK to start the database creation wizard.

To create a database using MC, refer to Creating a Database Using MC.

See Also
- Using the Vertica Interfaces
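The wizard above is interactive. admintools also exposes its tools in a non-interactive form through the -t flag, which can be handy for scripting. A minimal sketch, assuming a three-node cluster with placeholder host names, database name, and password (run admintools -a to confirm the exact tool names and options your version supports):

$ admintools -t create_db -s host01,host02,host03 -d mydb -p mypassword

Because the password appears on the command line, prefer the interactive wizard, or restrict shell history, on shared systems.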
Upgrading Vertica

Follow the steps in this section to:

- Upgrade Vertica to a new version.
- Upgrade Management Console.
- Upgrade the client authentication records to the new format.

Upgrading Vertica to a New Version

Requirement Testing

The Version 7.0 installer introduced platform verification tests that prevent the installation from continuing if the platform requirements are not met by your system. Manually verify that your system meets the requirements in Before You Install Vertica before you update the server package on your systems. These tests ensure that your platform meets the hardware and software requirements for Vertica. Previous versions documented these requirements, but the installer did not verify all of the settings.

Version 7.0 also introduced the installation parameter --failure-threshold, which allows you to change the severity level at which the installer stops the installation process based on a failed test. By default, the installer stops on all warnings. You can change the failure threshold to FAIL to bypass all warnings and stop only on failures. However, your platform is unsupported until you correct all warnings generated by the installer. By changing the failure threshold you can immediately upgrade and bring up your Vertica database, but performance cannot be guaranteed until you correct the warnings.

Transaction Catalog Storage

Due to a change in how transaction catalog storage works in Vertica 6.0 and later, when upgrading from 5.x to a later version of Vertica, the amount of space that the transaction catalog takes up can increase significantly during and after the upgrade. Before upgrading, verify that each node has free space of at least 4 times the size of the Catalog folder in the catalog directory (in addition to the normal free-space requirements).

To determine the amount of space the Catalog folder is using, run du -h on the Catalog folder. Do not run du -h on the entire catalog directory; run it specifically on the Catalog folder within it. For example:
$ du -h /home/dbadmin/db/v_db_node0001_catalog/Catalog/

Configuration Parameter Storage

As of version 7.1.x, parameter configurations are stored in the catalog, rather than in individual vertica.conf files at the node level. If you want to view node-specific settings prior to upgrading, you can query the CONFIGURATION_PARAMETERS system table on each node to view parameter values (see the query sketch at the end of this section).

When you upgrade to 7.1, Vertica performs the following steps:

1. Backs up the current vertica.conf files to vertica-orig.conf files.
2. Chooses the most up-to-date node's configuration parameter settings as the database-level settings.
3. Stores the new database-level values in the catalog.
4. Checks whether the values in all the nodes' vertica.conf files match the database-level values. If not, Vertica rewrites that node's vertica.conf file to match the database-level settings. The previous settings can still be referenced in each node's vertica-orig.conf file.

If you previously made manual changes to individual vertica.conf files, you can re-set those node-specific settings using ALTER NODE after you upgrade. You will still be able to reference the previous values in the vertica-orig.conf files.

Important: Once you upgrade to 7.1, do not hand-edit any vertica.conf files. Additionally, do not use any workarounds for syncing vertica.conf files.

Uninstall the HDFS Connector

As of version 7.2, the HDFS Connector is installed automatically. If you previously downloaded and installed this connector, uninstall it before you upgrade to this release of Vertica.

Managing Directed Queries

Directed queries preserve current query plans before a scheduled upgrade. In most instances, queries perform more efficiently after a Vertica upgrade. In the few cases where this is not so, you can use directed queries that you created before upgrading to recreate query plans from the earlier version. See Directed Queries for more information.
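As a follow-up to the Configuration Parameter Storage note above, this is roughly what a pre-upgrade check of parameter settings might look like. A hedged sketch, run through vsql and assuming the standard column names of the CONFIGURATION_PARAMETERS system table (adjust the column list if your version differs):

$ vsql -c "SELECT node_name, parameter_name, current_value, default_value
           FROM configuration_parameters
           WHERE current_value <> default_value
           ORDER BY node_name, parameter_name;"

Running this on each node before the upgrade gives you a record of non-default settings to compare against after the upgrade rewrites the vertica.conf files.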
Upgrading Vertica

Follow these steps to upgrade your database. Note that upgrades are incremental and must follow one of the following upgrade paths:

- Vertica 3.5 to 4.0
- Vertica 4.0 to 4.1
- Vertica 4.1 to 5.0
- Vertica 4.1 to 5.1
- Vertica 5.0 to 5.1
- Vertica 5.0 to 6.0
- Vertica 5.1 to 6.0
- Vertica 6.0 to 6.1
- Vertica 6.1 to 7.0. If you have enabled LDAP over SSL/TLS, read Configuring LDAP Over SSL/TLS When Upgrading Vertica before upgrading.
- Vertica 7.0 to 7.1
- Vertica 7.1 to 7.2

Important: Hewlett Packard Enterprise strongly recommends that you follow these upgrade paths. Be sure to read the Release Notes and New Features for each version you skip. The Vertica documentation is available in the RPM, as well as at http://my.vertica.com/docs (which also provides access to previous versions of the documentation).

1. Back up your existing database. This is a precautionary measure so that you can restore from the backup if the upgrade is unsuccessful.

2. Stop the database using admintools if it is not already stopped. See Stopping a Database.

3. On each host that has an additional package installed, such as the R Language Pack, uninstall the package. For example:
   rpm -e vertica-R-lang

   Important: If you fail to uninstall Vertica packages prior to upgrading the server package, the server package fails to install due to dependencies on the earlier version of the package.

4. On any host in the cluster, as root or with sudo, install the new Vertica server RPM or DEB. See Download and Install the Vertica Server Package. For example:

   rpm syntax:
   # rpm -Uvh /home/dbadmin/vertica_7.2.x.x86_64.RHEL6.rpm

   deb syntax:
   # dpkg -i /home/dbadmin/vertica-amd64.deb

   Important: If you fail to install the RPM or DEB prior to running the next step, update_vertica fails with an error due to the conflict between the version of the update_vertica script and the version of the rpm argument.

5. As root or with sudo, run update_vertica. Use the same options that you used when you last installed or upgraded the database; however, do not use the --hosts/-s host_list parameter, because the upgrade script automatically determines the hosts in the cluster. The --hosts/-s host_list parameter must be excluded.

   If you forgot which options were last used, open /opt/vertica/config/admintools.conf in a text editor and find the line that starts with install_opts. This line details each option. It is important to use the same options that were used previously; omitting any previously used option causes it to revert to its default setting when the upgrade script runs. Also, if you use different options than were originally used, the update script reconfigures the cluster to use the new options, which can cause issues with your existing database.
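   For the step above, a quick way to recover the previously used options without opening an editor is to grep for them. A minimal sketch, assuming the stock admintools.conf location:

   $ grep '^install_opts' /opt/vertica/config/admintools.conf

   The line it prints is the exact option string to reuse (minus --hosts/-s) on the update_vertica command line.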
   Installing Vertica with the install_vertica Script provides details on all options available to the update_vertica script. update_vertica uses the same options as install_vertica. For example:

   RPM package:
   # /opt/vertica/sbin/update_vertica --rpm /home/dbadmin/vertica_7.2.x.x86_64.RHEL6.rpm --dba-user mydba

   Debian package:
   # /opt/vertica/sbin/update_vertica --deb /home/dbadmin/vertica-amd64.deb --dba-user mydba

   Important: The RPM/DEB file must be readable by the dbadmin user when upgrading. Some upgrade scripts are run as the dbadmin user, and that user must be able to read the RPM/DEB file.
  • 169. on upgrading the packs. For R, see Installing/Upgrading the R Language Pack for Vertica. 10. To upgrade the Management Console, see: Upgrading Management Console. Notes l Release 5.1 introduced a new backup utility, vbr. This utility replaced both the backup.sh and restore.sh scripts, making both obsolete. Any backups created with backup.sh are incompatible with backups created with vbr. Vertica recommends that you use the current utility vbr as soon as possible after successfully upgrading from a version prior to Release 5.1 to Release 5.1 or later. Documentation for the 5.0 scripts remained in the 5.1 documentation. However, the topics were marked Obsolete in that version and were removed from later versions of the documentation. l Downgrade installations are not supported. Configuring LDAP Over SSL/TLS When Upgrading Vertica If you have LDAP enabled over SSL/TLS, in Vertica 7.0, the certificate authentication is more secure than in previous releases. Before you upgrade to Vertica 7.0, you must perform several tasks to connect to the LDAP server after the upgrade. When using SSL/TLS and upgrading to 7.1, note that the SSLCertificate and SSLPrivateKey parameters are automatically set by Admintools if you set EnableSSL=1 in the previous version. This section describes the steps you should follow when setting up secure LDAP authentication on a new installation of Vertica 7.0. The section also includes the procedure you should follow should you choose to revert to the more permissive behavior used in Vertica 6.1. l Using Vertica 7.0 Secure LDAP Authentication l Using Vertica 6.1 Secure LDAP Authentication Using Vertica 7.0 Secure LDAP Authentication If you are a new customer installing Vertica 7.0 and you want to use LDAP over SSL/TLS, take the following steps on all cluster nodes. You must perform these steps to Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 169 of 5309
Configuring LDAP Over SSL/TLS When Upgrading Vertica

If you have LDAP enabled over SSL/TLS, note that in Vertica 7.0 certificate authentication is more secure than in previous releases. Before you upgrade to Vertica 7.0, you must perform several tasks so that you can connect to the LDAP server after the upgrade. When using SSL/TLS and upgrading to 7.1, note that Admintools automatically sets the SSLCertificate and SSLPrivateKey parameters if you set EnableSSL=1 in the previous version.

This section describes the steps to follow when setting up secure LDAP authentication on a new installation of Vertica 7.0. It also includes the procedure to follow should you choose to revert to the more permissive behavior used in Vertica 6.1.

• Using Vertica 7.0 Secure LDAP Authentication
• Using Vertica 6.1 Secure LDAP Authentication

Using Vertica 7.0 Secure LDAP Authentication

If you are a new customer installing Vertica 7.0 and you want to use LDAP over SSL/TLS, take the following steps on all cluster nodes to configure LDAP authentication:

1. If necessary, modify the LDAP authentication record in your vertica.conf file to point to the correct server.

2. As the root user, if necessary, create an ldap.conf file and add the following settings. The TLS_REQCERT option is required. You must include either the TLS_CACERT or TLS_CADIR option.

TLS_REQCERT hard
TLS_CACERT = /<certificate_path>/CA-cert-bundle.crt
or
TLS_CADIR = <certificate_path>

The options for TLS_REQCERT are:

• hard: If the client does not provide a certificate or provides an invalid certificate, it cannot connect. This is the default behavior.
• never: The client does not request or check a certificate.
• allow: If the client does not provide a certificate or provides an invalid certificate, it can connect anyway.
• try: If the client does not provide a certificate, the client can connect. If the client provides an invalid certificate, it cannot connect.

TLS_CACERT specifies the path to the file that contains the certificates. TLS_CADIR specifies the path to the directory that contains the certificates.

3. Store the ldap.conf file in a location that is readable by DBADMIN. The DBADMIN must be able to access the ldap.conf file and all path names specified in the ldap.conf file on all cluster nodes.

4. Set the Linux LDAPCONF environment variable to point to this ldap.conf file. Make sure this environment variable is set before you start the Vertica software or create a database. To ensure that this happens, add a command to the DBADMIN's profile that sets LDAPCONF to point to the ldap.conf file every time you start the database. If you start the database using a script such as a startup or init file, add steps to the script that set the LDAPCONF variable to point to the ldap.conf file.

5. Test that LDAP authentication works with and without SSL/TLS. You can use the ldapsearch tool for this (a sketch follows these steps).

6. Repeat steps 1–5 for all cluster nodes.
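For example, the following is a minimal sketch of steps 4 and 5; the profile file, ldap.conf path, server name, base DN, and filter are placeholders, not values from your environment:

# In the DBADMIN user's ~/.bash_profile (file and path are assumptions):
export LDAPCONF=/etc/openldap/ldap.conf

# Verify binds both without and with TLS; adjust the URI, base DN, and filter
# for your directory:
$ ldapsearch -x -H ldap://ldap.example.com -b "dc=example,dc=com" "(uid=testuser)"
$ ldapsearch -x -H ldaps://ldap.example.com -b "dc=example,dc=com" "(uid=testuser)"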
Using Vertica 6.1 Secure LDAP Authentication

If you have LDAP enabled over SSL/TLS and you want to use the more permissive LDAP settings used in Vertica 6.1, perform the following tasks on all cluster nodes. These settings allow Vertica to connect to the LDAP server even if the server's certificate cannot be verified. You must perform these tasks before you upgrade to Vertica 7.0, and you must perform them on all cluster nodes:

1. If necessary, modify the LDAP authentication record in your vertica.conf file to point to the correct server.

2. As the root user, create or modify the ldap.conf file and make the following change:

TLS_REQCERT allow

The options for TLS_REQCERT are:

• hard: If you do not provide a certificate or you provide an invalid certificate, you cannot connect. This is the default.
• never: The client does not request or check a certificate.
• allow: If you do not provide a certificate, or you provide an invalid certificate, you can connect anyway. This is consistent with the behavior in Vertica 6.1.
• try: If you do not provide a certificate, you can connect. If you provide an invalid certificate, you cannot connect.

3. Store the ldap.conf file in a location that is readable by DBADMIN. The DBADMIN must be able to access the ldap.conf file and all path names specified in the ldap.conf file on all cluster nodes.

4. Set the Linux LDAPCONF environment variable to point to this ldap.conf file. Make sure this environment variable is set before you start the Vertica software or create a database. To ensure that this happens, add a command to the DBADMIN's Linux profile that sets LDAPCONF to point to the ldap.conf file every time you log in.

5. If you start the database using a script such as a startup or init file, add steps to the script that set the LDAPCONF variable to point to the ldap.conf file.

6. Test that LDAP authentication works with and without SSL/TLS. You can use the ldapsearch tool for this.

7. Repeat steps 1–6 for all cluster nodes.

Upgrading and Reinstalling Packages

In most scenarios, your Vertica packages are automatically reinstalled when you upgrade Vertica and restart your database. When Vertica cannot reinstall a package, it issues a message letting you know that the package reinstallation has failed. You can then force the package to reinstall by issuing the admintools install_package command with the --force-reinstall option:

$ admintools -t install_package -d <database_name> -p <password> -P <package> --force-reinstall

This command prompts Vertica to repair all of the packages that did not reinstall correctly when you restarted your database after upgrade.

Scenario Requiring Package Reinstallation

You must perform reinstallation when a sequence of events similar to the following occurs:

1. You successfully upgrade Vertica.

2. When you restart your database, you enter an incorrect password.
3. Your database starts, but Vertica issues a message letting you know that one or more packages could not be reinstalled.

To respond to this scenario, use the admintools install_package command and include the --force-reinstall option. This option forces reinstallation of the packages that failed to reinstall.

admintools Command-Line Options for install_package

When you invoke the admintools install_package command, you have several options available:

$ admintools -t install_package -h

Option                              Function
-h, --help                          Show this help message and exit.
-d DBNAME, --dbname=DBNAME          Name of database.
-p PASSWORD, --password=PASSWORD    Database administrator password.
-P PACKAGE, --package=PACKAGE       Specify the package name. Valid values:
                                    • all — Reinstall all available packages.
                                    • default — Reinstall only those default packages that are currently installed.
--force-reinstall                   Force a package to be reinstalled, even if it is already installed.

Examples

The following example shows how you can use the admintools install_package command with the --force-reinstall option to force the reinstallation of default packages:

$ admintools -t install_package -d VMart -p 'vertica' --package default --force-reinstall
This example shows how you can force the reinstallation of one specific package, flextable:

$ admintools -t install_package -d VMart -p 'vertica' --package flextable --force-reinstall

This example shows you how to list all the packages on your system:

$ admintools -t list_packages

This example shows you how to uninstall one or more packages. The example uninstalls the package AWS:

$ admintools -t uninstall_package -d VMart -p 'vertica' --package AWS

Upgrading Management Console

If you are moving from Management Console (MC) 6.1.1 to MC 6.1.2, you can install MC on any Vertica cluster node. This scenario requires a fresh install because HPE does not provide scripts to migrate metadata (MC users and settings) established in earlier releases from your existing server to the cluster node. See Installing and Configuring Management Console. After you install and configure MC, you need to recreate any MC users you created for your 6.1.1 MC instance and apply previous MC settings to the new MC version.

Tip: You can export MC-managed database messages and user activity to a location on the existing server. While you can't import this data, using the exported files as a reference can help make metadata recreation easier. See Exporting MC-managed Database Messages and Logs and Exporting the User Audit Log.

If You Keep MC on the Existing Server

If you want to keep MC on the same server (such as on the dedicated server that had been required in previous MC releases), your MC metadata is retained when you run the vertica-console installation script.

If you upgrade from Vertica 7.2.0 to 7.2.1, the console.properties file retains the previous default number of displayed messages; Vertica does not overwrite the console.properties file during upgrades. To improve performance, update your console.properties file to set messageCenter.maxEntries from 1000 to a lower value.
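For example, a minimal sketch of the change; the file location shown is an assumption based on the default MC installation directory, and 500 is an arbitrary example value:

# /opt/vconsole/config/console.properties (location assumed)
messageCenter.maxEntries=500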
Before You Upgrade MC on the Same Server

1. Log in as root or a user with sudo privileges on the server where MC is already installed.

2. Open a terminal window and shut down the MC process using the following command:

# /etc/init.d/vertica-consoled stop

For versions of Red Hat 7/CentOS 7 and above, use:

# systemctl stop vertica-consoled

3. Back up MC to preserve configuration metadata. See Backing Up MC.

Upgrade MC on the Same Server

Important: Upgrading MC from version 7.1.2-5 or below on a Vertica host first requires stopping the database if MC was installed on an Ubuntu or Debian platform. MC upgrades from version 7.1.2-6 and up do not require stopping the database.

1. Download the MC package (vertica-console-<current-version>.<Linux-distro>) from the myVertica portal and save it to a location on the target server, such as /tmp.

2. On the target server, log in as root or a user with sudo privileges.

3. Change directory to the location where you saved the MC package.

4. Install MC using your local Linux distribution package management system (for example, rpm, yum, zypper, apt, dpkg). The following command is a generic example for Red Hat 6:

# rpm -Uvh vertica-console-<current-version>.x86_64.RHEL6.rpm

The following command is a generic example for Debian and Ubuntu:

# dpkg -i vertica-console-<current-version>.deb
5. If you stopped the database before upgrading MC, start the database again. As the root user, use the following command:

/etc/init.d/verticad start

6. Open a browser and enter the IP address or host name of the server on which you installed MC, along with the default MC port 5450. For example, enter one of:

https://xx.xx.xx.xx:5450/
https://hostname:5450/

7. When the Configuration Wizard dialog box appears, proceed to Configuring MC.

Upgrading Client Authentication in Vertica

Vertica 7.1.0 changed the storage location for client authentication records from the vertica.conf file to the database catalog. When you upgrade to Vertica 7.1.1, the client authentication records in the vertica.conf file are converted and inserted into the database catalog. Vertica updates the catalog information on all nodes in the cluster.

Authentication is not enabled after upgrading. As a result, all users can connect to the database; however, if they have a password, they must enter it. Take the following steps to make sure that client authentication is configured correctly and enabled for use with a running database:

1. Review the client authentication methods that Vertica created during the upgrade. The following system tables contain information about those methods:

• CLIENT_AUTH — Contains information about the client authentication methods that Vertica created for your database during the upgrade.

• CLIENT_AUTH_PARAMS — Contains information about the parameters that Vertica defined for the GSS, Ident, and LDAP authentication methods.
• USER_CLIENT_AUTH — Contains information about which authentication methods are associated with which database users. You associate an authentication method with a user with the GRANT (Authentication) statement.

2. Review the vertica.log file to see which authentication records Vertica was not able to create during the upgrade.

3. Create any required new records using CREATE AUTHENTICATION.

4. After the upgrade, enable all the defined authentication methods. Enter an ALTER AUTHENTICATION statement for each method, as follows:

=> ALTER AUTHENTICATION auth_method_name ENABLE;

5. If you are using LDAP over SSL/TLS, you must define the new parameters:

• tls_reqcert
• tls_cacert

To do so, use ALTER AUTHENTICATION as follows:

=> ALTER AUTHENTICATION Ldap1 SET host='ldaps://abc.dc.com',
   binddn_prefix='CN=', binddn_suffix=',OU=Unit2,DC=dc,DC=com',
   basedn='dc=DC,dc=com', tls_cacert='/home/dc.com.ca.cer',
   starttls='hard', tls_reqcert='never';

6. Create an authentication method (LOCAL TRUST or LOCAL PASSWORD) with a very high priority, such as 10,000. Grant this method to the DBADMIN user, and set the priority using ALTER AUTHENTICATION. For example:

=> CREATE AUTHENTICATION dbadmin_default TRUST LOCAL;
=> ALTER AUTHENTICATION dbadmin_default PRIORITY 10000;

With the high priority, this new authentication method supersedes any authentication methods you create for PUBLIC. Even if you make changes to PUBLIC authentication methods, the DBADMIN user can connect to the database at any time.
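For instance, a minimal sketch of the review in step 1 and the enablement in step 4 above; the method name ldap1 is a placeholder:

=> SELECT * FROM client_auth;         -- methods created during the upgrade
=> SELECT * FROM client_auth_params;  -- parameters for GSS, Ident, and LDAP methods
=> SELECT * FROM user_client_auth;    -- user-to-method associations
=> ALTER AUTHENTICATION ldap1 ENABLE; -- enable each method you keep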
Uninstalling Vertica

You can uninstall Vertica and Management Console by running commands at the command line.

Uninstalling Vertica

To uninstall Vertica:

1. For each host in the cluster, do the following (a scripted sketch follows this procedure):

a. Choose a host machine and log in as root (or log in as another user and switch to root).

$ su - root
password: root-password

b. Find the name of the package that is installed:

# rpm -qa | grep vertica

For deb packages:

# dpkg -l | grep vertica

c. Remove the package:

# rpm -e package

For deb packages:

# dpkg -r package

Note: If you want to delete the configuration file used with your installation, you can delete the /opt/vertica/ directory and all subdirectories using this command:

# rm -rf /opt/vertica/

2. For each client system, do the following:
a. Delete the JDBC driver jar file.

b. Delete ODBC driver data source names.

c. Delete the ODBC driver software by doing the following:

i. In Windows, go to Start > Control Panel > Add or Remove Programs.
ii. Locate Vertica.
iii. Click Remove.
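To repeat step 1 across a whole cluster, a sketch such as the following can help; it assumes passwordless SSH as root, uses placeholder host names, and removes every package whose name matches "vertica" (including vertica-console, which may not be what you want):

$ for host in host01 host02 host03; do
      ssh root@$host 'rpm -e $(rpm -qa | grep vertica)'
  done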
Uninstalling MC

The uninstall command shuts down Management Console and removes most of the files that the MC installation script installed.

Uninstall MC

1. Log in to the target server as root.

2. Stop Management Console:

# /etc/init.d/vertica-consoled stop

For versions of Red Hat 7/CentOS 7 and above, use:

# systemctl stop vertica-consoled

3. Look for previously installed versions of MC and note the version:

# rpm -qa | grep vertica

4. Remove the package:

# rpm -e <vertica-console>

Note: If you want to delete the MC directory and all subdirectories, use the following command:

# rm -rf /opt/vconsole

If You Want to Reinstall MC

To reinstall MC, see Installing and Configuring Management Console.

Troubleshooting the Vertica Install

The tasks described in this section are performed automatically by the install_vertica script and are described in Installing Vertica. If you did not encounter any installation problems, proceed to the Administrator's Guide for instructions on how to configure and operate a database.
Validation Scripts

Vertica provides several validation utilities that you can use before deploying Vertica to help determine whether your hosts and network can properly handle the processing and network traffic that Vertica requires. You can also use these utilities if you encounter performance issues and need to troubleshoot. After you install the Vertica RPM, the following scripts are available in /opt/vertica/bin:

• vcpuperf — a CPU performance test used to verify your CPU performance.
• vioperf — an input/output test used to verify the speed and consistency of your hard drives.
• vnetperf — a network test used to measure the latency and throughput of your network between hosts.

These utilities can be run at any time, but they are well suited to use before running the install_vertica script.

Vcpuperf

The vcpuperf utility measures your server's CPU processing speed and compares it against benchmarks for common server CPUs. The utility performs a CPU test and measures the time it takes to complete the test. The lower the number scored on the test, the better the performance of the CPU.

The vcpuperf utility also checks the high and low load times to determine whether CPU throttling is enabled. If a server's low-load computation time is significantly longer than the high-load computation time, CPU throttling may be enabled. CPU throttling is a power-saving feature, but it can reduce the performance of your server. Vertica recommends disabling CPU throttling to enhance server performance.

Syntax

vcpuperf [-q]

Option    Description
-q        Run in quiet mode. Quiet mode displays only the CPU Time, Real Time, and high and low load times.
Returns

• CPU Time: the amount of time it took the CPU to run the test.
• Real Time: the total time for the test to execute.
• High load time: the amount of time to run the load test while simulating a high CPU load.
• Low load time: the amount of time to run the load test while simulating a low CPU load.

Example

The following example shows a CPU that is running slightly slower than the expected time on a Xeon 5670 CPU that has CPU throttling enabled.

[root@node1 bin]# /opt/vertica/bin/vcpuperf
Compiled with: 4.1.2 20080704 (Red Hat 4.1.2-52)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s

This machine's time:
  CPU Time: 8.540000s
  Real Time: 8.710000s

Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the
corresponding high load time indicate low-load throttling, which can
adversely affect small query / concurrent performance.

This machine's high load time: 67 microseconds.
This machine's low load time: 208 microseconds.

Vioperf

The vioperf utility quickly tests the performance of your host's input and output subsystem. The utility performs the following tests:

• sequential write
• sequential rewrite
• sequential read
• skip read (read non-contiguous data blocks)

The utility verifies that the host reads the same bytes that it wrote and prints its output to STDOUT. The utility also logs the output to a JSON-formatted file.
Syntax

vioperf [--help] [--duration=<INTERVAL>] [--log-interval=<INTERVAL>]
        [--log-file=<FILE>] [--condense-log] [<DIR>*]

Minimum and Recommended I/O Performance

• The minimum required I/O is 20 MB/s read/write per physical processor core on each node, in full duplex (reading and writing) simultaneously, concurrently on all nodes of the cluster.
• The recommended I/O is 40 MB/s per physical core on each node.

For example, the I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s required minimum, 480 MB/s recommended.

Options

Option           Description
--help           Prints a help message and exits.
--duration       The length of time vioperf runs performance tests. The default is 5 minutes. Specify the interval in seconds, minutes, or hours with any of these suffixes:
                 • Seconds: s, sec, secs, second, seconds. Example: --duration=60sec
                 • Minutes: m, min, mins, minute, minutes. Example: --duration=10min
                 • Hours: h, hr, hrs, hour, hours. Example: --duration=1hrs
--log-interval   The interval at which the log file reports summary information. The default interval is 10 seconds. This option uses the same interval notation as --duration.
--log-file       The path and name where log file contents are written, in JSON. If not specified, vioperf creates a file named results<date-time>.JSON in the current directory.
--condense-log   Directs vioperf to write the log file contents in condensed format, one JSON entry per line, rather than as indented JSON syntax.
<DIR>            Zero or more directories to test. If you do not specify a directory, vioperf tests the current directory. To test the performance of each disk, specify different directories mounted on different disks.

Returns

The utility returns the following information:

Heading                          Description
test                             The test being run (Write, ReWrite, Read, or Skip Read).
directory                        The directory in which the test is being run.
counter name                     The counter type of the test being run. Can be either MB/s or Seeks per second.
counter value                    The value of the counter in MB/s or Seeks per second across all threads. This measurement represents the bandwidth at the exact time of measurement. Contrast with counter value (avg).
counter value (10 sec avg)       The average amount of data in MB/s, or the average number of Seeks per second, for the test being run in the duration specified with --log-interval. The default interval is 10 seconds. The counter value (avg) is the average bandwidth since the last log message, across all threads.
counter value/core               The counter value divided by the number of cores.
counter value/core (10 sec avg)  The counter value (10 sec avg) divided by the number of cores.
thread count                     The number of threads used to run the test.
%CPU                             The available CPU percentage used during this test.
%IO Wait                         The CPU percentage in I/O Wait state during this test. I/O wait state is the time working processes are blocked while waiting for I/O operations to complete.
elapsed time                     The amount of time taken for a particular test. If you run the test multiple times, elapsed time increases the next time the test is run.
remaining time                   The time remaining until the next test. Based on the --duration option, each of the tests is run at least once. If the test set is run multiple times, remaining time is how much longer the test will run. The remaining time value is cumulative; its total is added to elapsed time each time the same test is run again.

Example

Invoking vioperf from a terminal outputs the following message and sample results:

[dbadmin@node01 ~]$ /opt/vertica/bin/vioperf --duration=60s
The minimum required I/O is 20 MB/s read and write per physical processor core
on each node, in full duplex i.e. reading and writing at this rate
simultaneously, concurrently on all nodes of the cluster. The recommended I/O
is 40 MB/s per physical core on each node. For example, the I/O rate for a
server node with 2 hyper-threaded six-core CPUs is 240 MB/s required minimum,
480 MB/s recommended.

Using direct io (buffer size=1048576, alignment=512) for directory "/home/dbadmin"

test     | directory     | counter name         | counter value | counter value (10 sec avg) | counter value/core | counter value/core (10 sec avg) | thread count | %CPU | %IO Wait | elapsed time (s) | remaining time (s)
---------+---------------+----------------------+---------------+----------------------------+--------------------+---------------------------------+--------------+------+----------+------------------+-------------------
Write    | /home/dbadmin | MB/s                 | 420           | 420                        | 210                | 210                             | 2            | 89   | 10       | 10               | 5
Write    | /home/dbadmin | MB/s                 | 412           | 396                        | 206                | 198                             | 2            | 89   | 9        | 15               | 0
ReWrite  | /home/dbadmin | (MB-read+MB-write)/s | 150+150       | 150+150                    | 75+75              | 75+75                           | 2            | 58   | 40       | 10               | 5
ReWrite  | /home/dbadmin | (MB-read+MB-write)/s | 158+158       | 172+172                    | 79+79              | 86+86                           | 2            | 64   | 33       | 15               | 0
Read     | /home/dbadmin | MB/s                 | 194           | 194                        | 97                 | 97                              | 2            | 69   | 26       | 10               | 5
Read     | /home/dbadmin | MB/s                 | 192           | 190                        | 96                 | 95                              | 2            | 71   | 27       | 15               | 0
SkipRead | /home/dbadmin | seeks/s              | 659           | 659                        | 329.5              | 329.5                           | 2            | 2    | 85       | 10               | 5
SkipRead | /home/dbadmin | seeks/s              | 677           | 714                        | 338.5              | 357                             | 2            | 2    | 59       | 15               | 0

Note: When evaluating performance against the minimum and recommended I/O, include the Write and Read values in your evaluation. ReWrite and SkipRead values are not relevant to determining minimum and recommended I/O.
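To test several disks in one run, you can pass multiple directories; a sketch using the options documented above, assuming /data1 and /data2 are mount points on different disks:

$ /opt/vertica/bin/vioperf --duration=10min --log-file=/tmp/vioperf.json /data1 /data2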
Vnetperf

The vnetperf utility measures the network performance of your hosts. It can measure network latency and throughput for both the TCP and UDP protocols.

Important: This utility introduces a high network load and must not be used on a running Vertica cluster, or database performance will be degraded.

Using this utility you can detect:

• whether throughput is low for all hosts or a particular host,
• whether latency is high for all hosts or a particular host,
• bottlenecks between one or more hosts or subnets,
• a limit that is too low on the number of TCP connections that can be established simultaneously,
• and a high rate of packet loss on the network.

The latency test measures the latency from the host running the script to the other hosts. Any host that has a particularly high latency should be investigated further. The throughput tests measure both UDP and TCP throughput. You can specify a rate limit in MB/s for these tests, or allow the utility to test a range of rates.

Syntax

vnetperf [options] [tests]

Recommended Network Performance

• The maximum recommended RTT (round-trip time) latency is 1000 microseconds. The ideal RTT latency is 200 microseconds or less. Vertica recommends that clock skew be kept to under 1 second.
• The minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more.

Note: UDP numbers may be lower; multiple network switches may reduce performance results.

Options

Option                    Description
--condense                Condense the log into one JSON entry per line, instead of indented JSON syntax.
--collect-logs            Collect the test log files from each host.
--datarate rate           Limit the throughput to this rate in MB/s. A rate of 0 loops the tests through several different rates. The default is 0.
--duration seconds        The time limit for each test to run in seconds. The default is 1.
--hosts host1,host2,...   A comma-separated list of hosts on which to run the tests. Do not use spaces between the commas and the host names.
--hosts file              A hosts file that specifies the hosts on which to run the tests. If the --hosts argument is not used, the utility attempts to access admintools and determine the hosts in the cluster.
--identity-file file      If using passwordless SSH/SCP access between the hosts, specify the key file used to gain access to the hosts.
--ignore-bad-hosts        If set, run the tests on the reachable hosts even if some hosts are not reachable. If not set, and a host is unreachable, no tests are run on any hosts.
--log-dir directory       If --collect-logs is set, the directory in which to place the collected logs. The default directory is named logs.netperf.<timestamp>.
--log-level LEVEL         The log level to use. Possible values are: INFO, ERROR, DEBUG, and WARN. The default is WARN.
--list-tests              Lists the tests that this utility can run.
--output-file file        The file to which JSON results are written. The default is results.<timestamp>.json.
--ports port1,port2,port3 The port numbers to use. If only one is specified, the next two numbers in sequence are also used. The default ports are 14159, 14160, and 14161.
--scp-options 'options'   Specify one or more standard SCP command-line arguments enclosed in single quotes. SCP is used to copy test binaries over to the target hosts.
--ssh-options 'options'   Specify one or more standard SSH command-line arguments enclosed in single quotes. SSH is used to issue test commands on the target hosts.
--vertica-install directory  If specified, the utility assumes Vertica is installed on each of the hosts and uses the test binaries on the target systems rather than copying them over using SCP.

Tests

Note: If the tests argument is omitted, all tests are run.

Test              Description
latency           Tests the latency to each of the hosts.
tcp-throughput    Tests the TCP throughput among the hosts.
udp-throughput    Tests the UDP throughput among the hosts.
Returns

For each host, the utility returns the following.

The latency test returns:

• The round-trip time (RTT) latency for each host in milliseconds.
• Clock skew: the difference in time shown by the clock on the target host relative to the host running the utility.

The UDP and TCP throughput tests return:

• The date/time and test name.
• The rate limit in MB/s.
• The node being tested.
• Sent and received data in MB/s and bytes.
• The duration of the test in seconds.

Example

/opt/vertica/bin/vnetperf --condense --hosts 10.20.100.66,10.20.100.67 --identity-file '/root/.ssh/vid_rsa'

Enable Secure Shell (SSH) Logins

The administrative account must be able to use Secure Shell (SSH) to log in to all hosts without specifying a password. The install_vertica shell script does this automatically. This section describes how to do it manually, if necessary.

1. If you do not already have SSH installed on all hosts, log in as root on each host and install it now. You can download a free version of the SSH connectivity tools from OpenSSH.

2. Log in to the Vertica administrator account (dbadmin in this example).

3. Make your home directory (~) writable only by yourself. Choose one of:

$ chmod 700 ~
or

$ chmod 755 ~

where:

700 includes:
  400 read by owner
  200 write by owner
  100 execute by owner

755 includes:
  400 read by owner
  200 write by owner
  100 execute by owner
  040 read by group
  010 execute by group
  004 read by anybody (other)
  001 execute by anybody

4. Change to your home directory:

$ cd ~

5. Generate a private key/public key pair:

$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/dbadmin/.ssh/id_rsa):
Created directory '/home/dbadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/dbadmin/.ssh/id_rsa.
Your public key has been saved in /home/dbadmin/.ssh/id_rsa.pub.

6. Make your .ssh directory readable and writable only by yourself:

$ chmod 700 ~/.ssh

7. Change to the .ssh directory:
$ cd ~/.ssh

8. Copy the file id_rsa.pub onto the file authorized_keys2:

$ cp id_rsa.pub authorized_keys2

9. Make the files in your .ssh directory readable and writable only by yourself:

$ chmod 600 ~/.ssh/*

10. For each cluster host:

$ scp -r ~/.ssh <host>:.

11. Connect to each cluster host. The first time you ssh to a new remote machine, you might get a message similar to the following:

$ ssh dev0
Warning: Permanently added 'dev0,192.168.1.92' (RSA) to the list of known hosts.

This message appears only the first time you ssh to a particular remote host.

See Also

• OpenSSH
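Step 10 can be scripted for a larger cluster; a sketch, assuming password-based SSH is still enabled at this point and using placeholder host names:

$ for host in host02 host03 host04; do
      scp -r ~/.ssh $host:.
  done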
Migration Guide for Red Hat 7/CentOS 7

If you installed Vertica 7.0 on a Red Hat 6/CentOS 6 system, you might want to transfer your existing data to a Red Hat 7/CentOS 7 environment. Be aware that a direct upgrade of the current operating system with Vertica 7.0 installed is not supported. Transferring your existing data to a Red Hat 7/CentOS 7 environment requires migration.

The only supported method of migration is from a Red Hat 6/CentOS 6 system running Vertica 7.0 to a Red Hat 7/CentOS 7 system running Vertica 7.0. To perform this migration, you must complete two steps:

1. Create a new, empty database on a cluster of hosts or a single host running Red Hat 7/CentOS 7. Vertica 7.0 must be installed on any new host.

2. Transfer your data from the current cluster, running Red Hat 6/CentOS 6, to the new single host or cluster of hosts, running Red Hat 7/CentOS 7.

[Diagram: high-level overview of the migration process]
When you install Vertica 7.0 on a Red Hat 7/CentOS 7 target host, follow the instructions in Installing Vertica, which provides all necessary configurations for installing Vertica on Red Hat 7/CentOS 7.

You can use any of three methods to transfer your current database data to a Red Hat 7/CentOS 7 environment. Before you begin the migration process, review each of the three available options. Each method has its own benefits and constraints; choose the one that best suits your needs. The three methods are:
• Copying the Database to Another Cluster (also known as Copycluster)
• Copying and Exporting Data (also known as Import/Export)
• Backing Up and Restoring the Database

For a comprehensive guide to major changes and migration considerations in Red Hat 7/CentOS 7, see the Red Hat 7 Migration Planning Guide on the Red Hat website.
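As an illustration of the Import/Export method, the following is a minimal sketch; the database name, host, password, and table are placeholders, and your own objects and network details will differ:

=> CONNECT TO VERTICA vmart_new USER dbadmin PASSWORD 'secret' ON 'rh7host01',5433;
=> EXPORT TO VERTICA vmart_new.public.customer_dimension FROM public.customer_dimension;
=> DISCONNECT vmart_new;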
Appendix: Time Zones

Using Time Zones With Vertica

Vertica uses the TZ environment variable on each node, if it has been set, for the default current time zone. Otherwise, Vertica uses the operating system time zone. The TZ variable can be set by the operating system during login (see /etc/profile, /etc/profile.d, or /etc/bashrc) or by the user in .profile, .bashrc, or .bash_profile. TZ must be set to the same value on each node when you start Vertica.
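For example, a minimal sketch of setting TZ in the dbadmin user's profile; which file to use depends on your shell and distribution:

# In ~/.profile or ~/.bash_profile (choice is an assumption):
export TZ='America/New_York'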
The following command returns the current time zone for your database:

=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

You can also use the SET TIMEZONE TO { value | 'value' } command to set the time zone for a single session.

There is no database default time zone; instead, TIMESTAMP WITH TIMEZONE (TIMESTAMPTZ) data is stored in GMT (UTC) by converting data from the current local time zone to GMT. When TIMESTAMPTZ data is used, data is converted back to the current local time zone, which might be different from the local time zone where the data was stored. This conversion accounts for Daylight Saving Time (Summer Time), if applicable, based on the year and date, to determine when the Daylight Saving Time change occurred.

TIMESTAMP WITHOUT TIMEZONE data stores the timestamp as given and retrieves it exactly as given. The current time zone is ignored. The same is true for TIME WITHOUT TIMEZONE. For TIME WITH TIMEZONE (TIMETZ), however, the current time zone setting is stored along with the given time, and that time zone is used on retrieval.

Note: HPE recommends that you use TIMESTAMPTZ, not TIMETZ.

TIMESTAMPTZ uses the current time zone on both input and output, as in the following example:

=> CREATE TEMP TABLE s (tstz TIMESTAMPTZ);
=> SET TIMEZONE TO 'America/New_York';
=> INSERT INTO s VALUES ('2009-02-01 00:00:00');
=> INSERT INTO s VALUES ('2009-05-12 12:00:00');
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS 'America/New_York',
   tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
     Local timezone     |  America/New_York   |         GMT
------------------------+---------------------+---------------------
 2009-02-01 00:00:00-05 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
 2009-05-12 12:00:00-04 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)

The -05 in the Local timezone column above shows that the data is displayed in EST, while -04 indicates EDT. The other two columns show the TIMESTAMP WITHOUT TIMEZONE at the specified time zone.

The next example illustrates what occurs if the current time zone is changed to, for example, Greenwich Mean Time:

=> SET TIMEZONE TO 'GMT';
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS 'America/New_York',
   tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
     Local timezone     |  America/New_York   |         GMT
------------------------+---------------------+---------------------
 2009-02-01 05:00:00+00 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
 2009-05-12 16:00:00+00 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)

The +00 in the Local timezone column above indicates that TIMESTAMPTZ is displayed in GMT.

The approach of using TIMESTAMPTZ fields to record events captures the GMT of the event, as expressed in terms of the local time zone. Later, it allows for easy conversion to any other time zone, either by setting the local time zone or by specifying an explicit AT TIMEZONE clause.

The following example shows how TIMESTAMP WITHOUT TIMEZONE fields work in Vertica:

=> CREATE TEMP TABLE tnoz (ts TIMESTAMP);
=> INSERT INTO tnoz VALUES ('2009-02-01 00:00:00');
=> INSERT INTO tnoz VALUES ('2009-05-12 12:00:00');
=> SET TIMEZONE TO 'GMT';
=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS 'America/New_York',
   ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
     No timezone     |    America/New_York    |          GMT
---------------------+------------------------+------------------------
 2009-02-01 00:00:00 | 2009-02-01 05:00:00+00 | 2009-02-01 00:00:00+00
 2009-05-12 12:00:00 | 2009-05-12 16:00:00+00 | 2009-05-12 12:00:00+00
(2 rows)

The +00 at the end of a timestamp indicates that the setting is TIMESTAMP WITH TIMEZONE in GMT (the current time zone).
The 'America/New_York' column shows what the 'GMT' setting was when you recorded the time, assuming you read a normal clock in the 'America/New_York' time zone. This shows that if it is midnight in the 'America/New_York' time zone, then it is 5 a.m. GMT.

Note: 00:00:00 Sunday February 1, 2009 in America/New_York converts to 05:00:00 Sunday February 1, 2009 in GMT.

The 'GMT' column displays the GMT time, assuming the input data was captured in GMT.

If you don't set the time zone to GMT, and you use another time zone, for example 'America/New_York', then the results display in 'America/New_York' with a -05 and -04, showing the difference between that time zone and GMT:

=> SET TIMEZONE TO 'America/New_York';
=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS 'America/New_York',
   ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
     No timezone     |    America/New_York    |          GMT
---------------------+------------------------+------------------------
 2009-02-01 00:00:00 | 2009-02-01 00:00:00-05 | 2009-01-31 19:00:00-05
 2009-05-12 12:00:00 | 2009-05-12 12:00:00-04 | 2009-05-12 08:00:00-04
(2 rows)

In this case, the last column is interesting in that it returns the time in New York, given that the data was captured in 'GMT'.

See Also

• TZ Environment Variable
• SET TIME ZONE
• Date/Time Data Types

Africa

Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Tunis
Africa/Windhoek

America

America/Adak America/Atka US/Aleutian
America/Anchorage SystemV/YST9YDT US/Alaska
America/Anguilla
America/Antigua
America/Araguaina
America/Aruba
America/Asuncion
America/Bahia
America/Barbados
America/Belem
America/Belize
America/Boa_Vista
America/Bogota
America/Boise
America/Buenos_Aires
America/Cambridge_Bay
America/Campo_Grande
America/Cancun
America/Caracas
America/Catamarca
America/Cayenne
America/Cayman
America/Chicago CST6CDT SystemV/CST6CDT US/Central
America/Chihuahua
America/Cordoba America/Rosario
America/Costa_Rica
America/Cuiaba
America/Curacao
America/Danmarkshavn
America/Dawson
America/Dawson_Creek
America/Denver MST7MDT SystemV/MST7MDT US/Mountain
America/Shiprock Navajo
America/Detroit US/Michigan
America/Dominica
America/Edmonton Canada/Mountain
America/Eirunepe
America/El_Salvador
America/Ensenada America/Tijuana Mexico/BajaNorte
America/Fortaleza
America/Glace_Bay
America/Godthab
America/Goose_Bay
America/Grand_Turk
America/Grenada
America/Guadeloupe
America/Guatemala
America/Guayaquil
America/Guyana
America/Halifax Canada/Atlantic SystemV/AST4ADT
America/Havana Cuba
America/Hermosillo
America/Indiana/Indianapolis America/Indianapolis America/Fort_Wayne EST SystemV/EST5 US/East-Indiana
America/Indiana/Knox America/Knox_IN US/Indiana-Starke
America/Indiana/Marengo
America/Indiana/Vevay
America/Inuvik
America/Iqaluit
America/Jamaica Jamaica
America/Jujuy
America/Juneau
America/Kentucky/Louisville America/Louisville
America/Kentucky/Monticello
America/La_Paz
America/Lima
America/Los_Angeles PST8PDT SystemV/PST8PDT US/Pacific US/Pacific-New
America/Maceio
America/Managua
America/Manaus Brazil/West
America/Martinique
America/Mazatlan Mexico/BajaSur
America/Mendoza
America/Menominee
America/Merida
America/Mexico_City Mexico/General
America/Miquelon
America/Monterrey
America/Montevideo
America/Montreal
America/Montserrat
America/Nassau
America/New_York EST5EDT SystemV/EST5EDT US/Eastern
America/Nipigon
America/Nome
America/Noronha Brazil/DeNoronha
America/North_Dakota/Center
America/Panama
America/Pangnirtung
America/Paramaribo
America/Phoenix MST SystemV/MST7 US/Arizona
America/Port-au-Prince
America/Port_of_Spain
America/Porto_Acre America/Rio_Branco Brazil/Acre
America/Porto_Velho
America/Puerto_Rico SystemV/AST4
America/Rainy_River
America/Rankin_Inlet
America/Recife
America/Regina Canada/East-Saskatchewan Canada/Saskatchewan SystemV/CST6
America/Santiago Chile/Continental
America/Santo_Domingo
America/Sao_Paulo Brazil/East
America/Scoresbysund
America/St_Johns Canada/Newfoundland
America/St_Kitts
America/St_Lucia
America/St_Thomas America/Virgin
America/St_Vincent
America/Swift_Current
America/Tegucigalpa
America/Thule
America/Thunder_Bay
America/Toronto Canada/Eastern
America/Tortola
America/Vancouver Canada/Pacific
America/Whitehorse Canada/Yukon
America/Winnipeg Canada/Central
America/Yakutat
America/Yellowknife

Antarctica

Antarctica/Casey
Antarctica/Davis
Antarctica/DumontDUrville
Antarctica/Mawson
Antarctica/McMurdo Antarctica/South_Pole
Antarctica/Palmer
Antarctica/Rothera
Antarctica/Syowa
Antarctica/Vostok
Asia/Dubai
Asia/Dushanbe
Asia/Gaza
Asia/Harbin
Asia/Hong_Kong Hongkong
Asia/Hovd
Asia/Irkutsk
Asia/Jakarta
Asia/Jayapura
Asia/Jerusalem Asia/Tel_Aviv Israel
Asia/Kabul
Asia/Kamchatka
Asia/Karachi
Asia/Kashgar
Asia/Katmandu
Asia/Krasnoyarsk
Asia/Kuala_Lumpur
Asia/Kuching
Asia/Kuwait
Asia/Macao Asia/Macau
Asia/Magadan
Asia/Makassar Asia/Ujung_Pandang
Asia/Manila
Asia/Muscat
Asia/Nicosia Europe/Nicosia
Asia/Novosibirsk
Asia/Omsk
Asia/Oral
Asia/Phnom_Penh
Asia/Pontianak
Asia/Pyongyang
Asia/Qatar
Asia/Qyzylorda
Asia/Rangoon
Asia/Riyadh
Asia/Riyadh87 Mideast/Riyadh87
Asia/Riyadh88 Mideast/Riyadh88
Asia/Riyadh89 Mideast/Riyadh89
Asia/Saigon
Asia/Sakhalin
Asia/Samarkand
Asia/Seoul ROK
Asia/Shanghai PRC
Asia/Singapore Singapore
Asia/Taipei ROC
Asia/Tashkent
Asia/Tbilisi
Asia/Tehran Iran
Asia/Thimbu Asia/Thimphu
Asia/Tokyo Japan
Asia/Ulaanbaatar Asia/Ulan_Bator
Asia/Urumqi
Asia/Vientiane
Asia/Vladivostok
Asia/Yakutsk
Asia/Yekaterinburg
Asia/Yerevan

Atlantic

Atlantic/Azores
Atlantic/Bermuda
Atlantic/Canary
Atlantic/Cape_Verde
Atlantic/Faeroe
Atlantic/Madeira
Atlantic/Reykjavik Iceland
Atlantic/South_Georgia
Atlantic/St_Helena
Atlantic/Stanley
Europe/Lisbon Portugal
Europe/London GB GB-Eire
Europe/Luxembourg
Europe/Madrid
Europe/Malta
Europe/Minsk
Europe/Monaco
Europe/Moscow W-SU
Europe/Oslo Arctic/Longyearbyen Atlantic/Jan_Mayen
Europe/Paris
Europe/Prague Europe/Bratislava
Europe/Riga
Europe/Rome Europe/San_Marino Europe/Vatican
Europe/Samara
Europe/Simferopol
Europe/Sofia
Europe/Stockholm
Europe/Tallinn
Europe/Tirane
Europe/Uzhgorod
Europe/Vaduz
Chile/EasterIsland
Pacific/Efate
Pacific/Enderbury
Pacific/Fakaofo
Pacific/Fiji
Pacific/Funafuti
Pacific/Galapagos
Pacific/Gambier SystemV/YST9
Pacific/Guadalcanal
Pacific/Guam
Pacific/Honolulu HST SystemV/HST10 US/Hawaii
Pacific/Johnston
Pacific/Kiritimati
Pacific/Kosrae
Pacific/Kwajalein Kwajalein
Pacific/Majuro
Pacific/Marquesas
Pacific/Midway
Pacific/Nauru
Pacific/Niue
Pacific/Norfolk
Pacific/Noumea
Pacific/Pago_Pago
Getting Started
Using This Guide

Getting Started shows how to set up a Vertica database and run simple queries that perform common database tasks.

Who Should Use This Guide?

Getting Started targets anyone who wants to learn how to create and run a Vertica database. This guide requires no special knowledge, although a rudimentary knowledge of basic SQL commands is useful when you begin to run queries.

What You Need

The examples in this guide require one of the following:

• Vertica installed on one host or a cluster of hosts. Vertica recommends a minimum of three hosts in the cluster.
• Vertica installed on a virtual machine (VM).

For further instructions about installation, see Installing Vertica.

Accessing Your Database

You access your database with an SSH client or the terminal utility in your Linux console, using a tool such as vsql. Throughout this guide, you use the following user interfaces:

• Linux command line (shell) interface
• Vertica Administration Tools
• vsql client interface
• Vertica Management Console

Downloading and Starting the Virtual Machine

Vertica is available as a virtual machine (VM) that is pre-installed on a 64-bit Red Hat Enterprise Linux image and comes with a license for 500 GB of data storage. The VM image is preconfigured with the following hardware settings:
• 1 CPU
• 1024 MB RAM
• 50 GB hard disk (SCSI, not preallocated, single-file storage)
• Bridged networking

Downloading a VM

The Vertica VM is available both as an OVF template (for VMware vSphere 4.0) and as a VMDK file (for VMware Server 2.0 and VMware Workstation 7.0). Download and install the appropriate file for your VMware deployment from the myVertica portal at https://my.vertica.com/downloads (registration required).

Starting the VM

1. Open the appropriate Vertica VM image file in VMware. For example, open the VMX file if you are using VMware Workstation, or the OVF template if you are using VMware vSphere.

2. Navigate to the settings for the VM image and adjust the network settings so that they are compatible with your VM.

3. Start the VM. For example, in VMware Workstation, select VM > Power > Power On.

Checking for Vertica Updates

The VM image might not include the latest available Vertica release. After you install and start your VM, verify the version of Vertica with the following command:

$ rpm -qa | grep vertica

The RPM package name that the command returns contains the version and build numbers. If there is a later version of Vertica, download it from the myVertica portal at https://my.vertica.com/downloads (registration required). Upgrade instructions are provided in Installing Vertica.

Types of Database Users

Every Vertica database has one or more users. When users connect to a database, they must log on with valid credentials (username and password) that a database administrator defines.
Database users own the objects they create in a database, such as tables, procedures, and storage locations. By default, all users have the right to create temporary tables in a database. In a Vertica database, there are three types of users:

• Database administrator (dbadmin)
• Object owner
• Everyone else (PUBLIC)

dbadmin User

When you create a new database, a single database administrator account, dbadmin, is automatically created along with a PUBLIC role. The database administrator bypasses all permission checks, including GRANT/REVOKE authorizations, and has the authority to perform all database operations, as does any user granted the PSEUDOSUPERUSER role.

Note: Although the dbadmin user has the same name as the Linux database administrator account, do not confuse the concept of a database administrator with Linux superuser (root) privilege; they are not the same. A database administrator cannot have Linux superuser privileges.

Object Owner

An object owner is the user who creates a particular database object; the owner can perform any operation on that object. By default, only an owner or a database administrator can act on a database object. To allow other users to use an object, the owner or database administrator must grant privileges to those users using one of the GRANT statements. Object owners are PUBLIC users for objects that other users own.

PUBLIC User

All non-administrators and non-object owners are PUBLIC users. Newly created users do not have access to schema PUBLIC by default. Make sure to GRANT USAGE ON SCHEMA PUBLIC to all users you create.
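For example, a minimal sketch; the user name and password are placeholders:

=> CREATE USER alice IDENTIFIED BY 'choose-a-password';
=> GRANT USAGE ON SCHEMA public TO alice;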
Logging in as dbadmin

The first time you boot the VM, a login screen appears. Select the Vertica DBA user, and then enter the default password. Vertica DBA is the full name of the dbadmin user. After you log in, a web page displays further instructions. The default username and password are as follows:

• Username: dbadmin
• Password: password
• Root password: password

Important: The dbadmin user has sudo privileges. Be sure to change the dbadmin and root passwords with the Linux passwd command.

Using the Vertica Interfaces

Vertica provides a set of tools that allows you to perform administrative tasks quickly and easily. The administration tasks in Vertica can be done using the Management Console (MC) or the Administration Tools. The MC provides a unified view of your Vertica cluster through a browser connection, while the Administration Tools are implemented using Dialog, a graphical user interface that works in terminal (character-cell) windows.

Using Management Console

The Management Console provides some, but not all, of the functionality of the Administration Tools. In addition, the MC provides extended functionality not available in the Administration Tools, such as a graphical view of your Vertica database and detailed monitoring charts and graphs. Most of the information you need to use MC is available on the MC interface.

For installation instructions, see Installing and Configuring Management Console in the Installation Guide. For an introduction to MC functionality, architecture, and security, see Management Console in Vertica Concepts.
Running the Administration Tools

A man page is available for convenient access to Administration Tools details. If you are running as the dbadmin user, type: man admintools. If you are running as a different user, type: man -M /opt/vertica/man admintools.
If possible, always run the Administration Tools using the database administrator account (dbadmin) on the administration host.

The Administration Tools interface responds to mouse clicks in some terminal windows, particularly local Linux windows, but you might find that it responds only to keystrokes. For a quick reference to keystrokes, see Using Keystrokes in the Administration Tools Interface in this guide.

When you run the Administration Tools, the Main Menu dialog box appears with a dark blue background and a title on top. The screen captures used in this documentation set are cropped down to the dialog box itself.

First Time Only

The first time you log in as the database administrator and run the Administration Tools, complete the following steps:

1. In the EULA (end-user license agreement) window, type accept to proceed. A window appears, requesting the location of the license key file you downloaded from the HPE Web site. The default path is /tmp/vlicense.dat.

2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and click OK.

3. To return to the command line, select Exit and click OK.

Using Keystrokes in the Administration Tools Interface

The following table is a quick reference to keystroke usage in the Administration Tools interface. See Using the Administration Tools in the Administrator's Guide for full details.

Key               Action
Return            Run the selected command.
Tab               Cycle between OK, Cancel, Help, and the menu.
Up/Down Arrow     Move the cursor up and down in a menu, window, or help file.
Space             Select an item in a list.
Character         Select the corresponding command from a menu.

Introducing the VMart Example Database

Vertica ships with a sample multi-schema database called the VMart example database, which represents a database that might be used by a large supermarket (VMart) to access information about its products, customers, employees, and online and physical stores. Using this example, you can create, run, optimize, and test a multi-schema database. The VMart database contains the following schemas:

• public (automatically created in any newly created Vertica database)
• store
• online_sales

VMart Database Location and Scripts

If you installed Vertica from the RPM package, the VMart schema is installed in the /opt/vertica/examples/VMart_Schema directory. This folder contains the following script files that you can use to get started quickly. Use the scripts as templates for your own applications.

Script/file name           Description
vmart_count_data.sql       SQL script that counts rows of all example database tables, which you can use to verify the load.
vmart_define_schema.sql    SQL script that defines the logical schema for each table and referential integrity constraints.
vmart_gen.cpp              Data generator source code (C++).
vmart_gen                  Data generator executable file.
vmart_load_data.sql        SQL script that loads the generated sample data to the corresponding tables using COPY DIRECT.
vmart_queries.sql          SQL script that contains concatenated sample queries for use as a training set for the Database Designer.
vmart_query_##.sql         SQL scripts that contain individual queries; for example, vmart_query_01.sql through vmart_query_09.sql.
vmart_schema_drop.sql      SQL script that drops all example database tables.

For more information about the schema, tables, and queries included with the VMart example database, see the Appendix.
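Once a database is running, you can execute any of these scripts with vsql's -f option; a sketch, assuming the default dbadmin password used elsewhere in this guide:

$ cd /opt/vertica/examples/VMart_Schema
$ vsql -U dbadmin -w password -f vmart_count_data.sql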
Installing and Connecting to the VMart Example Database

Follow the steps in this section to create the fully functioning, multi-schema VMart example database that you'll use to run sample queries. The number of example databases you can create within a single Vertica installation is limited only by the disk space available on your system; however, Hewlett Packard Enterprise strongly recommends that you start only one example database at a time to avoid unpredictable results.

Vertica provides two options to install the example database:

• A quick installation that lets you create the example database and start using it immediately. See Quick Installation Using a Script in this guide for details. Use this method to bypass the schema and table creation processes and start querying immediately.

• An advanced-but-simple example database installation using the Administration Tools interface. See Advanced Installation in this guide for details. Use this method to better understand the database creation process and to practice creating schemas and tables and loading data.

Note: Both installation methods create a database named VMart. If you try both installation methods, you must either drop the VMart database you created (see Restoring the Status of Your Host in this guide) or create the subsequent database with a new name. However, Hewlett Packard Enterprise strongly recommends that you start only one example database at a time to avoid unpredictable results.

This tutorial uses Vertica-provided queries, but you can follow the same set of procedures later, when you create your own design and use your own queries file.

After you install the VMart database, the database is already running. Connect to it using the steps in Step 3: Connecting to the Database.

Quick Installation Using a Script

The script you need to perform a quick installation is located in /opt/vertica/sbin and is called install_example. This script creates a database on the default port (5433), generates data, creates the schema and a default superprojection, and loads the data. The folder also contains a delete_example script, which stops and drops the database.

1. In a terminal window, log in as the database administrator:

$ su dbadmin
Password: (your password)

2. Change to the /examples directory:

$ cd /opt/vertica/examples
  • 228. 3. Run the install script: $ /opt/vertica/sbin/install_example VMart After installation, you should see the following: [dbadmin@localhost examples]$ /opt/vertica/sbin/install_example VMart Installing VMart example example database Mon Jul 22 06:57:40 PDT 2013 Creating Database Completed Generating Data. This may take a few minutes. Completed Creating schema Completed Loading 5 million rows of data. Please stand by. Completed Removing generated data files Example data The example database log files, ExampleInstall.txt and ExampleDelete.txt, are written to /opt/vertica/examples/log. To start using your database, continue to Connecting to the Database in this guide. To drop the example database, see Restoring the Status of Your Host in this guide. Advanced Installation To perform an advanced-but-simple installation, set up the VMart example database environment and then create the database using the Administration Tools or Management Console. Note: If you installed the VMart database using the quick installation method, you cannot complete the following steps because the database has already been created. To try the advanced installation, drop the example database (see Restoring the Status of Your Host on this guide) and perform the advanced Installation, or create a new example database with a different name. However, Hewlett Packard Enterprise strongly recommends that you install only one example database at a time to avoid unpredictable results. The advanced installation requires the following steps: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 228 of 5309
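Once you connect to the database (see Connecting to the Database), you can optionally spot-check the load from vsql. For example, counting one of the large fact tables should report the five million rows mentioned in the installer output; the bundled vmart_count_data.sql script runs similar counts against every table (output shown for the default data set):
VMart=> SELECT COUNT(*) FROM store.store_sales_fact;
  COUNT
---------
 5000000
(1 row)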
  • 229. l Step 1: Setting Up the Example Environment l Step 2: Creating the Example Database l Step 3: Connecting to the Database l Step 4: Defining the Database Schema l Step 5: Loading Data Step 1: Setting Up the Example Environment 1. Stop all databases running on the same host on which you plan to install your example database. If you are unsure if other databases are running, run the Administration Tools and select View Cluster State. The State column should show DOWN values on pre-existing databases. If databases are running, click Stop Database in the Main Menu of the Administration Tools interface and click OK. 2. In a terminal window, log in as the database administrator: $ su dbadmin Password: 3. Change to the /VMart_Schema directory. $ cd /opt/vertica/examples/VMart_Schema Do not change directories while following this tutorial. Some steps depend on being in a specific directory. 4. Run the sample data generator. $ ./vmart_gen 5. Let the program run with the default parameters, which you can review in the Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 229 of 5309
  • 230. README file. Using default parameters datadirectory = ./ numfiles = 1 seed = 2 null = ' ' timefile = Time.txt numfactsalesrows = 5000000 numfactorderrows = 300000 numprodkeys = 60000 numstorekeys = 250 numpromokeys = 1000 numvendkeys = 50 numcustkeys = 50000 numempkeys = 10000 numwarehousekeys = 100 numshippingkeys = 100 numonlinepagekeys = 1000 numcallcenterkeys = 200 numfactonlinesalesrows = 5000000 numinventoryfactrows = 300000 gen_load_script = false Data Generated successfully ! 6. If the vmart_gen executable does not work correctly, recompile it as follows, and run the sample data generator script again. $ g++ vmart_gen.cpp -o vmart_gen $ chmod +x vmart_gen $ ./vmart_gen Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 230 of 5309
  • 231. Step 2: Creating the Example Database To create the example database, use the Administration Tools or Management Console, as described in this section. Creating the Example Database Using the Administration Tools In this procedure, you create the example database using the Administration Tools. To use the Management Console, go to the next section. Note: If you have not used Administration Tools before, see Running the Administration Tools in this guide. 1. Run the Administration Tools. $ /opt/vertica/bin/admintools or simply type admintools 2. From the Administration Tools Main Menu, click Configuration Menu and click OK. 3. Click Create Database and click OK. 4. Name the database VMart and click OK. 5. Click OK to bypass the password and click Yes to confirm. There is no need for a database administrator password in this tutorial. When you create a production database, however, always specify an administrator password. Otherwise, the database is permanently set to trust authentication (no passwords). 6. Select the hosts you want to include from your Vertica cluster and click OK. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 231 of 5309
  • 232. This example creates the database on a one-host cluster. Hewlett Packard Enterprise recommends a minimum of three hosts in the cluster. If you are using the Vertica Community Edition, you are limited to three nodes. 7. Click OK to select the default paths for the data and catalog directories. n Catalog and data paths must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions could result in database creation failure. n When you create a production database, you’ll likely specify other locations than the default. See Prepare Disk Storage Locations in the Administrator’s Guide for more information. 8. Since this tutorial uses a one-host cluster, a K-safety warning appears. Click OK. 9. Click Yes to create the database. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 232 of 5309
  • 233. During database creation, Vertica automatically creates a set of node definitions based on the database name and the names of the hosts you selected and returns a success message. 10. Click OK to close the Database VMart created successfully message. Creating the Example Database Using Management Console In this procedure, you create the example database using Management Console. To use the Administration Tools, follow the steps in the preceding section. Note: To use Management Console, the console should already be installed and you should be familiar with its concepts and layout. See Using Management Console in this guide for a brief overview, or for detailed information, see Management Console in Vertica Concepts and Installing and Configuring Management Console in Installing Vertica. 1. Connect to Management Console and log in. 2. On the Home page, under Manage Information, click Existing Infrastructure to go to the Databases and Clusters page. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 233 of 5309
  • 234. 3. Click to select the appropriate existing cluster and click Create Database. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 234 of 5309
  • 235. 4. Follow the on-screen wizard, which prompts you to provide the following information: n Database name, which must be between 3–25 characters, starting with a letter, and followed by any combination of letters, numbers, or underscores. n (Optional) database administrator password for the database you want to create and connect to. n IP address of a node in your database cluster, typically the IP address of the administration host. 5. Click Next. Step 3: Connecting to the Database Regardless of the installation method you used, follow these steps to connect to the database. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 235 of 5309
  • 236. 1. As dbadmin, run the Administration Tools. $ /opt/vertica/bin/admintools or simply type admintools. 2. If you are already in the Administration Tools, navigate to the Main Menu page. 3. Select Connect to Database and click OK. To configure and load data into the VMart database, complete the following steps: n Step 4: Defining the Database Schema n Step 5: Loading Data If you installed the VMart database using the Quick Installation method, the schema, tables, and data are already defined. You can choose to drop the example database (see Restoring the Status of Your Host in this guide) and perform the Advanced Installation, or continue straight to Querying Data in this guide. Step 4: Defining the Database Schema The VMart database installs with sample scripts that contain SQL commands representing queries that might be used in a real business. The vmart_define_schema.sql script defines the VMart schema and creates the tables. You must run this script before you load data into the VMart database. This script performs the following tasks: l Defines two schemas in the VMart database: online_sales and store. l Defines tables in both schemas. l Defines constraints on those tables. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 236 of 5309
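For illustration only, the DDL in vmart_define_schema.sql takes roughly the following form. This sketch uses the shipping_dimension definition from the Appendix; the actual script defines many more tables and constraints:
-- Minimal sketch of the statements in vmart_define_schema.sql (illustrative only)
CREATE SCHEMA store;
CREATE SCHEMA online_sales;
CREATE TABLE public.shipping_dimension (
    shipping_key INTEGER NOT NULL PRIMARY KEY,
    ship_type CHAR(30),
    ship_mode CHAR(10),
    ship_carrier CHAR(20)
);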
  • 237. VMart=> \i vmart_define_schema.sql CREATE SCHEMA CREATE SCHEMA CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE CREATE TABLE ALTER TABLE CREATE TABLE CREATE TABLE ALTER TABLE CREATE TABLE ALTER TABLE CREATE TABLE CREATE TABLE CREATE TABLE ALTER TABLE Step 5: Loading Data Now that you have created the schemas and tables, you can load data into the tables by running the vmart_load_data.sql script. This script loads data from the 15 .tbl text files in /opt/vertica/examples/VMart_Schema into the tables that vmart_define_schema.sql created. It might take several minutes to load the data on a typical hardware cluster. Check the load status by monitoring the vertica.log file, as described in Monitoring Log Files in the Administrator’s Guide. VMart=> \i vmart_load_data.sql Rows Loaded ------------- 1826 (1 row) Rows Loaded ------------- 60000 (1 row) Rows Loaded ------------- 250 (1 row) Rows Loaded ------------- 1000 (1 row) Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 237 of 5309
  • 238. Rows Loaded ------------- 50 (1 row) Rows Loaded ------------- 50000 (1 row) Rows Loaded ------------- 10000 (1 row) Rows Loaded ------------- 100 (1 row) Rows Loaded ------------- 100 (1 row) Rows Loaded ------------- 1000 (1 row) Rows Loaded ------------- 200 (1 row) Rows Loaded ------------- 5000000 (1 row) Rows Loaded ------------- 300000 (1 row) VMart=> Querying Data The VMart database installs with sample scripts that contain SQL commands that represent queries that might be used in a real business. Use basic SQL commands to query the database, or try out the following command. Once you’re comfortable running the example queries, you might want to write your own. Note: The data that your queries return might differ from the example output shown in this guide because the sample data generator is random. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 238 of 5309
  • 239. Type the following SQL command to return the values for five products with the lowest fat content in the Dairy department. The command selects the fat content from Dairy department products in the product_dimension table in the public schema, orders them from low to high, and limits the output to the first five (the five lowest fat contents). VMart=> SELECT fat_content FROM ( SELECT DISTINCT fat_content FROM product_dimension WHERE department_description IN ('Dairy') ) AS food ORDER BY fat_content LIMIT 5; Your results will be similar to the following. fat_content ------------- 80 81 82 83 84 (5 rows) The preceding example is from the vmart_query_01.sql file. You can execute more sample queries using the scripts that installed with the VMart database or write your own. For a list of the sample queries supplied with Vertica, see the Appendix. Backing Up and Restoring the Database Vertica supplies a comprehensive utility, the vbr Python script, that lets you back up and restore a full database, as well as create backups of specific schemas or tables. The vbr utility creates backup directories during its initial execution; subsequently running the utility creates subdirectories. The following information is intended to introduce the backup and restore functions. For more detailed information, see Backing Up and Restoring the Database in the Administrator’s Guide. Backing Up the Database Use vbr to save your data to a variety of locations: l A local directory on the nodes in a cluster l One or more hosts outside of the cluster Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 239 of 5309
  • 240. l A different Vertica cluster (effectively cloning your database) Note: Creating a database backup on a different cluster does not provide disaster recovery. The cloned database you create with vbr is entirely separate from the original, and is not kept synchronized with the database from which it is cloned. When to Back up the Database In addition to any guidelines established by your organization, Hewlett Packard Enterprise recommends that you back up your database: l Before you upgrade Vertica to another release. l Before you drop a partition. l After you load a large volume of data. l If the epoch in the latest backup is earlier than the current ancient history mark (AHM). l Before and after you add, remove, or replace nodes in your database cluster. l After recovering a cluster from a crash. Note: When you restore a database backup, you must restore to a cluster that is identical to the one where you created the backup. For this reason, always create a new backup after adding, removing, or replacing nodes. Ideally, create regular backups of your full database. You can run the Vertica vbr utility from a cron job or other task scheduler. Creating the Backup Configuration File The vbr utility uses a configuration file for the information required to back up and restore a full- or object-level backup. The configuration file defines where the database backup is saved, the temporary directories it uses, and which nodes, schema, and/or tables in the database are to be backed up. You cannot run vbr without a configuration file, and no default file exists. To invoke the script to set up a configuration file, enter this command: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 240 of 5309
  • 241. $ vbr --setupconfig The script prompts you to answer the following questions regarding the configuration file. Press Enter to accept the default value shown in parentheses. See VBR Configuration File Reference in the Administrator’s Guide for information about specific questions. [dbadmin@localhost ~]$ /opt/vertica/bin/vbr --setupconfig Snapshot name (backup_snapshot): fullbak1 Number of restore points (1): 3 Specify objects (no default): Object restore mode (coexist, createOrReplace or create) (createOrReplace): Vertica user name (dbadmin): Save password to avoid runtime prompt? (n) [y/n]: y Password to save in vbr config file (no default): Node v_vmart_node0001 Backup host name (no default): 194.66.82.11 Backup directory (no default): /home/dbadmin/backups Config file name (fullbak1.ini): Password file name (no default value) (no default):pwdfile Change advanced settings? (n) [y/n]: n Saved vbr configuration to fullbak1.ini. Saved vbr database password to pwdfile.ini. After you answer the required questions, vbr generates a configuration file with the information you supplied. Use the config file name you specified when you run --task backup or other vbr commands. The vbr utility uses the configuration file contents for both backup and restore tasks. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 241 of 5309
  • 242. l The user who starts the utility has write access to the target directories on the host backup location. Run the vbr script from a terminal using the database administrator account from an initiator node in your database cluster. You cannot run the utility as root. Use the --task backup and --config-file filename directives as shown in this example (the example uses the short forms -t and --config). [release@qco55srv01:/scratch_b/qa/vertica/QA/VT_Scenario 0]$ vbr -t backup --config $FULLBAK_CONFIG Starting backup of database VTDB. Participating nodes: node01, node02, node03, node04. Snapshotting database. Snapshot complete. Approximate bytes to copy: 2315056043 of 2356089422 total. [==================================================] 100% Copying backup metadata. Finalizing backup. Backup complete! By default, there is no screen output other than the progress bar. If you do not specify a configuration file, the vbr utility searches for one at this location: /opt/vertica/config/vbr.ini. If no file exists there, the utility fails with an error. The first time you run the vbr utility, it performs a full backup; subsequent runs with the same configuration file create an incremental backup. When creating incremental backups, the utility copies new storage containers, which can include data that existed the last time you backed up the database, along with new and changed data since then. By default, vbr saves one archive backup, unless you set the restorePointLimit parameter value in the configuration file to a value greater than 1. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 242 of 5309
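If you want to confirm from vsql that a backup completed, you can query the DATABASE_BACKUPS system table, which records a row for each completed backup snapshot. Treat the column list below as an illustrative sketch and see the system table documentation for the full set:
=> SELECT backup_timestamp, node_name, snapshot_name, backup_epoch
   FROM v_monitor.database_backups
   ORDER BY backup_timestamp DESC;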
  • 243. To begin a full database backup restore, log in using the database administrator’s account. You cannot run the utility as root. For detailed instructions on restoring a database, refer to Recovering the Database. Using Database Designer to Create a Comprehensive Design The Vertica Database Designer: l Analyzes your logical schema, sample data, and, optionally, your sample queries. l Creates a physical schema design (a set of projections) that can be deployed automatically or manually. l Can be used by anyone without specialized database knowledge. l Can be run and rerun any time for additional optimization without stopping the database. l Uses strategies to provide optimal query performance and data compression. Use Database Designer to create a comprehensive design, which allows you to create new projections for all tables in your database. You can also use Database Designer to create an incremental design, which creates projections for all tables referenced in the queries you supply. For more information, see Incremental Design in the Administrator’s Guide. You can create a comprehensive design with Database Designer using Management Console or through Administration Tools. You can also choose to run Database Designer programmatically (see About Running Database Designer Programmatically). This section shows you how to: l Run Database Designer with Management Console l Run Database Designer with Administration Tools Running Database Designer with Management Console In this tutorial, you'll create a comprehensive design with Database Designer through the Management Console interface. If, in the future, you have a query that you want to optimize, you can create an enhanced (incremental) design with additional projections. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 243 of 5309
  • 244. You can tune these projections specifically for the query you provide. See Comprehensive Design in the Administrator's Guide for more information. Note: To run Database Designer outside Administration Tools, you must be a dbadmin user. If you are not a dbadmin user, you must have the DBDUSER role assigned to you and own the tables for which you are designing projections. You can choose to create the design manually or use the wizard. To create a design manually, see Creating a Design Manually in the Administrator's Guide. Set your browser so that it does not cache pages; if the browser caches pages, you might not see a newly added design. Follow these steps to use the wizard to create the comprehensive design in Management Console: 1. Log in to Management Console. 2. Verify that your database is up and running. 3. Choose the database for which you want to create the design. You can find the database under the Recent Databases section or by clicking Existing Infrastructure to reach the Databases and Clusters page. The database overview page opens: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 244 of 5309
  • 245. 4. At the bottom of the screen, click the Design button. 5. In the New Design dialog box, enter the design name. 6. Click Wizard to continue. 7. Create an initial design. For Design Type, select Comprehensive and click Next. 8. In the Optimization Objective window, select Balance Load and Performance to create a design that is balanced between database size and query performance. Click Next. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 245 of 5309
  • 246. 9. Select the schemas. Because the VMart design is a multi-schema database, select all three schemas (public, store, and online_sales) for your design in the Select Sample Data window. Click Next. If you include a schema that contains tables without data, the design could be suboptimal. You can choose to continue, but Vertica recommends that you deselect the schemas that contain empty tables before you proceed. 10. Choose the K-safety value for your design. The K-safety value determines the number of buddy projections you want Database Designer to create. 11. Choose Analyze Correlations Mode. Analyze Correlations Mode determines if Database Designer analyzes and considers column correlations when creating the design. For more information, see DESIGNER_SET_ANALYZE_CORRELATIONS_MODE. Click Next. 12. Submit query files to Database Designer in one of two ways: a. Supply your own query files by selecting the Browse button. b. Click Use Query Repository, which submits recently executed queries from the QUERY_REQUESTS system table. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 246 of 5309
  • 247. Click Next. 13. In the Execution Options window, select all the options you want. You can select all three options or fewer. The three options are: n Analyze statistics: Select this option to run statistics automatically after the design deploys, to help Database Designer make more optimal decisions for its proposed design. n Auto-build: Select this option to run Database Designer as soon as you complete the wizard. This option only builds the proposed design. n Auto-deploy: Select this option for auto-build designs that you want to deploy automatically. 14. Click Submit Design. The Database Designer page opens: n If you chose to automatically deploy your design, Database Designer executes in the background. n If you did not select the Auto-build or Auto-deploy options, you can click Build Design or Deploy Design on the Database Designer page. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 247 of 5309
  • 248. 15. In the My Designs pane, view the status of your design: n When the deployment completes, the My Designs pane shows Design Deployed. n The event history window shows the details of the design build and deployment. To run Database Designer with Administration Tools, see Run Database Designer with Administration Tools in this guide. Run Database Designer with Administration Tools In this procedure, you create a comprehensive design with Database Designer using the Administration Tools interface. If, in the future, you have a query that you want to optimize, you can create an enhanced (incremental) design with additional projections. You can tune these projections specifically for the query you provide. See Incremental Design in the Administrator’s Guide for more information. Follow these steps to create the comprehensive design using Database Designer in Administration Tools: 1. If you are not in Administration Tools, exit the vsql session and access Administration Tools: n Type \q to exit vsql. n Type admintools to access the Administration Tools Main Menu. 2. Start the database for which you want to create a design. 3. From the Main Menu, click Configuration Menu and then click OK. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 248 of 5309
  • 249. 4. From the Configuration Menu, click Run Database Designer and then click OK. 5. When the Select a database for design dialog box opens, select VMart and then click OK. If you are prompted to enter the password for the database, click OK to bypass the message. Because no password was assigned when you installed the VMart database, you do not need to enter one now. 6. Click OK to accept the default directory for storing Database Designer output and log files. 7. In the Database Designer window, enter a name for the design, for example, vmart_design, and click OK. Design names can contain only alphanumeric characters or underscores. No other special characters are allowed. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 249 of 5309
  • 250. 8. Create a complete initial design. In the Design Type window, click Comprehensive and click OK. 9. Select the schemas. Because the VMart design is a multi-schema database, you can select all three schemas (online_sales, public, and store) for your design. Click OK. If you include a schema that contains tables without data, the Administration Tools notifies you that designing for tables without data could be suboptimal. You can Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 250 of 5309
  • 251. choose to continue, but Hewlett Packard Enterprise recommends that you deselect the schemas that contain empty tables before you proceed. 10. In the Design Options window, accept all three options and click OK. The three options are: n Optimize with queries: Supplying the Database Designer with queries is especially important if you want to optimize the database design for query performance. Hewlett Packard Enterprise recommends that you limit the design input to 100 queries. n Update statistics: Accurate statistics help the Database Designer choose the best strategy for data compression. If you select this option, the database statistics are updated to maximize design quality. n Deploy design: The new design deploys automatically. During deployment, new projections are added, some existing projections retained, and any necessary existing projections removed. Any new projections are refreshed to populate them with data. 11. Because you selected the Optimize with queries option, you must enter the full path to the file containing the queries that will be run on your database. In this example, it is: /opt/vertica/examples/VMart_Schema/vmart_queries.sql Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 251 of 5309
  • 252. The queries in the query file must be delimited with a semicolon (;). 12. Choose the K-safety value you want and click OK. The K-safety value determines the number of buddy projections you want Database Designer to create. If you create a comprehensive design on a single node, you are not prompted to enter a K-safety value. 13. In the Optimization Objective window, select Balanced query/load performance to create a design that is balanced between database size and query performance. Click OK. 14. When the informational message displays, click Proceed. Database Designer automatically performs these actions: n Sets up the design session. n Examines table data. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 252 of 5309
  • 253. n Loads queries from the query file you provided (in this example, /opt/vertica/examples/VMart_Schema/vmart_queries.sql). n Creates the design. n Deploys the design or saves a SQL file containing the commands to create the design, based on your selections in the Design Options window. Depending on system resources, the design process could take several minutes. You should allow this process to complete uninterrupted. If you must cancel the session, use Ctrl+C. 15. When Database Designer finishes, press Enter to return to the Administration Tools menu. Examine the steps taken to create the design. The files are in the directory you specified to store the output and log files. In this example, that directory is /opt/vertica/examples/VMart_Schema. For more information about the script files, see About Database Designer in the Administrator's Guide. For additional information about managing your designs, see Creating a Database Design in the Administrator’s Guide. Restoring the Status of Your Host When you finish the tutorial, you can restore your host machines to their original state. Use the following instructions to clean up your host and start over from scratch. Stopping and Dropping the Database Follow these steps to stop and/or drop your database. A database must be stopped before it can be dropped. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 253 of 5309
  • 254. 1. If connected to the database, disconnect by typing \q. 2. In the Administration Tools Main Menu dialog box, click Stop Database and click OK. 3. In the Select database to stop window, select the database you want to stop and click OK. 4. After stopping the database, click Configuration Menu and click OK. 5. Click Drop Database and click OK. 6. In the Select database to drop window, select the database you want to drop and click OK. 7. Click Yes to confirm. 8. In the next window type yes (lowercase) to confirm and click OK. Alternatively, use the delete_example script, which stops and drops the database: 1. If connected to the database, disconnect by typing \q. 2. In the Administration Tools Main Menu dialog box, select Exit. 3. Log in as the database administrator. 4. Change to the /examples directory. $ cd /opt/vertica/examples 5. Run the delete_example script. $ /opt/vertica/sbin/delete_example VMart Uninstalling Vertica Perform the steps in Uninstalling Vertica in Installing Vertica. Optional Steps You can also choose to: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 254 of 5309
  • 255. l Remove the dbadmin account on all cluster hosts. l Remove any example database directories you created. Changing the GUI Appearance The appearance of the Graphical User Interface (GUI) depends on the color and font settings used by your terminal window. The screen captures in this document were made using the default color and font settings in a PuTTY terminal application running on a Windows platform. Note: If you are using a remote terminal application, such as PuTTY or a Cygwin bash shell, make sure your window is at least 81 characters wide and 23 characters high. If you are using PuTTY, take these steps to make the Administration Tools look like the screen captures in this document. 1. In a PuTTY window, right-click the title area and select Change Settings. 2. Create or load a saved session. 3. In the Category dialog, click Window > Appearance. 4. In the Font settings, click the Change… button. 5. Select Font: Courier New, Regular Size: 10. 6. Click Apply. Repeat these steps for each existing session that you use to run the Administration Tools. You can also change the translation to support UTF-8. 1. In a PuTTY window, right-click the title area and select Change Settings. 2. Create or load a saved session. 3. In the Category dialog, click Window > Translation. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 255 of 5309
  • 256. 4. In the Received data assumed to be in which character set drop-down menu, select UTF-8. 5. Click Apply. Appendix: VMart Example Database Schema, Tables, and Scripts The Appendix provides detailed information about the VMart example database’s schema, tables, and scripts. The VMart example database contains three different schemas: l public l store l online_sales The term “schema” has several related meanings in Vertica: l In SQL statements, a schema refers to a named namespace for a logical schema. l Logical schema refers to a set of tables and constraints. l Physical schema refers to a set of projections. Each schema contains tables that are created and loaded during database installation. See the schema maps for a list of tables and their contents: l public Schema Map l store Schema Map l online_sales Schema Map The VMart database installs with sample scripts that contain SQL commands that represent queries that might be used in a real business. The sample scripts are available in the Sample Scripts section of this Appendix. Once you’re comfortable running the example queries, you might want to write your own. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 256 of 5309
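You can also cross-check these schema maps from within vsql by listing every VMart table and its schema from the standard v_catalog.tables system table:
=> SELECT table_schema, table_name
   FROM v_catalog.tables
   ORDER BY table_schema, table_name;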
  • 257. Tables The three schemas in the VMart database include the following tables: l public schema: inventory_fact, customer_dimension, date_dimension, employee_dimension, product_dimension, promotion_dimension, shipping_dimension, vendor_dimension, warehouse_dimension l store schema: store_orders_fact, store_sales_fact, store_dimension l online_sales schema: online_sales_fact, call_center_dimension, online_page_dimension public Schema Map The public schema is a snowflake schema. The following graphic illustrates the public schema and its relationships with tables in the online_sales and store schemas. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 257 of 5309
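As a concrete illustration of these relationships, fact tables join to dimension tables on the shared keys listed in the table descriptions that follow. For example (an illustrative query, not one of the shipped sample scripts):
-- Total quantity in stock for the five fullest warehouses (illustrative)
=> SELECT w.warehouse_name, SUM(i.qty_in_stock) AS total_qty
   FROM public.inventory_fact i
   JOIN public.warehouse_dimension w ON i.warehouse_key = w.warehouse_key
   GROUP BY w.warehouse_name
   ORDER BY total_qty DESC
   LIMIT 5;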
  • 258. inventory_fact This table contains information about each product in inventory. Column Name Data Type NULLs Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 258 of 5309
  • 259. date_key INTEGER No product_key INTEGER No product_version INTEGER No warehouse_key INTEGER No qty_in_stock INTEGER No customer_dimension This table contains information about all the retail chain’s customers. Column Name Data Type NULLs customer_key INTEGER No customer_type VARCHAR(16) Yes customer_name VARCHAR(256) Yes customer_gender VARCHAR(8) Yes title VARCHAR(8) Yes household_id INTEGER Yes customer_address VARCHAR(256) Yes customer_city VARCHAR(64) Yes customer_state CHAR(2) Yes customer_region VARCHAR(64) Yes marital_status VARCHAR(32) Yes customer_age INTEGER Yes number_of_children INTEGER Yes annual_income INTEGER Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 259 of 5309
  • 260. occupation VARCHAR(64) Yes largest_bill_amount INTEGER Yes store_membership_card INTEGER Yes customer_since DATE Yes deal_stage VARCHAR(32) Yes deal_size INTEGER Yes last_deal_update DATE Yes date_dimension This table contains information about dates. It is generated from a file containing correct date/time data. Column Name Data Type NULLs date_key INTEGER No date DATE Yes full_date_description VARCHAR(18) Yes day_of_week VARCHAR(9) Yes day_number_in_calendar_month INTEGER Yes day_number_in_calendar_year INTEGER Yes day_number_in_fiscal_month INTEGER Yes day_number_in_fiscal_year INTEGER Yes last_day_in_week_indicator INTEGER Yes last_day_in_month_indicator INTEGER Yes calendar_week_number_in_year INTEGER Yes calendar_month_name VARCHAR(9) Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 260 of 5309
  • 261. calendar_month_number_in_year INTEGER Yes calendar_year_month CHAR(7) Yes calendar_quarter INTEGER Yes calendar_year_quarter CHAR(7) Yes calendar_half_year INTEGER Yes calendar_year INTEGER Yes holiday_indicator VARCHAR(10) Yes weekday_indicator CHAR(7) Yes selling_season VARCHAR(32) Yes employee_dimension This table contains information about all the people who work for the retail chain. Column Name Data Type NULLs employee_key INTEGER No employee_gender VARCHAR(8) Yes courtesy_title VARCHAR(8) Yes employee_first_name VARCHAR(64) Yes employee_middle_initial VARCHAR(8) Yes employee_last_name VARCHAR(64) Yes employee_age INTEGER Yes hire_date DATE Yes employee_street_address VARCHAR(256) Yes employee_city VARCHAR(64) Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 261 of 5309
  • 262. employee_state CHAR(2) Yes employee_region CHAR(32) Yes job_title VARCHAR(64) Yes reports_to INTEGER Yes salaried_flag INTEGER Yes annual_salary INTEGER Yes hourly_rate FLOAT Yes vacation_days INTEGER Yes product_dimension This table describes all products sold by the department store chain. Column Name Data Type NULLs product_key INTEGER No product_version INTEGER No product_description VARCHAR(128) Yes sku_number CHAR(32) Yes category_description CHAR(32) Yes department_description CHAR(32) Yes package_type_description CHAR(32) Yes package_size CHAR(32) Yes fat_content INTEGER Yes diet_type CHAR(32) Yes weight INTEGER Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 262 of 5309
  • 263. weight_units_of_measure CHAR(32) Yes shelf_width INTEGER Yes shelf_height INTEGER Yes shelf_depth INTEGER Yes product_price INTEGER Yes product_cost INTEGER Yes lowest_competitor_price INTEGER Yes highest_competitor_price INTEGER Yes average_competitor_price INTEGER Yes discontinued_flag INTEGER Yes promotion_dimension This table describes every promotion ever done by the retail chain. Column Name Data Type NULLs promotion_key INTEGER No promotion_name VARCHAR(128) Yes price_reduction_type VARCHAR(32) Yes promotion_media_type VARCHAR(32) Yes ad_type VARCHAR(32) Yes display_type VARCHAR(32) Yes coupon_type VARCHAR(32) Yes ad_media_name VARCHAR(32) Yes display_provider VARCHAR(128) Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 263 of 5309
  • 264. promotion_cost INTEGER Yes promotion_begin_date DATE Yes promotion_end_date DATE Yes shipping_dimension This table contains information about shipping companies that the retail chain uses. Column Name Data Type NULLs shipping_key INTEGER No ship_type CHAR(30) Yes ship_mode CHAR(10) Yes ship_carrier CHAR(20) Yes vendor_dimension This table contains information about each vendor that provides products sold through the retail chain. Column Name Data Type NULLs vendor_key INTEGER No vendor_name VARCHAR(64) Yes vendor_address VARCHAR(64) Yes vendor_city VARCHAR(64) Yes vendor_state CHAR(2) Yes vendor_region VARCHAR(32) Yes deal_size INTEGER Yes last_deal_update DATE Yes warehouse_dimension This table provides information about each of the chain’s warehouses. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 264 of 5309
  • 265. Column Name Data Type NULLs warehouse_key INTEGER No warehouse_name VARCHAR(20) Yes warehouse_address VARCHAR(256) Yes warehouse_city VARCHAR(60) Yes warehouse_state CHAR(2) Yes warehouse_region VARCHAR(32) Yes store Schema Map The store schema is a snowflake schema that contains information about the retail chain’s bricks-and-mortar stores. The following graphic illustrates the store schema and its relationship with tables in the public schema. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 265 of 5309
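For example, the store fact tables join to store_dimension on store_key, as in this illustrative query (not one of the shipped sample scripts):
-- Five stores with the highest total sales (illustrative)
=> SELECT s.store_name, SUM(f.sales_dollar_amount) AS total_sales
   FROM store.store_sales_fact f
   JOIN store.store_dimension s ON f.store_key = s.store_key
   GROUP BY s.store_name
   ORDER BY total_sales DESC
   LIMIT 5;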
  • 266. store_orders_fact This table contains information about all orders made at the company’s brick-and-mortar stores. Column Name Data Type NULLs Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 266 of 5309
  • 267. product_key INTEGER No product_version INTEGER No store_key INTEGER No vendor_key INTEGER No employee_key INTEGER No order_number INTEGER No date_ordered DATE Yes date_shipped DATE Yes expected_delivery_date DATE Yes date_delivered DATE Yes quantity_ordered INTEGER Yes quantity_delivered INTEGER Yes shipper_name VARCHAR(32) Yes unit_price INTEGER Yes shipping_cost INTEGER Yes total_order_cost INTEGER Yes quantity_in_stock INTEGER Yes reorder_level INTEGER Yes overstock_ceiling INTEGER Yes store_sales_fact This table contains information about all sales made at the company’s brick-and-mortar stores. Column Name Data Type NULLs Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 267 of 5309
  • 268. date_key INTEGER No product_key INTEGER No product_version INTEGER No store_key INTEGER No promotion_key INTEGER No customer_key INTEGER No employee_key INTEGER No pos_transaction_number INTEGER No sales_quantity INTEGER Yes sales_dollar_amount INTEGER Yes cost_dollar_amount INTEGER Yes gross_profit_dollar_amount INTEGER Yes transaction_type VARCHAR(16) Yes transaction_time TIME Yes tender_type VARCHAR(8) Yes store_dimension This table contains information about each brick-and-mortar store within the retail chain. Column Name Data Type NULLs store_key INTEGER No store_name VARCHAR(64) Yes store_number INTEGER Yes store_address VARCHAR(256) Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 268 of 5309
  • 269. store_city VARCHAR(64) Yes store_state CHAR(2) Yes store_region VARCHAR(64) Yes floor_plan_type VARCHAR(32) Yes photo_processing_type VARCHAR(32) Yes financial_service_type VARCHAR(32) Yes selling_square_footage INTEGER Yes total_square_footage INTEGER Yes first_open_date DATE Yes last_remodel_date DATE Yes number_of_employees INTEGER Yes annual_shrinkage INTEGER Yes foot_traffic INTEGER Yes monthly_rent_cost INTEGER Yes online_sales Schema Map The online_sales schema is a snowflake schema that contains information about the retail chain’s online sales. The following graphic illustrates the online_sales schema and its relationship with tables in the public schema. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 269 of 5309
  • 270. online_sales_fact This table describes all the items purchased through the online store front. Column Name Data Type NULLs sale_date_key INTEGER No ship_date_key INTEGER No product_key INTEGER No product_version INTEGER No customer_key INTEGER No call_center_key INTEGER No online_page_key INTEGER No shipping_key INTEGER No warehouse_key INTEGER No promotion_key INTEGER No pos_transaction_number INTEGER No Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 270 of 5309
  • 271. sales_quantity INTEGER Yes sales_dollar_amount FLOAT Yes ship_dollar_amount FLOAT Yes net_dollar_amount FLOAT Yes cost_dollar_amount FLOAT Yes gross_profit_dollar_amount FLOAT Yes transaction_type VARCHAR(16) Yes call_center_dimension This table describes all the chain’s call centers. Column Name Data Type NULLs call_center_key INTEGER No cc_closed_date DATE Yes cc_open_date DATE Yes cc_date VARCHAR(50) Yes cc_class VARCHAR(50) Yes cc_employees INTEGER Yes cc_hours CHAR(20) Yes cc_manager VARCHAR(40) Yes cc_address VARCHAR(256) Yes cc_city VARCHAR(64) Yes cc_state CHAR(2) Yes cc_region VARCHAR(64) Yes Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 271 of 5309
  • 272. online_page_dimension This table describes all the pages in the online store front. Column Name Data Type NULLs online_page_key INTEGER No start_date DATE Yes end_date DATE Yes page_number INTEGER Yes page_description VARCHAR(100) Yes page_type VARCHAR(100) Yes Sample Scripts You can create your own queries, but the VMart example directory includes sample query script files to help you get started quickly. You can find the following sample scripts at this path: /opt/vertica/examples/VMart_Schema. To run any of the scripts, enter => \i <script_name> Alternatively, type the commands from the script file manually. Note: The data that your queries return might differ from the example output shown in this guide because the sample data generator is random. vmart_query_01.sql -- vmart_query_01.sql -- FROM clause subquery -- Return the values for five products with the -- lowest-fat content in the Dairy department SELECT fat_content FROM ( SELECT DISTINCT fat_content FROM product_dimension WHERE department_description IN ('Dairy') ) AS food Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 272 of 5309
  • 273. ORDER BY fat_content LIMIT 5; Output fat_content ------------- 80 81 82 83 84 (5 rows) vmart_query_02.sql -- vmart_query_02.sql -- WHERE clause subquery -- Asks for all orders placed by stores located in Massachusetts -- and by vendors located elsewhere before March 1, 2003: SELECT order_number, date_ordered FROM store.store_orders_fact orders WHERE orders.store_key IN ( SELECT store_key FROM store.store_dimension WHERE store_state = 'MA') AND orders.vendor_key NOT IN ( SELECT vendor_key FROM public.vendor_dimension WHERE vendor_state = 'MA') AND date_ordered < '2003-03-01'; Output order_number | date_ordered -------------+-------------- 53019 | 2003-02-10 222168 | 2003-02-05 160801 | 2003-01-08 106922 | 2003-02-07 246465 | 2003-02-10 234218 | 2003-02-03 263119 | 2003-01-04 73015 | 2003-01-01 233618 | 2003-02-10 85784 | 2003-02-07 146607 | 2003-02-07 296193 | 2003-02-05 55052 | 2003-01-05 144574 | 2003-01-05 117412 | 2003-02-08 276288 | 2003-02-08 185103 | 2003-01-03 282274 | 2003-01-01 245300 | 2003-02-06 143526 | 2003-01-04 Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 273 of 5309
  • 274. 59564 | 2003-02-06 ... vmart_query_03.sql -- vmart_query_03.sql -- noncorrelated subquery -- Requests female and male customers with the maximum -- annual income from customers SELECT customer_name, annual_income FROM public.customer_dimension WHERE (customer_gender, annual_income) IN ( SELECT customer_gender, MAX(annual_income) FROM public.customer_dimension GROUP BY customer_gender); Output customer_name | annual_income ------------------+--------------- James M. McNulty | 999979 Emily G. Vogel | 999998 (2 rows) vmart_query_04.sql -- vmart_query_04.sql -- IN predicate -- Find all products supplied by stores in MA SELECT DISTINCT s.product_key, p.product_description FROM store.store_sales_fact s, public.product_dimension p WHERE s.product_key = p.product_key AND s.product_version = p.product_version AND s.store_key IN ( SELECT store_key FROM store.store_dimension WHERE store_state = 'MA') ORDER BY s.product_key; Output product_key | product_description -------------+---------------------------------------- 1 | Brand #1 butter 1 | Brand #2 bagels 2 | Brand #3 lamb 2 | Brand #4 brandy 2 | Brand #5 golf clubs 2 | Brand #6 chicken noodle soup 3 | Brand #10 ground beef 3 | Brand #11 vanilla ice cream 3 | Brand #7 canned chicken broth 3 | Brand #8 halibut 3 | Brand #9 camera case Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 274 of 5309
  • 275. 4 | Brand #12 rash ointment 4 | Brand #13 low fat milk 4 | Brand #14 chocolate chip cookies 4 | Brand #15 silver polishing cream 5 | Brand #16 cod 5 | Brand #17 band aids 6 | Brand #18 bananas 6 | Brand #19 starch 6 | Brand #20 vegetable soup 6 | Brand #21 bourbon ... vmart_query_05.sql -- vmart_query_05.sql -- EXISTS predicate -- Get a list of all the orders placed by all stores on -- January 2, 2003 for the vendors with records in the -- vendor_dimension table SELECT store_key, order_number, date_ordered FROM store.store_orders_fact WHERE EXISTS ( SELECT 1 FROM public.vendor_dimension WHERE public.vendor_dimension.vendor_key = store.store_orders_fact.vendor_key) AND date_ordered = '2003-01-02'; Output store_key | order_number | date_ordered -----------+--------------+-------------- 98 | 151837 | 2003-01-02 123 | 238372 | 2003-01-02 242 | 263973 | 2003-01-02 150 | 226047 | 2003-01-02 247 | 232273 | 2003-01-02 203 | 171649 | 2003-01-02 129 | 98723 | 2003-01-02 80 | 265660 | 2003-01-02 231 | 271085 | 2003-01-02 149 | 12169 | 2003-01-02 141 | 201153 | 2003-01-02 1 | 23715 | 2003-01-02 156 | 98182 | 2003-01-02 44 | 229465 | 2003-01-02 178 | 141869 | 2003-01-02 134 | 44410 | 2003-01-02 141 | 129839 | 2003-01-02 205 | 54138 | 2003-01-02 113 | 63358 | 2003-01-02 99 | 50142 | 2003-01-02 44 | 131255 | 2003-01-02 ... Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 275 of 5309
  • 276. vmart_query_06.sql -- vmart_query_06.sql -- EXISTS predicate -- Orders placed by the vendor who got the best deal -- on January 4, 2004 SELECT store_key, order_number, date_ordered FROM store.store_orders_fact ord, public.vendor_dimension vd WHERE ord.vendor_key = vd.vendor_key AND vd.deal_size IN ( SELECT MAX(deal_size) FROM public.vendor_dimension) AND date_ordered = '2004-01-04'; Output store_key | order_number | date_ordered -----------+--------------+-------------- 45 | 202416 | 2004-01-04 24 | 250295 | 2004-01-04 121 | 251417 | 2004-01-04 198 | 75716 | 2004-01-04 166 | 36008 | 2004-01-04 27 | 150241 | 2004-01-04 148 | 182207 | 2004-01-04 9 | 188567 | 2004-01-04 113 | 66017 | 2004-01-04 ... vmart_query_07.sql -- vmart_query_07.sql -- Multicolumn subquery -- Which products have the highest cost, -- grouped by category and department SELECT product_description, sku_number, department_description FROM public.product_dimension WHERE (category_description, department_description, product_cost) IN ( SELECT category_description, department_description, MAX(product_cost) FROM product_dimension GROUP BY category_description, department_description); Output product_description | sku_number | department_description ---------------------------+-----------------------+--------------------------------- Brand #601 steak | SKU-#601 | Meat Brand #649 brooms | SKU-#649 | Cleaning supplies Brand #677 veal | SKU-#677 | Meat Brand #1371 memory card | SKU-#1371 | Photography Brand #1761 catfish | SKU-#1761 | Seafood Brand #1810 frozen pizza | SKU-#1810 | Frozen Goods Brand #1979 canned peaches | SKU-#1979 | Canned Goods Brand #2097 apples | SKU-#2097 | Produce Brand #2287 lens cap | SKU-#2287 | Photography Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 276 of 5309
  • 277. ... vmart_query_08.sql -- vmart_query_08.sql -- Using pre-join projections to answer subqueries -- between online_sales_fact and online_page_dimension SELECT page_description, page_type, start_date, end_date FROM online_sales.online_sales_fact f, online_sales.online_page_dimension d WHERE f.online_page_key = d.online_page_key AND page_number IN (SELECT MAX(page_number) FROM online_sales.online_page_dimension) AND page_type = 'monthly' AND start_date = '2003-06-02'; Output page_description | page_type | start_date | end_date ---------------------------+-----------+------------+----------- Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 Online Page Description #1 | monthly | 2003-06-02 | 2003-06-11 (12 rows) vmart_query_09.sql -- vmart_query_09.sql -- Equi join -- Joins online_sales_fact table and the call_center_dimension -- table with the ON clause SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name FROM online_sales.online_sales_fact INNER JOIN online_sales.call_center_dimension ON (online_sales.online_sales_fact.call_center_key = online_sales.call_center_dimension.call_center_key AND sale_date_key = 156) ORDER BY sales_dollar_amount DESC; Output sales_quantity | sales_dollar_amount | transaction_type | cc_name ----------------+---------------------+------------------+------------------- 7 | 589 | purchase | Central Midwest 8 | 589 | purchase | South Midwest 8 | 589 | purchase | California 1 | 587 | purchase | New England Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 277 of 5309
  • 278. 1 | 586 | purchase | Other 1 | 584 | purchase | New England 4 | 584 | purchase | New England 7 | 581 | purchase | Mid Atlantic 5 | 579 | purchase | North Midwest 8 | 577 | purchase | North Midwest 4 | 577 | purchase | Central Midwest 2 | 575 | purchase | Hawaii/Alaska 4 | 573 | purchase | NY Metro 4 | 572 | purchase | Central Midwest 1 | 570 | purchase | Mid Atlantic 9 | 569 | purchase | Southeastern 1 | 569 | purchase | NY Metro 5 | 567 | purchase | Other 7 | 567 | purchase | Hawaii/Alaska 9 | 567 | purchase | South Midwest 1 | 566 | purchase | New England ... Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 278 of 5309
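To see how Vertica plans any of these sample queries against your projections, you can prefix the query with EXPLAIN in vsql. For example, using the query from vmart_query_03.sql:
=> EXPLAIN SELECT customer_name, annual_income
   FROM public.customer_dimension
   WHERE (customer_gender, annual_income) IN (
       SELECT customer_gender, MAX(annual_income)
       FROM public.customer_dimension
       GROUP BY customer_gender);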
  • 279. HPE Vertica Analytics Platform (7.2.x) Page 279 of 5309 Vertica Concepts
  • 280. HPE Vertica Analytics Platform (7.2.x) Page 280 of 5309 Vertica Platform The Vertica Platform provides a unique architecture that allows for efficient processing of custom SQL queries. The architecture also provides a quick return of data. Vertica accomplishes this using the following components: l Columnar data storage l Advanced compression l High Availability l Automatic Database Design l Massively Parallel Processing l Application Integration Vertica Cluster Architecture In Vertica, the physical architecture is designed to distribute physical storage and to allow parallel query execution over a potentially large collection of computing resources. Hybrid Data Store Vertica stores data in two containers: l Write Optimized Store (WOS) - stores data in memory without compression or indexing. You can use INSERT, UPDATE, and COPY to load data into the WOS. l Read Optimized Store (ROS) - stores data on disk; the data is segmented, sorted, and compressed for high optimization. You can load data directly into the ROS using the COPY statement. The Tuple Mover moves data from the WOS (memory) to the ROS (disk) using the following: l moveout - copies data from the WOS to the Tuple Mover and then to the ROS; data is sorted, encoded, and compressed into column files.
  • 281. HPE Vertica Analytics Platform (7.2.x) Page 281 of 5309 l mergeout - combines smaller ROS containers into larger ones to reduce fragmentation. The following shows how you can use COPY to load data into the WOS and then move it to the ROS, or load data directly into the ROS: Column Storage Vertica stores data in a column format so it can be queried for best performance. Compared to row-based storage, column storage reduces disk I/O, making it ideal for read-intensive workloads. Vertica reads only the columns needed to answer the query. For example: => SELECT avg(price) FROM tickstore WHERE symbol = 'AAPL' and date = '5/31/13'; For this example query, a column store reads only three columns while a row store reads all columns:
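As a concrete example of the load paths described above, adding the DIRECT keyword to COPY bypasses the WOS and writes straight to the ROS, which is the same technique the vmart_load_data.sql script uses for its bulk load (the file path here is hypothetical):
=> COPY store.store_sales_fact
   FROM '/home/dbadmin/Store_Sales_Fact.tbl'
   DELIMITER '|' DIRECT;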
  • 282. HPE Vertica Analytics Platform (7.2.x) Page 282 of 5309 Data Encoding and Compression Vertica uses encoding and compression to optimize query performance and save storage space. Encoding converts data into a standard format. Vertica uses a number of different encoding strategies, depending on column data type, table cardinality, and sort order. Encoding increases performance because there is less disk I/O during query execution. In addition, you can store more data in less space. Compression transforms data into a compact format. Vertica uses several different compression methods and automatically chooses the best one for the data being compressed. Using compression, Vertica stores more data, provides more views, and uses less hardware than other databases. Using compression lets you keep much more historical data in physical storage. The following shows compression using sorting and cardinality:
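Encoding choices are ultimately expressed per column in projection definitions. The following sketch shows the syntax with a hypothetical projection; RLE is a natural fit for a low-cardinality column sorted on itself, such as customer_state:
=> CREATE PROJECTION customer_by_state (
       customer_key,
       customer_state ENCODING RLE
   ) AS
   SELECT customer_key, customer_state
   FROM public.customer_dimension
   ORDER BY customer_state;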
  • 283. HPE Vertica Analytics Platform (7.2.x) Page 283 of 5309 For more information, see Data Encoding and Compression. Clustering Clustering supports scaling and redundancy. You can scale your database cluster by adding more hardware, and you can improve reliability by distributing and replicating data across your cluster. Column data gets distributed across nodes in a cluster, so if one node becomes unavailable the database continues to operate. When a node is added to the cluster, or comes back online after being unavailable, it automatically queries other nodes to update its local data.
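You can observe this behavior from vsql: the NODES system table reports the state of every node in the cluster, and a recovering node shows a non-UP state until it finishes updating its local data from its peers:
=> SELECT node_name, node_state FROM v_catalog.nodes;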
  • 284. HPE Vertica Analytics Platform (7.2.x) Page 284 of 5309 Projections A projection consists of a set of columns with the same sort order, defined by a column to sort by or a sequence of columns by which to sort. Like an index or materialized view in a traditional database, a projection accelerates query processing. When you write queries in terms of the original tables, the query uses the projections to return query results. Projections are distributed and replicated across nodes in your cluster, ensuring that if one node becomes unavailable, another copy of the data remains available. For more information, see K-Safety. Automatic data replication, failover, and recovery provide for active redundancy, which increases performance. Nodes recover automatically by querying the system. Continuous Performance Vertica queries and loads data continuously, 24x7. Concurrent loading and querying provides real-time views and eliminates the need for nightly load windows. On-the-fly schema changes allow you to add columns and projections without shutting down your database; Vertica manages updates while keeping the database available. Terminology It is helpful to understand the following terms when using Vertica: Host A computer system with a 32-bit (non-production use only) or 64-bit Intel or AMD processor, RAM, hard disk, and TCP/IP network interface (IP address and hostname). Hosts share neither disk space nor main memory with each other. Instance An instance of Vertica consists of the running Vertica process and disk storage (catalog and data) on a host. Only one instance of Vertica can be running on a host at any time. Node A host configured to run an instance of Vertica. It is a member of the database cluster. For a database to recover from the failure of a node, it requires a K-safety value of at least 1 (which in turn requires three or more nodes). Cluster
  • 285. HPE Vertica Analytics Platform (7.2.x) Page 285 of 5309 Refers to the collection of hosts (nodes) bound to a database. A cluster is not part of a database definition and does not have a name. Database A cluster of nodes that, when active, can perform distributed data storage and SQL statement execution through administrative, interactive, and programmatic user interfaces. Note: Although you can define more than one database on a cluster, Vertica supports running only one database per cluster at a time. Data Encoding and Compression Vertica uses encoding and compression to optimize query performance and save storage space. Encoding Encoding converts data into a standard format and increases performance because there is less disk I/O during query execution. It also passes encoded values to other operations, saving memory bandwidth. Vertica uses several encoding strategies, depending on data type, table cardinality, and sort order. Vertica can directly process encoded data. Run the Database Designer for optimal encoding in your physical schema. The Database Designer analyzes the data in each column and recommends encoding types for each column in the proposed projections, depending on your design optimization objective. For flex tables, Database Designer recommends the best encoding types for any materialized flex table columns, but not for __raw__ column projections. Compression Compression transforms data into a compact format. Vertica uses integer packing for unencoded integers and LZO for compressed data. Before Vertica can process compressed data it must be decompressed. Compression allows a column store to occupy substantially less storage than a row store. In a column store, every value stored in a column of a projection has the same data type. This greatly facilitates compression, particularly in sorted columns. In a row store, each value of a row can have a different data type, resulting in a much less
effective use of compression. Vertica compresses flex table __raw__ column data by about one half (1/2). The efficient storage methods that Vertica uses allow you to maintain more historical data in physical storage.

High Availability
Vertica provides high availability of the database using RAID-like functionality. The following mechanisms ensure little to no downtime:

l Multiple copies of the same data reside on different nodes.
l Vertica continues to load and query data if a node is down.
l Vertica automatically recovers missing data by querying other nodes.

K-Safety
K-safety sets the fault tolerance of the database cluster. The value K represents the number of replicas of the data in the database cluster. These replicas allow other nodes to take over query processing for any failed nodes. In Vertica, the value of K can be zero (0), one (1), or two (2). If a database with a K-safety of one (K=1) loses a node, the database continues to run normally. Potentially, the database could continue running if additional nodes fail, as long as at least one other node in the cluster has a copy of each failed node's data. Increasing K-safety to 2 ensures that Vertica can run normally if any two nodes fail. When the failed node or nodes return and successfully recover, they can participate in database operations again.

Note: If the number of failed nodes exceeds the K value, some of the data may become unavailable. In this case, the database is considered unsafe and automatically shuts down. However, if every data segment is available on at least one functioning cluster node, Vertica continues to run safely.

Potentially, up to half the nodes in a database with a K-safety of 1 could fail without causing the database to shut down. As long as the data on each failed node is available from another active node, the database continues to run.

Note: If half or more of the nodes in the database cluster fail, the database automatically shuts down even if all of the data in the database is available from
replicas. This behavior prevents issues due to network partitioning.

Note: The physical schema design must meet certain requirements. To create designs that are K-safe, Vertica recommends using the Database Designer.

Buddy Projections
To achieve K-safety, Vertica creates buddy projections, which are copies of segmented projections distributed across database nodes (see Projection Segmentation). Vertica distributes segments that contain the same data to different nodes. This ensures that if a node goes down, all the data is available on the remaining nodes.

K-Safety Example
Consider a 5-node cluster with a K-safety level of 1. Each node contains buddy projections for the data stored in the next higher node (Node 1 has buddy projections for Node 2, Node 2 has buddy projections for Node 3, and so on). If any node fails, the database continues to run, though with lower performance, since one node must handle its own workload and the workload of the failed node.

Suppose Node 2 fails. Node 1 handles processing for Node 2, since it contains a replica of Node 2's data, and it also continues to perform its own processing. The fault tolerance of the database falls from 1 to 0, since a single further node failure could cause the database to become unsafe. In this example, if either Node 1 or Node 3 fails, the database becomes unsafe because not all of its data is available. If Node 1 fails, Node 2's data is no longer available. If Node 3 fails, its data is no longer
available, because its buddy projection resides on Node 2, which is down. In this case, Nodes 1 and 3 are considered critical nodes. In a database with a K-safety level of 1, the node that contains the buddy projection of a failed node, and the node whose buddy projections are on the failed node, always become critical nodes.

With Node 2 down, either Node 4 or Node 5 could fail and the database would still have all of its data available. If Node 4 also fails, Node 3 can use its buddy projections to fill in for it. At that point, any further loss of nodes results in a database shutdown, since all the nodes in the cluster are now critical nodes. In addition, if one more node were to fail, half or more of the nodes would be down, requiring Vertica to shut down automatically, regardless of whether all of the data is still available.
In a database with a K-safety level of 2, Node 2 and any other node in the cluster could fail and the database would continue running. At this level, each node in the cluster contains buddy projections for both of its neighbors (for example, Node 1 contains buddy projections for Node 5 and Node 2). In this case, Nodes 2 and 3 could fail and the database would continue running: Node 1 could fill in for Node 2, and Node 4 could fill in for Node 3. Because more than half of the nodes in the cluster must be available for the database to continue running, the cluster could not survive the additional failure of Node 5, even though Nodes 1 and 4 both have buddy projections for its data.

Monitoring K-Safety
You can access system tables to monitor and log various aspects of Vertica operation. Use the SYSTEM table to monitor information related to K-safety, such as:

l NODE_COUNT - The number of nodes in the cluster.
l NODE_DOWN_COUNT - The number of nodes in the cluster that are currently down.
l CURRENT_FAULT_TOLERANCE - The K-safety level.
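For example, a minimal sketch of such a check, using the SYSTEM table columns listed above; MARK_DESIGN_KSAFE is the function typically used to declare the design's K-safety level:

=> SELECT node_count, node_down_count, current_fault_tolerance FROM system;
=> SELECT MARK_DESIGN_KSAFE(1);   -- declare that the physical design is K-safe at K=1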
High Availability With Projections
To ensure high availability and recovery for database clusters of three or more nodes, Vertica:

l Replicates small, unsegmented projections
l Creates buddy projections for large, segmented projections.

Replication (Unsegmented Projections)
When it creates projections for small tables, Database Designer replicates them, creating and storing duplicates of these projections on all nodes in the database. Replication ensures:

l Distributed query execution across multiple nodes.
l High availability and recovery. In a K-safe database, replicated projections serve as buddy projections. This means that you can use a replicated projection on any node for recovery.

Note: We recommend you use Database Designer to create your physical schema. If you choose not to, be sure to segment all large tables across all database nodes, and replicate small, unsegmented table projections on all database nodes.

For example, two projections, B and C, might be replicated across a three-node cluster.

Buddy Projections (Segmented Projections)
Vertica creates buddy projections, which are copies of segmented projections that are distributed across database nodes (see Projection Segmentation). Vertica distributes segments that contain the same data to different nodes. This ensures that if a node goes
down, all the data is available on the remaining nodes. Vertica distributes segments to different nodes by using offsets. For example, segments that comprise the first buddy projection (A_BP1) are offset from projection A by one node, and segments from the second buddy projection (A_BP2) are offset from projection A by two nodes. On a three-node cluster, these offsets ensure that every node has a full set of data for projection A.

How Result Sets Are Stored
Vertica duplicates table columns on all nodes in the cluster to ensure high availability and recovery. Thus, if one node goes down in a K-Safe environment, the database continues to operate using duplicate data on the remaining nodes. Once the failed node resumes its normal operation, it automatically recovers its lost objects and data by querying other nodes.

Vertica compresses and encodes data to greatly reduce the storage space. It also operates on the encoded data whenever possible to avoid the cost of decoding. This combination of compression and encoding optimizes disk space while maximizing query performance.
Vertica stores table columns as projections. This enables you to optimize the stored data for specific queries and query sets. Vertica provides two methods for storing data:

l Projection segmentation, recommended for large tables (fact and large dimension tables)
l Replication, recommended for the rest of the tables.

High Availability With Fault Groups
Use fault groups to reduce the risk of correlated failures inherent in your physical environment. Correlated failures occur when two or more nodes fail as a result of a single event, for example, when nodes depend on a shared resource such as power, networking, or storage. Vertica minimizes the risk of correlated failures by letting you define fault groups on your cluster. Vertica then uses the fault groups to distribute data segments across the cluster, so the database keeps running if a single failure event occurs.

Note: If your cluster layout is managed by a single network switch, a switch failure is a single point of failure. Fault groups cannot help with single-point failures.

Vertica supports complex, hierarchical fault groups of different shapes and sizes. Fault groups are integrated with elastic cluster and large cluster arrangements to provide added cluster flexibility and reliability.

Automatic fault groups
When you configure a cluster of 120 nodes or more, Vertica automatically creates fault groups around control nodes. Control nodes are a subset of cluster nodes that manage spread (control messaging). Vertica places nodes that share a control node in the same fault group. See Large Cluster in the Administrator's Guide for details.

User-defined fault groups
Define your own fault groups if your cluster layout has the potential for correlated failures, or if you want to influence which cluster hosts manage control messaging.

Example cluster topology
The following example describes hierarchical fault groups configured on a single cluster:
l Fault group FG-A contains nodes only.
l Fault group FG-B (parent) contains child fault groups FG-C and FG-D. Each child fault group also contains nodes.
l Fault group FG-E (parent) contains child fault groups FG-F and FG-G. The parent fault group FG-E also contains nodes.

How to create fault groups
Before you define fault groups, you must have a thorough knowledge of your physical cluster layout. Fault groups require careful planning. To define fault groups, create an input file describing your cluster arrangement. Then pass the file to a Vertica-supplied script, which returns the SQL statements you need to run; they look similar to the sketch below. See Fault Groups in the Administrator's Guide for details.
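A minimal sketch of the kind of SQL the script produces, assuming a fault group named rack1 and standard node names (both hypothetical):

=> CREATE FAULT GROUP rack1;
=> ALTER FAULT GROUP rack1 ADD NODE v_vmart_node0001;
=> ALTER FAULT GROUP rack1 ADD NODE v_vmart_node0002;
=> SELECT * FROM fault_groups;   -- verify the new group membership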
Vertica Components
This section provides an overview of the components that make up Vertica:

l Logical Schema
l Physical Schema
l Database
l Management Console
l Administration Tools

These components allow you to tune and control your Vertica Analytics Platform with minimal effort, reducing the time a database administrator would typically spend identifying and resolving issues.

Logical Schema
Design a logical schema for a Vertica database as you would for any SQL database. A logical schema consists of objects such as:

l Schema
l Table
l View
l Referential Integrity

Vertica supports any relational schema design that you choose. For more information, see Designing a Logical Schema in the Administrator's Guide.

Physical Schema
Unlike traditional databases where data is stored in tables, physical storage in Vertica consists of collections of table columns called projections. Projections store data in a format that optimizes query execution. They are similar to materialized views in that they store result sets on disk rather than compute them each
time they are used in a query. Vertica automatically refreshes the result sets whenever you update or load new data. Projections provide the following benefits:

l Projections compress and encode data to reduce storage space. Additionally, Vertica operates on the encoded data representation whenever possible to avoid the cost of decoding. This combination of compression and encoding optimizes disk space while maximizing query performance.
l Projections can be segmented or replicated across database nodes depending on their size. For instance, projections for large tables can be segmented and distributed across all nodes. Unsegmented projections for small tables can be replicated across all nodes in the database.
l Projections are transparent to end-users of SQL. The Vertica query optimizer automatically picks the best projections to use for a query.
l Projections provide high availability and recovery. To ensure high availability and recovery, Vertica duplicates table columns on at least K+1 nodes in the cluster. Therefore, if one machine fails in a K-Safe environment, the database continues to operate using replicated data on the remaining nodes. When the node resumes its normal operation, it automatically recovers its data and lost objects by querying other nodes. See High Availability and Recovery for an overview of this feature and High Availability With Projections for an explanation of how Vertica uses projections to ensure high availability and recovery.

How Projections Are Created
For each table in the database, Vertica requires a minimum of one projection, called a superprojection. A superprojection is a projection for a single table that contains all the columns in the table. To get your database operating quickly, Vertica automatically creates a superprojection for each table in the database when you load or update data into that table. This ensures that all SQL queries provide results.

Default superprojections do not exploit the full power of Vertica. Therefore, Vertica recommends loading a sample of your data and then running the Database Designer to create optimized projections. Database Designer creates new projections that optimize
your database based on its data statistics and the queries you use. The Database Designer:

1. Analyzes your logical schema, sample data, and (optionally) sample queries.
2. Creates a physical schema design (projections) in the form of a SQL script that can be deployed automatically or manually.

In most cases, the designs created by the Database Designer provide excellent query performance within physical constraints. The Database Designer uses sophisticated strategies to provide excellent ad-hoc query performance while using disk space efficiently. If you prefer, you can design custom projections. For more information about creating projections, see Creating a Database Design in the Administrator's Guide.

Projection Anatomy
The CREATE PROJECTION statement defines the individual elements of a projection, as shown in the following SQL example:

=> CREATE PROJECTION retail_sales_fact_P1 (
     store_sales_fact_store_key ENCODING RLE,
     store_sales_fact_pos_transaction_number ENCODING RLE,
     store_sales_fact_sales_dollar_amount,
     store_sales_fact_cost_dollar_amount )
   AS SELECT T_store_sales_fact.store_key,
             T_store_sales_fact.pos_transaction_number,
             T_store_sales_fact.sales_dollar_amount,
             T_store_sales_fact.cost_dollar_amount
   FROM store_sales_fact T_store_sales_fact
   ORDER BY T_store_sales_fact.store_key
   SEGMENTED BY HASH (T_store_sales_fact.pos_transaction_number) ALL NODES OFFSET 1;

This SQL statement breaks down as follows:

Column List and Encoding
This portion of the SQL statement lists every column in the projection and defines the encoding for each column. Vertica can operate on encoded data, so HPE recommends using data encoding because it results in less disk I/O.

=> CREATE PROJECTION retail_sales_fact_P1 (
     store_sales_fact_store_key ENCODING RLE,
     store_sales_fact_pos_transaction_number ENCODING RLE,
     store_sales_fact_sales_dollar_amount,
     store_sales_fact_cost_dollar_amount
   )

Base Query
This portion of the SQL statement identifies the columns to incorporate in the projection using column name and table name references. The base query for large table projections can contain PK/FK joins to smaller tables. See Pre-Join Projections and Join Predicates.

   AS SELECT T_store_sales_fact.store_key,
             T_store_sales_fact.pos_transaction_number,
             T_store_sales_fact.sales_dollar_amount,
             T_store_sales_fact.cost_dollar_amount
   FROM store_sales_fact T_store_sales_fact

Sort Order
This portion of the SQL statement determines the sort order. The sort order localizes logically grouped values so that a disk read can identify many results at once. For maximum performance, do not sort projections on LONG VARBINARY and LONG VARCHAR columns. For more information, see ORDER BY Clause.

   ORDER BY T_store_sales_fact.store_key

Segmentation
This portion of the SQL statement segments the projection across all nodes in the database. This maximizes database performance by distributing the load. For large tables, use SEGMENTED BY HASH to perform segmentation, as in this example. For small tables, use the UNSEGMENTED keyword to replicate them rather than segment them. Replication creates and stores identical copies of projections for small tables across all nodes in the cluster. Replication ensures high availability and recovery. For maximum performance, do not segment projections on LONG VARBINARY and LONG VARCHAR columns. For more information, see Projection Segmentation.

   SEGMENTED BY HASH (T_store_sales_fact.pos_transaction_number) ALL NODES OFFSET 1;
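For the small-table case just described, a replicated projection might look like the following sketch (table and column names are hypothetical):

=> CREATE PROJECTION date_dim_rep (
     date_key,
     calendar_date )
   AS SELECT date_key, calendar_date
   FROM date_dim
   ORDER BY date_key
   UNSEGMENTED ALL NODES;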
Projection Segmentation
Projection segmentation splits individual projections into chunks of data of similar size, called segments. One segment is created for and stored on each node. Projection segmentation:

l Ensures high availability and recovery through K-safety.
l Spreads the query execution workload across multiple nodes.
l Allows each node to be optimized for different query workloads.

Vertica segments large tables to spread the query execution workload across multiple nodes. Vertica replicates small tables, creating a duplicate of each unsegmented projection on each node.

Hash Segmentation
Vertica uses hash segmentation to segment large projections. Hash segmentation allows you to segment a projection based on a built-in hash function that provides even distribution of data across multiple nodes, resulting in optimal query execution. In a projection, the data to be hashed consists of one or more column values, each having a large number of unique values and an acceptable amount of skew in the value distribution. Primary key columns that meet these criteria can be an excellent choice for hash segmentation.

Database
This section covers the following database elements:

l Database Setup
l Database Connections
l Database Security
l Database Designer
l Data Loading
l Workload Management
l Database Locks
Database Setup
This page provides an overview of setting up a Vertica database. For complete details, see Configuring the Database.

Prepare SQL Scripts and Data Files
Prepare the following files before installing Vertica:

l Logical schema script
l Loadable data files
l Load scripts
l Sample query script (training set)

Create the Database
Create the database after installing Vertica on at least one host:

l Use the Administration Tools to:
  n Create a database
  n Connect to the database
l Use the Database Designer to design the physical schema.
l Use the vsql interactive interface to run SQL scripts that:
  n Create tables and constraints
  n Create projections

Test the Empty Database
l Test for sufficient projections using the sample query script
l Test the projections for K-safety
Test the Partially-Loaded Database
l Load the dimension tables
l Partially load the fact table
l Check system resource usage
l Check query execution times
l Check projection usage

Complete the Fact Table Load
l Monitor system usage
l Complete the fact table load

Set up Security
For security-related tasks, see Implementing Security.

l [Optional] Set up SSL
l [Optional] Set up client authentication
l Set up database users and privileges

Set up Incremental Loads
Set up periodic (trickle) loads; see Trickle Loading.

Database Connections
You can connect to a Vertica database in the following ways:

l Interactively using the vsql client, as described in Using vsql in the Administrator's Guide. vsql is a character-based, interactive, front-end utility that lets you type SQL statements and see the results. It also provides a number of meta-commands and various shell-like features that facilitate writing scripts and automating a variety of tasks.
You can run vsql on any node within a database. To start vsql, use the Administration Tools or the shell command described in Using vsql.

l Programmatically using the JDBC driver provided by Vertica, as described in Programming JDBC Client Applications in Connecting to Vertica. An abbreviation for Java Database Connectivity, JDBC is a call-level application programming interface (API) that provides connectivity between Java programs and data sources (SQL databases and other non-relational data sources, such as spreadsheets or flat files). JDBC is included in the Java 2 Standard and Enterprise editions.
l Programmatically using the ODBC driver provided by Vertica, as described in Programming ODBC Client Applications in Connecting to Vertica. An abbreviation for Open DataBase Connectivity, ODBC is a standard application programming interface (API) for access to database management systems.
l Programmatically using the ADO.NET driver provided by Vertica, as described in Programming ADO.NET Applications in Connecting to Vertica. The Vertica driver for ADO.NET allows applications written in C# and Visual Studio to read data from, update, and load data into Vertica databases. It provides a data adapter that facilitates reading data from a database into a data set, and then writing changed data from the data set back to the database. It also provides a data reader (VerticaDataReader) for reading data and autocommit functionality for committing transactions automatically.
l Programmatically using Perl and the DBI driver, as described in Programming Perl Client Applications in Connecting to Vertica. Perl is a free, stable, open source, cross-platform programming language licensed under its Artistic License, or the GNU General Public License (GPL).
l Programmatically using Python and the Vertica Python Client or the pyodbc driver, as described in Programming Python Client Applications in Connecting to Vertica. Python is a free, agile, object-oriented, cross-platform programming language designed to emphasize rapid development and code readability.
HPE recommends that you deploy Vertica as the only active process on each machine in the cluster and connect to it from applications on different machines. Vertica expects to use all available resources on the machine; if other applications also use these resources, suboptimal performance could result.

Database Security
Vertica secures access to the database and its resources by enabling you to control user access to the database and which tasks users are authorized to perform. See Implementing Security.

Database Designer
Vertica's Database Designer is a tool that:

l Analyzes your logical schema, sample data, and, optionally, your sample queries.
l Creates a physical schema design that can be deployed automatically or manually.
l Can be used by anyone without specialized database knowledge. Even business users can run Database Designer.
l Can be run and re-run any time for additional optimization without stopping the database.

Run the DBD
Run the Database Designer in one of the following ways:

l With the Management Console, as described in Using Management Console to Create a Design.
l Programmatically, using the steps described in About Running Vertica Programmatically.
l With the Administration Tools, by selecting Configuration Menu > Run Database Designer. For details, see Using the Administration Tools to Create a Design.

Use the Database Designer to create one of the following types of designs:

l A comprehensive design that allows you to create new projections for all tables in your database.
l An incremental design that creates projections for all tables referenced in the queries you supply.

Database Designer benefits include:

l Accepting up to 100 queries in the query input file for an incremental design.
l Accepting unlimited queries for a comprehensive design.
l Producing higher quality designs by considering UPDATE and DELETE statements.

In most cases, the designs created by Database Designer provide optimal query performance within physical constraints. Database Designer uses sophisticated strategies to provide optimal query performance and data compression.

See Also
l Physical Schema
l Creating a Database Design

Data Loading
The SQL Data Manipulation Language (DML) commands INSERT, UPDATE, and DELETE perform the same functions in Vertica as they do in row-oriented databases. These commands follow the SQL-92 transaction model and can be intermixed.

Use the COPY statement for bulk loading data. COPY reads data from text files or data pipes and inserts it into the WOS (memory) or directly into the ROS (disk). COPY can load compressed formats such as GZIP and LZO. COPY automatically commits itself and any current transaction but is not atomic; some rows could be rejected. Note that COPY does not automatically commit when copying data into temporary tables. You can use the COPY statement's NO COMMIT option to prevent COPY from committing a transaction when it finishes copying data. This lets you ensure that the data in the bulk load is either committed or rolled back at the same time. Also, combining multiple smaller data loads into a single transaction allows Vertica to load the data more efficiently; see the sketch below. See the COPY statement in the SQL Reference Manual for more information. You can use multiple, simultaneous database connections to load and modify data. For more information about bulk loading, see Bulk Loading Data.
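A minimal sketch of these options (table name and file paths are hypothetical):

=> COPY store_sales FROM '/data/sales.dat.gz' GZIP DELIMITER '|' DIRECT;  -- compressed bulk load straight to the ROS

-- Combine several loads into one transaction with NO COMMIT:
=> COPY store_sales FROM '/data/day1.dat' DELIMITER '|' NO COMMIT;
=> COPY store_sales FROM '/data/day2.dat' DELIMITER '|' NO COMMIT;
=> COMMIT;   -- both loads commit (or roll back) together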
Workload Management
Vertica's resource management scheme allows diverse, concurrent workloads to run efficiently on the database. For basic operations, Vertica pre-configures the built-in GENERAL pool based on RAM and machine cores. You can customize the GENERAL pool to handle specific concurrency requirements.

You can also define new resource pools that you configure to limit memory usage, concurrency, and query priority. You can then optionally assign each database user to a specific resource pool, which controls the memory resources used by their requests. User-defined pools are useful if you have competing resource requirements across different classes of workloads. Example scenarios include:

l A large batch job takes up all server resources, leaving small jobs that update a web page without enough resources, which can degrade user experience. In this scenario, create a resource pool to handle web page requests and ensure users get the resources they need. Another option is to create a limited resource pool for the batch job, so the job cannot use up all system resources.
l An application has lower priority than other applications, and you want to limit the amount of memory and the number of concurrent users for the low-priority application. In this scenario, create a resource pool with an upper limit on query memory and associate the pool with users of the low-priority application.

For more information, see Managing Workload Resources in the Administrator's Guide. A sketch of the first scenario appears below.
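A minimal sketch of the web-request scenario above (the pool name, sizes, and user name are hypothetical):

=> CREATE RESOURCE POOL web_pool MEMORYSIZE '500M' MAXCONCURRENCY 20 PRIORITY 10;
=> ALTER USER web_user RESOURCE POOL web_pool;   -- web_user's queries now draw from web_pool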
Vertica Database Locks
In an environment where multiple users concurrently access the same database information, data manipulation can cause conflicts and threaten data integrity. Conflicts occur because some transactions block other operations until the transaction completes. Because transactions committed at the same time should produce consistent results, Vertica uses locks to maintain data concurrency and consistency. Vertica automatically controls locking by limiting the actions a user can take on an object, depending on the state of that object.

Vertica uses object locks and system locks. Object locks are used on objects, such as tables and projections. System locks include global catalog locks, local catalog locks, and elastic cluster locks.

Lock Modes
Vertica has different lock modes that determine how a lock acts on an object. Each lock mode has a lock compatibility and lock strength that reflect how it interacts with other locks in the same environment.

l S (Shared) - Use a shared lock for SELECT queries that run at the serialized transaction isolation level. This allows queries to run concurrently, but the S lock creates the effect that transactions are running in serial order. The S lock ensures that one transaction does not affect another transaction until one transaction completes and its S lock is released. SELECT operations in READ COMMITTED transaction mode do not require S table locks. See Transactions in Vertica Concepts for more information.
l I (Insert) - Vertica requires an insert lock to insert data into a table. Multiple transactions can lock an object in insert mode simultaneously, enabling multiple inserts and bulk loads to occur at the same time. This behavior is critical for parallel loads and high ingestion rates.
l SI (Shared Insert) - Vertica requires a shared insert lock when both a read and an insert occur in a transaction. Shared insert mode prohibits delete/update operations. An SI lock also results from lock promotion.
l X (Exclusive) - Vertica uses exclusive locks when performing deletes and updates. Only mergeout and moveout operations (U locks) can run concurrently on objects with X locks.
l T (Tuple Mover) - The Tuple Mover uses T locks for operations on delete vectors. Tuple Mover operations upgrade the table lock mode from U to T when work on delete vectors starts, so that no other updates or deletes can happen concurrently. T locks are also used for COPY into pre-join projections.
l U (Usage) - Vertica uses usage locks for moveout and mergeout Tuple Mover operations. These Tuple Mover operations run automatically in the background; therefore, most other operations (except those requiring an O lock) can run when the object is locked in U mode.
l O (Owner) - An O lock is the strongest Vertica lock mode. An object acquires an owner lock when it undergoes changes in both data and structure. Such changes can occur in some DDL operations, such as DROP_PARTITION, TRUNCATE TABLE, and ADD COLUMN. When an object is locked in O mode, it cannot be locked simultaneously by another transaction in any mode.
l IV (Insert-Validate) - An insert-validate lock is needed for insert operations where the system performs constraint validation for enabled PRIMARY or UNIQUE key constraints.

Lock Compatibility
Lock compatibility refers to having two locks in effect on the same object at the same time.

Lock Compatibility Matrix
This matrix shows which locks can be used on the same object simultaneously. When two lock modes intersect in a Yes cell, those modes are compatible. If two requested modes intersect in a No cell, the second request is not granted until the first request releases its lock.

Granted    Requested Mode
Mode       S    I    IV   SI   X    T    U    O
S          Yes  No   No   No   No   Yes  Yes  No
I          No   Yes  Yes  No   No   Yes  Yes  No
IV         No   Yes  No   No   No   Yes  Yes  No
SI         No   No   No   No   No   Yes  Yes  No
X          No   No   No   No   No   No   Yes  No
T          Yes  Yes  Yes  Yes  No   Yes  Yes  No
U          Yes  Yes  Yes  Yes  Yes  Yes  Yes  No
O          No   No   No   No   No   No   No   No

Lock Upgrade Matrix
This matrix shows how your object lock responds to an INSERT request. If an object has an S lock and you want to do an INSERT, your transaction requests an SI lock. However, if an object has an S lock and you want to perform an operation that requires an S lock, no lock request is issued.

Granted    Requested Mode
Mode       S    I    IV   SI   X    T    U    O
S          S    SI   SI   SI   X    S    S    O
I          SI   I    IV   SI   X    I    I    O
IV         SI   IV   IV   SI   X    IV   IV   O
SI         SI   SI   SI   SI   X    SI   SI   O
X          X    X    X    X    X    X    X    O
T          S    I    IV   SI   X    T    T    O
U          S    I    IV   SI   X    T    U    O
O          O    O    O    O    O    O    O    O

Lock Strength
Lock strength refers to the ability of a lock mode to interact with another lock mode. The strongest lock mode, the O lock, is not compatible with any other lock mode. Conversely, the U lock is the weakest lock mode; the only operation that cannot run concurrently with it is one that requires an O lock.

See Also: Vertica Database Locks, Lock Example, LOCKS

Lock Example
In this example, two sessions (A and B) are active and attempting to act on table T1. At the beginning of the example, table T1 has one column (C1) and no rows. The steps here represent a possible series of transactions from sessions A and B:

1. Transactions in both sessions acquire S locks to read from table T1.
2. Session B releases its S lock with the COMMIT statement.
  • 309. 3. Session A can upgrade to an SI lock and insert into T1 since Session B released its S lock. 4. Session A releases its SI lock with a COMMIT statement. 5. Both sessions can acquire S locks because Session A released its SI lock. 6. Session A can’t acquire an SI lock because Session B has not released its S lock. SI locks are incompatible with S locks. 7. Session B releases its S lock with the COMMIT statement. 8. Session A can now upgrade to an SI lock and insert into T1. 9. Session B attempts to delete a row from T1 but can’t acquire an X lock because Session A has not released its SI lock. SI locks are incompatible with X locks. 10. Session A continues to insert into T1. 11. Session A releases its SI lock. 12. Session B can now acquire an X lock and perform the delete. This figure illustrates the previous steps: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 309 of 5309
See Also: Vertica Database Locks, Lock Modes

Troubleshooting Locks
The LOCKS and LOCK_USAGE system tables can help identify problems you may encounter with Vertica database locks.

This example shows one row from the LOCKS system table. From this table you can see what types of locks are active on specific objects and nodes.

=> SELECT node_names, object_name, lock_mode, lock_scope FROM LOCKS;
 node_names | object_name                      | lock_mode | lock_scope
------------+----------------------------------+-----------+------------
 node0001   | Table:public.customer_dimension  | X         | TRANSACTION

This example shows two rows from the LOCK_USAGE system table. You can also use this table to see what locks are in use on specific objects and nodes.

=> SELECT node_name, object_name, mode FROM LOCK_USAGE;
 node_name | object_name      | mode
-----------+------------------+------
 node0001  | Cluster Topology | S
 node0001  | Global Catalog   | X
Management Console
Management Console (MC) is a user-friendly performance monitoring and management tool that provides a unified view of your Vertica database operations. Using a browser, you can create, import, manage, and monitor one or more databases and their associated clusters. You can also create and manage MC users. You can then map the MC users to a Vertica database and manage them through the MC interface.

What You Can Do with Management Console

Create...
l A database cluster on hosts that do not have Vertica installed
l Multiple Vertica databases on one or more clusters from a single point of control
l MC users, and grant them access to MC and databases managed by MC

Configure...
l Database parameters and user settings dynamically
l Resource pools

Monitor...
l License usage and conformance
l Dynamic metrics about your database cluster
l Resource pools
l User information and activity on MC
l Alerts, by accessing a single message box of alerts for all managed databases
l Recent databases and clusters, through a quick link
l Multiple Vertica databases on one or more clusters from a single point of control

Import or Export...
l Import multiple Vertica databases on one or more clusters from a single point of control
l Export all database messages or log/query details to a file

Troubleshoot...
l MC-related issues through a browser

Management Console provides some, but not all, of the functionality that Administration Tools provides. Management Console also includes extended functionality not available in admintools. This additional functionality includes a graphical view of your Vertica database and detailed monitoring charts and graphs. See Administration Tools and Management Console in the Administrator's Guide for more information.

Getting MC
Download the Vertica server RPM and the MC package from the myVertica Portal. You then have two options:

l Install Vertica and MC at the command line, and import one or more Vertica database clusters into the MC interface
l Install Vertica directly through MC

See the Installation Guide for details.

What You Need to Know
If you plan to use MC, review the following topics in the Administrator's Guide:

If you want to ...                                    See ...
Create a new, empty Vertica database                  Create a Database on a Cluster
Import an existing Vertica database cluster into MC   Managing Database Clusters
Understand how MC users differ from database users    About MC Users
Read about the MC privilege model                     About MC Privileges and Roles
Create new MC users                                   Creating an MC User
Grant MC users privileges on one or more Vertica      Granting Database Access to MC Users
databases managed by MC
Use Vertica functionality through the MC interface    Using Management Console
Monitor MC and Vertica databases managed by MC        Monitoring Vertica Using Management Console
Monitor and configure resource pools                  Monitoring Resource Pools in Management Console

Management Console Architecture
MC accepts HTTP requests from a client web browser, gathers information from the Vertica database cluster, and returns that information to the browser for monitoring.

MC Components
The primary components that drive Management Console are an application/web server and agents that are installed on each node in the Vertica cluster.
Application/web Server
The application server hosts MC's web application and uses port 5450 for node-to-MC communication and to perform the following:

l Manage one or more Vertica database clusters
l Send rapid updates from MC to the web browser
l Store and report MC metadata, such as alerts and events, current node state, and MC users, in a lightweight, embedded (Derby) database
l Retain workload history

MC Agents
MC agents are internal daemon processes that run on each Vertica cluster node. The default agent port, 5444, must be available for MC-to-node and node-to-node communications. Agents monitor MC-managed Vertica database clusters and communicate with MC to provide the following functionality:

l Provide local access, command, and control over database instances on a given node, using functionality similar to Administration Tools.
l Report log-level data from the Administration Tools and Vertica log files.
l Cache details from long-running jobs (such as create/start/stop database operations) that you can view through your browser.
l Track changes to data-collection and monitoring utilities and communicate updates to MC.
l Communicate between all cluster nodes and MC through a webhook subscription, which automates information sharing and reports on cluster-specific issues like node state, alerts, and events.

See Also
l Monitoring Vertica Using MC
Management Console Security
The Management Console (MC) manages multiple Vertica clusters, all of which might have different levels and types of security, such as user names and passwords and LDAP authentication. You can also manage MC users who have varying levels of access across these components.

Open Authorization and SSL
Management Console (MC) uses a combination of OAuth (Open Authorization), Secure Socket Layer (SSL), and locally encrypted passwords to secure HTTPS requests between a user's browser and MC, and between MC and the agents. Authentication occurs through MC and between agents within the cluster. Agents also authenticate and authorize jobs.

The MC configuration process sets up SSL automatically, but you must have the openssl package installed on your Linux environment first.

See the following topics in the Administrator's Guide for more information:

l SSL Overview
l SSL Authentication
l Generating Certificates and Keys for MC
l Importing a New Certificate to MC

User Authentication and Access
MC provides two user authentication methods, LDAP or MC. You can use only one method at a time. For example, if you choose LDAP, all MC users are authenticated against your organization's LDAP server. You set up LDAP authentication through MC Settings > Authentication on the MC interface.

Note: MC uses LDAP data for authentication purposes only. It does not modify user information in the LDAP repository.

The MC authentication method stores MC user information internally and encrypts passwords. These MC users are not system (Linux) users. They are accounts that have
access to MC and, optionally, to one or more MC-managed Vertica databases through the MC interface.

Management Console also has rules for what users can see when they sign in to MC from a client browser. These rules are governed by access levels, each of which is made up of a set of roles.

See Also
l About MC Users
l About MC Privileges and Roles
l Creating an MC User

Management Console Home Page
The MC Home page is the entry point to all MC-managed Vertica database clusters and MC users. User access levels determine what a user can see on the MC Home page. Layout and navigation are described in Using Management Console.

Administration Tools
The Vertica Administration Tools allow you to easily perform administrative tasks. You can perform most Vertica database administration tasks with Administration Tools.

Run Administration Tools using the Database Administrator account on the Administration host, if possible. Make sure that no other Administration Tools processes are running. If the Administration host is unresponsive, run Administration Tools on a different node in the cluster. That node permanently takes over the role of Administration host.

Any user can view the man page available for admintools. Enter the following:

man admintools

Running Administration Tools
As the dbadmin user, you can run Administration Tools. The syntax follows:

/opt/vertica/bin/admintools [ { -h | --help }
                            | { -a | --help_all }
                            | [ --debug ]
                            | { -t | --tool } name_of_tool [options] ]

Options

-h
--help
Outputs abbreviated help.

-a
--help_all
Outputs verbose help, which lists all command-line sub-commands and options as shown in the Tools section below.

--debug
If you include the debug option, Vertica logs debug information.
Note: You can specify the debug option with or without naming a specific tool. If you specify debug with a specific tool, Vertica logs debug information during tool execution. If you do not specify a tool, Vertica logs debug information when you run tools through the admintools user interface.

{ -t | --tool } name_of_tool [options]
Specifies the tool to run, where name_of_tool is one of the tools described in the help output, and options are one or more comma-delimited tool arguments.

Note: Enter admintools -h to see the list of available tools. Enter admintools -t name_of_tool --help to review a specific tool's options.

An unqualified admintools command displays the Main Menu dialog box.
If you are unfamiliar with this type of interface, read Using the Administration Tools Interface.

First Login as Database Administrator
The first time you log in as the Database Administrator and run the Administration Tools, the user interface displays.

1. In the end-user license agreement (EULA) window, type accept to proceed. A window displays, requesting the location of the license key file you downloaded from the HPE Web site. The default path is /tmp/vlicense.dat.
2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and click OK.

Between Dialogs
While the Administration Tools are working, you see the command-line processing in a window. Do not interrupt the processing.
SQL in Vertica
Vertica offers a robust set of SQL elements that allow you to manage and analyze massive volumes of data quickly and reliably. Vertica uses the following:

SQL Language Elements, including:

l Keywords and Reserved Words
l Identifiers
l Literals
l Operators
l Expressions
l Predicates
l Hints

SQL Data Types, including:

l Binary
l Boolean
l Character
l Date/Time
l Long
l Numeric

SQL Functions, including Vertica-specific functions that take advantage of Vertica's unique column-store architecture. For example, use the ANALYZE_HISTOGRAM function to collect and aggregate a variable amount of sample data for statistical analysis.

SQL Statements that allow you to write robust queries to quickly return large volumes of data.
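A hedged sketch of an ANALYZE_HISTOGRAM call, assuming its two-argument form (object name, sampling percentage); the table and column are hypothetical:

=> SELECT ANALYZE_HISTOGRAM('public.tickstore.price', 10);  -- sample 10 percent of the rows in tickstore.price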
About Query Execution
When you submit a query, the initiator quickly chooses the projections to use, optimizes and plans the query execution, and logs the SQL statement to its log. This planning results in an Explain Plan. The Explain Plan maps out the steps the query performs. You can view it in the Management Console.

The optimizer breaks down the Explain Plan into smaller plans that it distributes to the executor nodes. In the final stages of query plan execution, the initiator node does the following:

l Combines results in a grouping operation
l Merges multiple sorted partial result sets from all the executors
l Formats the results to return to the client

For detailed information about writing and executing queries, see Queries in Analyzing Data.

Snapshot Isolation Mode
Vertica can run any SQL query in snapshot isolation mode in order to obtain the fastest possible execution. To be precise, snapshot isolation mode is actually a form of historical query. The syntax is:

AT EPOCH LATEST SELECT...

The command queries all data in the database up to but not including the current epoch, without holding a lock or blocking write operations. As a result, the query can miss rows loaded by other users up to (but no more than) a specific number of minutes before execution.

Historical Queries
Vertica can run a query from a snapshot of the database taken at a specific date and time or at a specific epoch. The syntax is:

AT TIME 'timestamp' SELECT...
AT EPOCH epoch_number SELECT...
AT EPOCH LATEST SELECT...
  • 325. The command queries all data in the database up to and including the specified epoch or the epoch representing the specified date and time, without holding a lock or blocking write operations. The specified TIMESTAMP and epoch_number values must be greater than or equal to the Ancient History Mark epoch. Historical queries are useful because they access data in past epochs only. Historical queries do not need to hold table locks or block write operations because they do not return the absolute latest data. Their content is private to the transaction and valid only for the length of the transaction. Historical queries behave in the same manner regardless of transaction isolation level. Historical queries observe only committed data, even excluding updates made by the current transaction, unless those updates are to a temporary table. Be aware that there is only one backup of the logical schema. This means that any changes you make to the schema are reflected across all epochs. If, for example, you add a new column to a table and you specify a default value for the column, all historical epochs display the new column and its default value. The DELETE command in Vertica does not actually delete data; it marks records as deleted. (The UPDATE command is actually a combined INSERT and a DELETE.) Thus, you can control how much deleted data is stored on disk. For more information, see Managing Disk Space in the Administrator's Guide. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 325 of 5309
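A couple of hedged examples of the syntax above (the table name and timestamp are hypothetical):

=> AT EPOCH LATEST SELECT COUNT(*) FROM store_sales;                -- everything committed before the current epoch
=> AT TIME '2016-04-01 09:00:00' SELECT COUNT(*) FROM store_sales;  -- the table as of that moment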
Transactions
When transactions in multiple user sessions concurrently access the same data, session-scoped isolation levels determine what data each transaction can access. A transaction retains its isolation level until it completes, even if the session's isolation level changes during the transaction. Vertica internal processes (such as the Tuple Mover and refresh operations) and DDL operations always run at the SERIALIZABLE isolation level to ensure consistency.

The Vertica query parser supports standard ANSI SQL-92 isolation levels as follows:

l READ UNCOMMITTED: Automatically interpreted as READ COMMITTED.
l READ COMMITTED (default)
l REPEATABLE READ: Automatically interpreted as SERIALIZABLE.
l SERIALIZABLE

Transaction isolation levels READ COMMITTED and SERIALIZABLE differ as follows:

Isolation level   Dirty read     Non-repeatable read   Phantom read
READ COMMITTED    Not possible   Possible              Possible
SERIALIZABLE      Not possible   Not possible          Not possible

You can set separate isolation levels for the database and individual transactions.

Implementation Details
Vertica supports conventional SQL transactions with standard ACID properties:

l ANSI SQL-92 style implicit transactions. You do not need to run a BEGIN or START TRANSACTION command.
l No redo/undo log or two-phase commits.
l The COPY command automatically commits itself and any current transaction (except when loading temporary tables). It is generally good practice to commit or roll back
the current transaction before you use COPY. This step is optional for DDL statements, which are auto-committed.

Rollback
Transaction rollbacks restore a database to an earlier state by discarding changes made by that transaction. Statement-level rollbacks discard only the changes initiated by the reverted statements. Transaction-level rollbacks discard all changes made by the transaction.

With a ROLLBACK statement, you can explicitly roll back to a named savepoint within the transaction, or discard the entire transaction. Vertica can also initiate automatic rollbacks in two cases:

l An individual statement returns an ERROR message. In this case, Vertica rolls back the statement.
l DDL errors, systemic failures, deadlocks, and resource constraints return a ROLLBACK message. In this case, Vertica rolls back the entire transaction.

Explicit and automatic rollbacks always release any locks that the transaction holds.

Savepoints
A savepoint is a special marker inside a transaction that allows commands that execute after the savepoint to be rolled back. The transaction is restored to the state that preceded the savepoint. Vertica supports two types of savepoints:

l An implicit savepoint is automatically established after each successful command within a transaction. This savepoint is used to roll back the next statement if it returns an error. A transaction maintains one implicit savepoint, which it rolls forward with each successful command. Implicit savepoints are available to Vertica only and cannot be referenced directly.
l Named savepoints are labeled markers within a transaction that you set through SAVEPOINT statements. A named savepoint can later be referenced in the same transaction through RELEASE SAVEPOINT, which destroys it, and ROLLBACK TO SAVEPOINT, which rolls back all operations that followed the savepoint. Named savepoints can be especially useful in nested transactions: a nested transaction that begins with a savepoint can be rolled back entirely, if necessary.
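A minimal sketch of a named savepoint in action (the table name is hypothetical):

=> INSERT INTO t1 VALUES (1);
=> SAVEPOINT first_row;
=> INSERT INTO t1 VALUES (2);
=> ROLLBACK TO SAVEPOINT first_row;   -- discards only the second insert
=> COMMIT;                            -- commits the first insert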
READ COMMITTED Isolation
When you use the isolation level READ COMMITTED, a SELECT query obtains a backup of committed data at the transaction's start. Subsequent queries during the current transaction also see the results of uncommitted updates that already executed in the same transaction.

When you use DML statements, your query acquires write locks to prevent other READ COMMITTED transactions from modifying the same data. However, be aware that SELECT statements do not acquire locks, so concurrent transactions can obtain read and write access to the same selection.

READ COMMITTED is the default isolation level. For most queries, this isolation level balances database consistency and concurrency. However, this isolation level can allow one transaction to change the data that another transaction is in the process of accessing. Such changes can yield nonrepeatable and phantom reads. You may have applications with complex queries and updates that require a more consistent view of the database. If so, use SERIALIZABLE isolation instead.
READ COMMITTED isolation maintains exclusive write locks until a transaction ends.
See Also

• Vertica Database Locks
• LOCKS
• SET SESSION CHARACTERISTICS
• Configuration Parameters

SERIALIZABLE Isolation

SERIALIZABLE is the strictest SQL transaction isolation level. While this isolation level permits transactions to run concurrently, it creates the effect that transactions are running in serial order. Transactions acquire locks for both read and write operations. Thus, successive SELECT commands within a single transaction always produce the same results. Because SERIALIZABLE isolation provides a consistent view of data, it is useful for applications that require complex queries and updates.
However, SERIALIZABLE isolation reduces concurrency. For example, it blocks queries during a bulk load.

SERIALIZABLE isolation establishes the following locks:

• Table-level read locks: Vertica acquires table-level read locks on selected tables and releases them when the transaction ends. This behavior prevents one transaction from modifying rows while they are being read by another transaction.
• Table-level write locks: Vertica acquires table-level write locks on update and releases them when the transaction ends. This behavior prevents one transaction from reading another transaction's changes to rows before those changes are committed.

At the start of a transaction, a SELECT statement obtains a backup of the selection's committed data. The transaction also sees the results of updates that are run within the transaction before they are committed.
Applications that use SERIALIZABLE must be prepared to retry transactions due to serialization failures. Such failures often result from deadlocks. When a deadlock occurs, any transaction awaiting a lock automatically times out after 5 minutes. The sketch below shows how a deadlock might occur and how Vertica handles it.
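A rough two-session sketch (tables t1 and t2 are illustrative):

-- Session 1                            Session 2
--
-- SELECT * FROM t1;                    SELECT * FROM t2;
--   (read lock on t1)                    (read lock on t2)
-- INSERT INTO t2 VALUES (...);
--   (waits for a write lock on t2)
--                                      INSERT INTO t1 VALUES (...);
--                                        (waits for a write lock on t1: deadlock;
--                                         the waiting transaction times out after
--                                         5 minutes and rolls back, and the
--                                         application must retry it)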
Note: SERIALIZABLE isolation does not apply to temporary tables. No locks are required for these tables because they are isolated by their transaction scope.

See Also

• Vertica Database Locks
• LOCKS
Extending Vertica

Vertica lets you extend its capabilities to perform new operations or handle new data types, using the following:

• User-defined SQL functions let you store frequently used SQL expressions.
• User-defined extensions and user-defined functions let you develop analytic or data-loading tools in C++, Java, or R.
• External procedures let you run external scripts installed on your database cluster.

User-Defined SQL Functions

User-defined SQL functions allow you to create and store commonly used SQL expressions. You can use a user-defined SQL function anywhere in a query where an ordinary SQL expression can be used. You need USAGE privileges on the schema and EXECUTE privileges on the function to run a user-defined SQL function.

For information on creating and managing user-defined SQL functions, see Using User-Defined SQL Functions. A minimal example follows.
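A minimal sketch of creating and calling a user-defined SQL function (the function name is illustrative; the CREATE FUNCTION syntax for SQL functions is documented in the SQL Reference Manual):

=> CREATE FUNCTION zero_if_null(x INT) RETURN INT
   AS BEGIN
      RETURN (CASE WHEN (x IS NOT NULL) THEN x ELSE 0 END);
   END;
=> SELECT zero_if_null(NULL);  -- returns 0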
User-Defined Extensions and User-Defined Functions

User-Defined Extension (UDx) refers to all extensions to Vertica developed using the APIs in the Vertica SDK. UDxs encompass functions such as user-defined scalar functions (UDSFs), and utilities such as the user-defined load (UDL) feature that lets you create custom data-load routines. Thanks to their tight integration with Vertica, UDxs usually perform better than user-defined SQL functions or external procedures.

User-Defined Functions (UDFs) are a specific type of UDx. You use them in SQL statements to process data, much like Vertica's own built-in functions. They give you the power to create your own functions that run only slightly slower than Vertica's built-in functions.

The Vertica SDK uses the term UDx extensively, even for APIs that deal exclusively with developing UDFs.

External Procedures

External procedures allow you to call a script or executable program stored on your database cluster. You can pass literal values to the external procedure as arguments. The external procedure cannot communicate back to Vertica.

For information on creating and using external procedures, see Using External Procedures.
International Languages and Character Sets

This section describes how Vertica handles internationalization and character sets.

Unicode Character Encoding

UTF-8 is an abbreviation for Unicode Transformation Format-8 (where 8 means 8-bit) and is a variable-length character encoding for Unicode created by Ken Thompson and Rob Pike. UTF-8 can represent any universal character in the Unicode standard, yet its initial encoding of byte codes and character assignments coincides with ASCII, so software that handles ASCII requires little or no change while other byte values are preserved.

All input data received by the database server is expected to be in UTF-8, and all data output by Vertica is in UTF-8. The ODBC API operates on data in UCS-2 on Windows systems, and normally UTF-8 on Linux systems. The JDBC and ADO.NET APIs operate on data in UTF-16. The client drivers automatically convert data to and from UTF-8 when sending data to and receiving data from Vertica using API calls. The drivers do not transform data loaded by executing a COPY or COPY LOCAL statement.

See Implement Locales for International Data Sets in the Administrator's Guide for details.

Locales

The locale is a parameter that defines the user's language, country, and any special variant preferences, such as collation. Vertica uses the locale to determine the behavior of certain string functions. The locale also determines the collation for various SQL commands that require ordering and comparison, such as GROUP BY, ORDER BY, joins, and the analytic ORDER BY clause.

By default, the locale for your Vertica database is en_US@collation=binary (English US). You can define a new default locale that is used for all sessions on the database. You can also override the locale for individual sessions. However, projections are always collated using the default en_US@collation=binary collation, regardless of the session collation. Any locale-specific collation is applied at query time.

You can set the locale through ODBC, JDBC, and ADO.NET.
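For example, to override the collation locale for the current session only (a sketch; SET LOCALE is described in the SQL Reference Manual, and vsql offers an equivalent \locale meta-command):

=> SET LOCALE TO 'en_GB';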
See the following topics in the Administrator's Guide for details:

• Implement Locales for International Data Sets
• Supported Locales in the Appendix

String Functions

Vertica provides string functions to support internationalization. Unless otherwise specified, these string functions can optionally specify whether VARCHAR arguments should be interpreted as octet (byte) sequences or as (locale-aware) sequences of characters. You do so by adding "USING OCTETS" or "USING CHARACTERS" (the default) as a parameter to the function. See String Functions for details.

Character String Literals

By default, string literals ('...') treat backslashes literally, as specified in the SQL standard.

Tip: If you have used previous releases of Vertica and you do not want string literals to treat backslashes literally (for example, you are using a backslash as part of an escape sequence), you can turn off the StandardConformingStrings configuration parameter. See Internationalization Parameters in the Administrator's Guide. You can also use the EscapeStringWarning parameter to locate backslashes that have been incorporated into string literals, so that you can remove them.

See Character String Literals for details.
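To illustrate the octet/character distinction, compare the byte and character lengths of a multibyte string (LENGTH and OCTET_LENGTH are standard Vertica string functions; the sample string is illustrative):

=> SELECT LENGTH('étoile');        -- 6 characters
=> SELECT OCTET_LENGTH('étoile');  -- 7 bytes: 'é' occupies two bytes in UTF-8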
Administrator's Guide
Administration Overview

This document describes the functions performed by a Vertica database administrator (DBA). Perform these tasks using only the dedicated database administrator account that was created when you installed Vertica. The examples in this documentation set assume that the administrative account name is dbadmin.

• To perform certain cluster configuration and administration tasks, the DBA (users of the administrative account) must be able to supply the root password for those hosts. If this requirement conflicts with your organization's security policies, these functions must be performed by your IT staff.
• If you perform administrative functions using an account other than the one provided during installation, Vertica encounters file ownership problems.
• If you share the administrative account password, make sure that only one user runs the Administration Tools at any time. Otherwise, automatic configuration propagation does not work correctly.
• The Administration Tools require that the calling user's shell be /bin/bash. Other shells give unexpected results and are not supported.
Managing Licenses

You must license Vertica in order to use it. Hewlett Packard Enterprise supplies your license in the form of one or more license files, which encode the terms of your license. The available licenses include:

• vlicense.dat, for columnar tables.
• vlicense_565_bytes.dat, for data stored in a Hadoop environment with Vertica for SQL on Hadoop.

Do not open the license files in an editor or email client. Doing so can introduce special characters, such as line endings and file terminators, that may not be visible within the editor. Whether visible or not, these characters invalidate the license.

Applying License Files

For ease of Vertica Premium Edition and SQL on Hadoop installation, HPE recommends that you copy the license file to /tmp/vlicense.dat on the Administration Host. Be careful not to change the license key file in any way when copying it between Windows and Linux, or to any other location. To help prevent applications from altering the file, enclose the license file in an archive file (such as a .zip or .tar file). After copying the license file from one location to another, check that the copied file size is identical to that of the one you received from Vertica.

Obtaining a License Key File

To obtain a license key (for example, for Premium Edition or SQL on Hadoop), contact Vertica at:

http://www.vertica.com/about/contact-us/

Your Vertica Community Edition download package includes the Community Edition license, which allows three nodes and 1TB of data. The Vertica Community Edition license does not expire.

Understanding Vertica Licenses

Vertica has flexible licensing terms. It can be licensed on the following bases:
• Term-based (valid until a specific date)
• Raw data size-based (valid to store up to a specified amount of raw data)
• Both term- and data-size-based
• Unlimited duration and data storage
• Node-based, with an unlimited number of CPUs and users (one node is a server acting as a single computer system, whether physical or virtual)

Vertica Community Edition licenses include 1 terabyte of data and a limit of 3 nodes. Vertica for SQL on Hadoop is a separate product with its own license. This documentation covers both products.

Your license key has your licensing bases encoded into it. If you are unsure of your current license, you can view your license information from within Vertica.

HPE Vertica Analytics Platform License Types

HPE Vertica Analytics Platform is a full-featured offering with all analytical functions described in this documentation. It is best used for advanced analytics and enterprise data warehousing. There are two editions, Community Edition and Premium Edition. To run Vertica in a Hadoop environment, you purchase a separate Hadoop license.

Vertica Community Edition. You can download and start using Community Edition for free. The Community Edition license allows customers the following:

• 3-node limit
• 1 terabyte data limit

Community Edition licenses cannot be installed co-located in a Hadoop infrastructure and used to query data stored in Hadoop formats.

Vertica Premium Edition. You can purchase the Premium Edition license. The Premium Edition license entitles customers to:

• No node limit
• Data amount as specified by the license
• Query data stored in HDFS using the HCatalog Connector or HDFS Connector, and back up Vertica data to HDFS

Premium Edition licenses cannot be installed co-located in a Hadoop infrastructure and used to query data stored in Hadoop formats.

Note: Vertica does not support license downgrades.

Vertica for SQL on Hadoop License

Vertica for SQL on Hadoop is a license for running Vertica in a Hadoop environment. It allows users to run Vertica on data that is in a shared storage environment, and is best used for exploring data in a Hadoop data lake. It can be used only in co-located Hadoop environments to query data stored in Hadoop (Hortonworks, MapR, or Cloudera).

Customers can purchase this term-based SQL on Hadoop license per the number of nodes they plan to use in their Hadoop environment. The license then audits the number of nodes being used for compliance.

Installing or Upgrading a License Key

The steps you follow to apply your Vertica license key vary, depending on the type of license you are applying and whether you are upgrading your license. This section describes the following:

• New Vertica License Installations
• Vertica License Renewals or Upgrades

New Vertica License Installations

1. Copy the license key file to your Administration Host.
2. Ensure that the license key file's permissions are set to 400 (read permissions).
3. Install Vertica as described in Installing Vertica if you have not already done so. The interface prompts you for the license key file.
4. To install Community Edition, leave the default path blank and click OK. To apply your evaluation or Premium Edition license, enter the absolute path of the license key file you downloaded to your Administration Host and press OK.
The first time you log in as the Database Administrator and run the Administration Tools, the interface prompts you to accept the End-User License Agreement (EULA).

Note: If you installed Management Console, the MC administrator can point to the location of the license key during Management Console configuration.

5. Choose View EULA.
6. Exit the EULA and choose Accept EULA to officially accept the EULA and continue installing the license, or choose Reject EULA to reject the EULA and return to the Advanced Menu.

Vertica License Renewals or Upgrades

If your license is expiring or you want your database to grow beyond your licensed data size, you must renew or upgrade your license. Once you have obtained your renewal or upgraded license key file, you can install it using the Administration Tools or Management Console.

Uploading or Upgrading a License Key Using Administration Tools

1. Copy the license key file to your Administration Host.
2. Ensure that the license key file's permissions are set to 400 (read permissions).
3. Start your database, if it is not already running.
4. In the Administration Tools, select Advanced > Upgrade License Key and click OK.
5. Enter the absolute path to your new license key file and click OK. The interface prompts you to accept the End-User License Agreement (EULA).
6. Choose View EULA.
7. Exit the EULA and choose Accept EULA to officially accept the EULA and continue installing the license, or choose Reject EULA to reject the EULA and return to the Advanced Tools menu.
Uploading or Upgrading a License Key Using Management Console

1. From your database's Overview page in Management Console, click the License tab. The License page displays, where you can view your installed licenses.
2. Click Install New License at the top of the License page.
3. Browse to the location of the license key on your local computer and upload the file.
4. Click Apply at the top of the page. Management Console prompts you to accept the End-User License Agreement (EULA).
5. Select the check box to officially accept the EULA and continue installing the license, or click Cancel to exit.

Note: As soon as you renew or upgrade your license key from either your Administration Host or Management Console, Vertica applies the license update. No further warnings appear.

Viewing Your License Status

You can use several functions to display your license terms and current status.

Examining Your License Key

Use the DISPLAY_LICENSE SQL function, described in the SQL Reference Manual, to display the license information. This function displays the dates for which your license is valid (or Perpetual if your license does not expire) and any raw data allowance. For example:

=> SELECT DISPLAY_LICENSE();
        DISPLAY_LICENSE
----------------------------------
 Vertica Systems, Inc.
 1/1/2011
 12/31/2011
 30
 50TB
(1 row)

You can also query the LICENSES system table to view information about your installed licenses. This table displays your license types, the dates for which your licenses are valid, and the size and node limits your licenses impose.
Alternatively, use the License page in Management Console: on your database's Overview page, click the License tab to view information about your installed licenses.

Viewing Your License Compliance

If your license includes a raw data size allowance, Vertica periodically audits your database's size to ensure that it remains compliant with the license agreement. If your license has a term limit, Vertica also periodically checks to see whether the license has expired. You can see the result of the latest audits using the GET_COMPLIANCE_STATUS function.

=> SELECT GET_COMPLIANCE_STATUS();
                       GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------
 Raw Data Size: 2.00GB +/- 0.003GB
 License Size : 4.000GB
 Utilization  : 50%
 Audit Time   : 2011-03-09 09:54:09.538704+00
 Compliance Status : The database is in compliance with respect to raw data size.
 License End Date: 04/06/2011
 Days Remaining: 28.59
(1 row)

Viewing Your License Status Through MC

Information about license usage is on the Settings page. See Monitoring Database Size for License Compliance.

Calculating the Database Size

You can use your Vertica software until your columnar data reaches the maximum raw data size that the license agreement provides. This section describes when data is monitored, what data is included in the estimate, and the general methodology used to produce an estimate. For more information about monitoring data size, see Monitoring Database Size for License Compliance.

How Vertica Estimates Raw Data Size

Vertica uses statistical sampling to calculate an accurate estimate of the raw data size of the database. In this context, raw data means the uncompressed data stored in a single Vertica database. For the purposes of license size audit and enforcement, Vertica evaluates the raw data size as if the data had been exported from the database in text format, rather than as compressed data.

Statistical sampling allows Vertica to estimate the size of the database without significantly impacting database performance.
The trade-off between accuracy and impact on performance is a small margin of error, inherent in statistical sampling. Reports on your database size include the margin of error, so you can assess the accuracy of the estimate. To learn more, see Simple Random Sampling.

Excluding Data From the Raw Data Size Estimate

Not all data in a Vertica database is evaluated as part of the raw data size. Specifically, Vertica excludes the following data:

• Multiple projections (underlying physical copies) of data from a logical database entity (table). Data appearing in multiple projections of the same table is counted only once.
• Data stored in temporary tables.
• Data accessible through external table definitions.
• Data that has been deleted but remains in the database. To understand more about deleting and purging data, see Purging Deleted Data.
• Data stored in the WOS.
• Data stored in system and work tables, such as monitoring tables, Data Collector tables, and Database Designer tables.
• Delimiter characters.

Evaluating Data Type Footprint Size

Vertica treats the data sampled for the estimate as if it had been exported from the database in text format (such as printed from vsql). This means that Vertica evaluates data type footprint sizes as follows:

• Strings and binary types (CHAR, VARCHAR, BINARY, VARBINARY) are counted as their actual size in bytes using UTF-8 encoding.
• Numeric data types are counted as if they had been printed. Each digit counts as a byte, as does any decimal point, sign, or scientific notation. For example, -123.456 counts as eight bytes (six digits plus the decimal point and minus sign).
• Date/time data types are counted as if they had been converted to text, including any hyphens or other separators. For example, a TIMESTAMP column containing the value for noon on July 4, 2011 counts as 19 bytes: vsql prints the value as 2011-07-04 12:00:00, which is 19 characters, including the space between the date and the time.

Using AUDIT to Estimate Database Size

To obtain a more accurate database size estimate than statistical sampling can provide, use the AUDIT function to perform a full audit. This function has parameters to set both the error_tolerance and the confidence_level. Using one or both of these parameters increases or decreases the function's performance impact. For instance, lowering the error_tolerance to zero (0) and raising the confidence_level to 100 provides the most accurate size estimate but increases the performance impact of calling the AUDIT function. During a detailed, low error-tolerance audit, all of the data in the database is dumped to a raw format to calculate its size. Because a stringent audit can significantly impact database performance, never perform a full audit of a production database. See AUDIT for details.

Note: Unlike estimating raw data size using statistical sampling, a full audit performs SQL queries on the full database contents, including the contents of the WOS.
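Putting these parameters together, a stringent whole-database audit might look like this (a sketch, assuming the AUDIT('scope', error_tolerance, confidence_level) parameter order described in the SQL Reference Manual):

=> SELECT AUDIT('', 0, 100);  -- empty scope = entire database; slow but most accurate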
Monitoring Database Size for License Compliance

Your Vertica license can include a data storage allowance. The allowance can consist of data in columnar tables, flex tables, or both. The AUDIT() function estimates the columnar table data size and any flex table materialized columns. The AUDIT_FLEX() function estimates the amount of __raw__ column data in flex or columnar tables. With regard to license data limits, data in __raw__ columns is counted at one-tenth the size of structured data. Monitoring data sizes for columnar and flex tables lets you plan either to delete old data to keep your database in compliance with your license, or to consider a license upgrade for additional data storage.

Note: An audit of columnar data includes flex table real and materialized columns, but not __raw__ column data.

Viewing Your License Compliance Status

Vertica periodically audits the columnar data size to verify that your database is compliant with your license terms. You can view the results of the most recent audit by calling the GET_COMPLIANCE_STATUS function.

=> SELECT GET_COMPLIANCE_STATUS();
                       GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------
 Raw Data Size: 2.00GB +/- 0.003GB
 License Size : 4.000GB
 Utilization  : 50%
 Audit Time   : 2011-03-09 09:54:09.538704+00
 Compliance Status : The database is in compliance with respect to raw data size.
 License End Date: 04/06/2011
 Days Remaining: 28.59
(1 row)

Periodically running GET_COMPLIANCE_STATUS to monitor your database's license status is usually enough to ensure that your database remains compliant. If your database begins to near its columnar data allowance, you can use the other auditing functions described below to determine where your database is growing and how recent deletes affect its size.

Manually Auditing Columnar Data Usage

You can manually check license compliance for all columnar data in your database using the AUDIT_LICENSE_SIZE function. This function performs the same audit that Vertica periodically performs automatically. The AUDIT_LICENSE_SIZE check runs in the background, so the function returns immediately. You can then query the results using GET_COMPLIANCE_STATUS.

Note: When you audit columnar data, the results include any flex table real and materialized columns, but not data in the __raw__ column. Materialized columns are virtual columns that you have promoted to real columns. Columns that you define when creating a flex table, or that you add with ALTER TABLE...ADD COLUMN statements, are real columns. All __raw__ columns are real columns. However, because they consist of unstructured or semi-structured data, they are audited separately.

An alternative to AUDIT_LICENSE_SIZE is to use the AUDIT function to audit the size of the columnar tables in your entire database by passing an empty string to the function.
This function operates synchronously, returning when it has estimated the size of the database.

=> SELECT AUDIT('');
  AUDIT
----------
 76376696
(1 row)

The size of the database is reported in bytes. The AUDIT function also lets you control the accuracy of the estimated database size through additional parameters. See the entry for the AUDIT function in the SQL Reference Manual for full details. Vertica does not count the AUDIT function results as an official audit, and takes no license compliance actions based on the results.

Note: The results of the AUDIT function do not include flex table data in __raw__ columns. Use the AUDIT_FLEX function to monitor data usage in flex tables.

Manually Auditing __raw__ Column Data

You can use the AUDIT_FLEX function to manually audit data usage for flex or columnar tables with a __raw__ column. The function calculates the encoded, compressed data stored in ROS containers for any __raw__ columns. Materialized columns in flex tables are calculated by the AUDIT function. The AUDIT_FLEX results do not include data in the __raw__ columns of temporary flex tables.

Targeted Auditing

If audits determine that the columnar table estimates are unexpectedly large, consider which schemas, tables, or partitions are using the most storage. You can use the AUDIT function to perform targeted audits of schemas, tables, or partitions by supplying the name of the entity whose size you want to find. For example, to find the size of the online_sales schema in the VMart example database, run the following command:

VMart=> SELECT AUDIT('online_sales');
  AUDIT
----------
 35716504
(1 row)

You can also change the granularity of an audit to report the size of each object in a larger entity (for example, each table in a schema) by using the granularity argument of the AUDIT function. See the AUDIT function in the SQL Reference Manual.
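For instance, a table-level audit of the same schema might look like the following (a sketch; the valid granularity values are listed under AUDIT in the SQL Reference Manual):

=> SELECT AUDIT('online_sales', 'table');  -- one size estimate per table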
Using Management Console to Monitor License Compliance

You can also get information about data storage of columnar data (for columnar tables and for materialized columns in flex tables) through Management Console. This information is available on the database Overview page, which displays a grid view of the database's overall health.

• The needle in the license meter adjusts to reflect the amount used in megabytes.
• The grace period represents the term portion of the license.
• The Audit button returns the same information as the AUDIT() function in a graphical representation.
• The Details link within the License grid (next to the Audit button) provides historical information about license usage. This page also shows a progress meter of percent used toward your license limit.

Managing License Warnings and Limits

Term License Warnings and Expiration

The term portion of a Vertica license is easy to manage: you are licensed to use Vertica until a specific date. If the term of your license expires, Vertica alerts you with messages in the Administration Tools and vsql. For example:

=> CREATE TABLE T (A INT);
NOTICE: Vertica license is in its grace period
HINT: Renew at http://www.vertica.com/
CREATE TABLE

Contact Vertica at http://www.vertica.com/about/contact-us/ as soon as possible to renew your license, and then install the new license. After the grace period expires, Vertica stops processing queries.

Data Size License Warnings and Remedies

If your Vertica columnar license includes a raw data size allowance, Vertica periodically audits the size of your database to ensure that it remains compliant with the license agreement. For details of this audit, see Calculating the Database Size. You should also monitor your database size to know when it approaches your licensed usage. Monitoring the database size helps you plan either to upgrade your license to allow for continued database growth, or to delete data from the database so that you remain compliant with your license. See Monitoring Database Size for License Compliance for details.
If your database's size approaches your licensed usage allowance (above 75% of license limits), you will see warnings in the Administration Tools, vsql, and Management Console. You have two options to eliminate these warnings:

• Upgrade your license to a larger data size allowance.
• Delete data from your database to remain under your licensed raw data size allowance.

The warnings disappear after Vertica's next audit of the database size shows that it is no longer close to or over the licensed amount. You can also manually run a database audit (see Monitoring Database Size for License Compliance for details). If your database continues to grow after you receive warnings that its size is approaching your licensed size allowance, Vertica displays additional warnings in more parts of the system after a grace period passes.

If Your Vertica Premium Edition Database Size Exceeds Your Licensed Limits

If your Premium Edition database size exceeds your licensed data allowance, all successful queries from ODBC and JDBC clients return with a status of SUCCESS_WITH_INFO instead of the usual SUCCESS. The message sent with the results contains a warning about the database size. Your ODBC and JDBC clients should be prepared to handle these messages instead of assuming that successful requests always return SUCCESS.

Note: These warnings for Premium Edition are in addition to any warnings you see in the Administration Tools, vsql, and Management Console.

If Your Vertica Community Edition Database Size Exceeds 1 Terabyte

If your Community Edition database size exceeds the limit of 1 terabyte, you can no longer load, modify, or delete data in your database. To bring your database under compliance, you can choose to:

• Drop database tables. You can also consider truncating a table or dropping a partition. See TRUNCATE TABLE or DROP_PARTITION.
• Upgrade to Vertica Premium Edition (or an evaluation license).
Exporting License Audit Results to CSV

You can use admintools to audit a database for license compliance and export the results in CSV format, as follows:

admintools -t license_audit [--password=password] --database=database
    [--file=csv-file] [--quiet]

where:

• database must be a running database. If the database is password protected, you must also supply the password.
• --file csv-file directs output to the specified file. If csv-file already exists, the tool returns an error message. If this option is unspecified, output is directed to stdout.
• --quiet specifies that the tool should run in quiet mode; if unspecified, status messages are sent to stdout.

Running the license_audit tool is equivalent to invoking the following SQL statements:

select audit('');
select audit_flex('');
select * from dc_features_used;
select * from vcatalog.license_audits;
select * from vcatalog.user_audits;

Audit results include the following information:

• Log of used Vertica features
• Estimated database size
• Raw data size allowed by your Vertica license
• Percentage of the licensed allowance that the database currently uses
• Audit timestamps
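For example, a concrete invocation might look like the following (the database name and output path are illustrative):

$ admintools -t license_audit --database=VMart --file=/tmp/license_audit.csv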
The following truncated example shows the raw CSV output that license_audit generates:

FEATURES_USED
features_used,feature,date,sum
features_used,metafunction::get_compliance_status,2014-08-04,1
features_used,metafunction::bootstrap_license,2014-08-04,1
...

LICENSE_AUDITS
license_audits,database_size_bytes,license_size_bytes,usage_percent,audit_start_timestamp,audit_end_timestamp,confidence_level_percent,error_tolerance_percent,used_sampling,confidence_interval_lower_bound_bytes,confidence_interval_upper_bound_bytes,sample_count,cell_count,license_name
license_audits,808117909,536870912000,0.00150523690320551,2014-08-04 23:59:00.024874-04,2014-08-04 23:59:00.578419-04,99,5,t,785472097,830763721,10000,174754646,vertica
...

USER_AUDITS
user_audits,size_bytes,user_id,user_name,object_id,object_type,object_schema,object_name,audit_start_timestamp,audit_end_timestamp,confidence_level_percent,error_tolerance_percent,used_sampling,confidence_interval_lower_bound_bytes,confidence_interval_upper_bound_bytes,sample_count,cell_count
user_audits,812489249,45035996273704962,dbadmin,45035996273704974,DATABASE,,VMart,2014-10-14 11:50:13.230669-04,2014-10-14 11:50:14.069057-04,99,5,t,789022736,835955762,10000,174755178

AUDIT_SIZE_BYTES
audit_size_bytes,now,audit
audit_size_bytes,2014-10-14 11:52:14.015231-04,810584417

FLEX_SIZE_BYTES
flex_size_bytes,now,audit_flex
flex_size_bytes,2014-10-14 11:52:15.117036-04,11850
Configuring the Database

This section provides information about:

• The configuration procedure
• Configuration parameters
• Designing a logical schema
• Creating the physical schema

You'll also want to set up a security scheme; see Implementing Security. See also Implement Locales for International Data Sets.

Note: Before you begin this section, HPE strongly recommends that you follow the Tutorial in Getting Started to quickly familiarize yourself with creating and configuring a fully functioning example database.
Configuration Procedure

This section describes the tasks required to set up a Vertica database. It assumes that you have obtained a valid license key file, installed the Vertica rpm package, and run the installation script as described in Installing Vertica.

You'll complete the configuration procedure using:

• The Administration Tools. If you are unfamiliar with dialog-based user interfaces, read Using the Administration Tools Interface before you begin. See also the Administration Tools Reference for details.
• The vsql interactive interface.
• The Database Designer, described in Creating a Database Design.

Note: You can also perform certain tasks using Management Console. Those tasks point to the appropriate topic.

Keep the following in mind:

• Follow the configuration procedure in the order presented in this book.
• HPE strongly recommends that you first use the Tutorial in Getting Started to experiment with creating and configuring a database.
• Although you may create more than one database (for example, one for production and one for testing), you may create only one active database for each installation of Vertica Analytics Platform.
• The generic configuration procedure described here can be used several times during the development process, and modified each time to fit changing goals. You can omit steps such as preparing actual data files and sample queries, and run the Database Designer without optimizing for queries. For example, you can create, load, and query a database several times for development and testing purposes, then one final time to create and load the production database.
Prepare Disk Storage Locations

You must create and specify directories in which to store your catalog and data files (physical schema). You can specify these locations when you install or configure the database, or later during database operations. Both the catalog and data directories must be owned by the database administrator.

The directory you specify for database catalog files (the catalog path) is used across all nodes in the cluster. For example, if you specify /home/catalog as the catalog directory, Vertica uses that catalog path on all nodes. The catalog directory should always be separate from any data file directories.

Note: Do not use a shared directory for more than one node. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory.

The data path you designate is also used across all nodes in the cluster. For example, if you specify that data should be stored in /home/data, Vertica uses this path on all database nodes. Do not use a single directory to contain both catalog and data files.

You can store the catalog and data directories on different drives, which can be either local to the host (recommended for the catalog directory) or on a shared storage location, such as an external disk enclosure or a SAN.

Before you specify a catalog or data path, be sure the parent directory exists on all nodes of your database. Creating a database in admintools also creates the catalog and data directories, but the parent directory must exist on each node.

You do not need to specify a disk storage location during installation. However, you can do so by using the --data-dir parameter of the install_vertica script. See Specifying Disk Storage Location During Installation.

See Also

• Specifying Disk Storage Location on MC
• Specifying Disk Storage Location During Database Creation
• Configuring Disk Usage to Optimize Performance
• Using Shared Storage With Vertica
Specifying Disk Storage Location During Installation

There are three ways to specify the disk storage location. You can specify the location when you:

• Install Vertica
• Create a database using the Administration Tools
• Install and configure Management Console

To Specify the Disk Storage Location When You Install

When you install Vertica, the --data-dir parameter of the install_vertica script (see Installing Vertica with the install_vertica Script) lets you specify a directory to contain database data and catalog files. The script defaults to the database administrator's default home directory: /home/dbadmin. You should replace this default with a directory that has adequate space to hold your data and catalog files.

Before you create a database, verify that the data and catalog directory exists on each node in the cluster. Also verify that the directory on each node is owned by the database administrator.

Notes

• Catalog and data path names must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions results in database creation failure.
• Vertica refuses to overwrite a directory if it appears to be in use by another database. Therefore, if you created a database for evaluation purposes, dropped the database, and want to reuse the database name, make sure that the disk storage location previously used has been completely cleaned up. See Managing Storage Locations for details.

Specifying Disk Storage Location During Database Creation

When you invoke the Create Database command in the Administration Tools, a dialog box allows you to specify the catalog and data locations. These locations must exist on each host in the cluster and must be owned by the database administrator.
When you click OK, Vertica automatically creates the following subdirectories:

catalog-pathname/database-name/node-name_catalog/
data-pathname/database-name/node-name_data/

For example, if you use the default value (the database administrator's home directory) of /home/dbadmin for the Stock Exchange example database, the catalog and data directories are created on each node in the cluster as follows:

/home/dbadmin/Stock_Schema/stock_schema_node1_host01_catalog
/home/dbadmin/Stock_Schema/stock_schema_node1_host01_data

Notes

• Catalog and data path names must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions results in database creation failure.
• Vertica refuses to overwrite a directory if it appears to be in use by another database. Therefore, if you created a database for evaluation purposes, dropped the database, and want to reuse the database name, make sure that the disk storage location previously used has been completely cleaned up. See Managing Storage Locations for details.

Specifying Disk Storage Location on MC

You can use the MC interface to specify where you want to store database metadata on the cluster in the following ways:

• When you configure MC the first time
• When you create new databases using MC

See Configuring Management Console.
Configuring Disk Usage to Optimize Performance

Once you have created your initial storage location, you can add additional storage locations to the database later. Not only does this provide additional space, it lets you control disk usage and increase I/O performance by isolating files that have different I/O or access patterns. For example, consider:

• Isolating execution engine temporary files from data files by creating a separate storage location for temp space.
• Creating labeled storage locations and storage policies, in which selected database objects are stored on different storage locations based on measured performance statistics or predicted access patterns.

See Managing Storage Locations for details.

Using Shared Storage With Vertica

If you use shared SAN storage, ensure that there is no contention among the nodes for disk space or bandwidth.

• Each host must have its own catalog and data locations. Hosts cannot share catalog or data locations.
• Configure the storage so that there is enough I/O bandwidth for each node to access the storage independently.

Viewing Database Storage Information

You can view node-specific information on your Vertica cluster through Management Console. See Monitoring Vertica Using MC for details.

Disk Space Requirements for Vertica

In addition to actual data stored in the database, Vertica requires disk space for several data reorganization operations, such as mergeout and managing nodes in the cluster. For best results, HPE recommends that disk utilization per node be no more than sixty percent (60%) for a K-Safe=1 database, to allow such operations to proceed.

In addition, disk space is temporarily required by certain query execution operators, such as hash joins and sorts, when they cannot be completed in memory (RAM). Such operators might be encountered during queries, recovery, refreshing projections, and so on. The amount of disk space needed (known as temp space) depends on the nature of the queries, the amount of data on the node, and the number of concurrent users on the system.
By default, any unused disk space on the data disk can be used as temp space. However, HPE recommends provisioning temp space separate from data disk space. See Configuring Disk Usage to Optimize Performance.

Disk Space Requirements for Management Console

You can install MC on any node in the cluster, so there are no special disk requirements for MC other than the disk space you would normally allocate for your database cluster. See Disk Space Requirements for Vertica.

Prepare the Logical Schema Script

Designing a logical schema for a Vertica database is no different from designing one for any other SQL database. Details are described more fully in Designing a Logical Schema.

To create your logical schema, prepare a SQL script (a plain text file, typically with an extension of .sql) that:

1. Creates additional schemas (as necessary). See Using Multiple Schemas.
2. Creates the tables and column constraints in your database using the CREATE TABLE command.
3. Defines the necessary table constraints using the ALTER TABLE command.
4. Defines any views on the tables using the CREATE VIEW command.

You can generate a script file using:

• A schema designer application.
• A schema extracted from an existing database.
• A text editor.
• One of the example database example-name_define_schema.sql scripts as a template. (See the example database directories in /opt/vertica/examples.)

In your script file, make sure that:

• Each statement ends with a semicolon.
• You use data types supported by Vertica, as described in the SQL Reference Manual.

A minimal schema script might look like the sketch below.
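A minimal sketch (the schema, table, and view names are illustrative):

CREATE SCHEMA online_sales;

CREATE TABLE online_sales.call_center (
    call_center_key INTEGER NOT NULL,
    cc_name         VARCHAR(50),
    cc_open_date    DATE
);

ALTER TABLE online_sales.call_center
    ADD CONSTRAINT call_center_pk PRIMARY KEY (call_center_key);

CREATE VIEW online_sales.recent_centers AS
    SELECT call_center_key, cc_name
    FROM online_sales.call_center
    WHERE cc_open_date > '2010-01-01';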
Once you have created a database, you can test your schema script by executing it as described in Create the Logical Schema. If you encounter errors, drop all tables, correct the errors, and run the script again.

Prepare Data Files

Prepare two sets of data files:

• Test data files. Use test files to test the database after the partial data load. If possible, use part of the actual data files to prepare the test data files.
• Actual data files. Once the database has been tested and optimized, use your data files for your initial bulk load (see Bulk Loading Data).

How to Name Data Files

Name each data file to match the corresponding table in the logical schema. Case does not matter. Use the extension .tbl or whatever you prefer. For example, if a table is named Stock_Dimension, name the corresponding data file stock_dimension.tbl. When using multiple data files, append _nnn (where nnn is a positive integer in the range 001 to 999) to the file name. For example, stock_dimension.tbl_001, stock_dimension.tbl_002, and so on.

Prepare Load Scripts

Note: You can postpone this step if your goal is to test a logical schema design for validity.

Prepare SQL scripts to load data directly into physical storage using the COPY...DIRECT statement from vsql, or through ODBC as described in Connecting to Vertica. You need scripts that load:

• Large tables
• Small tables

A sketch of such a load statement follows.
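A minimal sketch (the file path and delimiter are illustrative):

COPY Stock_Dimension
    FROM '/home/dbadmin/data/stock_dimension.tbl_001'
    DELIMITER '|'
    DIRECT;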
HPE recommends that you load large tables using multiple files. To test the load process, use files of 10GB to 50GB in size. This size provides several advantages:

• You can use one of the data files as a sample data file for the Database Designer.
• You can load just enough data to perform a partial data load before you load the remainder.
• If a single load fails and rolls back, you do not lose an excessive amount of time.

Once the load process is tested, for multi-terabyte tables, break up the full load into file sizes of 250–500GB.

See Bulk Loading Data and the following additional topics for details:

• Using Load Scripts
• Using Parallel Load Streams
• Loading Data into Pre-Join Projections
• Enforcing Constraints
• About Load Errors

Tip: You can use the load scripts included in the example databases in Getting Started as templates.

Create an Optional Sample Query Script

The purpose of a sample query script is to test your schema and load scripts for errors. Include a sample of the queries your users are likely to run against the database. If you don't have any real queries, just write simple SQL that collects counts on each of your tables, as in the sketch below. Alternatively, you can skip this step.
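A minimal sketch (table names are illustrative):

SELECT COUNT(*) FROM Stock_Dimension;
SELECT COUNT(*) FROM online_sales.call_center;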
Create an Empty Database

Two options are available for creating an empty database:

• Using the Management Console
• Using the Administration Tools

Although you can create more than one database (for example, one for production and one for testing), there can be only one active database for each installation of Vertica Analytics Platform.

Creating a Database Name and Password

Database names must conform to the following rules:

• Be between 1 and 30 characters.
• Begin with a letter.
• Follow with any combination of letters (uppercase and lowercase), numbers, and/or underscores.

Database names are case sensitive; however, HPE strongly recommends that you do not create databases with names that differ only in case. For example, do not create a database called mydatabase and another called MyDataBase.

Database Passwords

Database passwords can contain letters, digits, and the special characters listed in the table below. Passwords cannot include space characters or any non-ASCII Unicode characters. The length of a database password must be from 8 to 100 characters.

You use profiles to specify and control password definitions. For instance, a profile can define the maximum length, reuse time, and the minimum number of required digits for a password, as well as other details. You can also change password definitions using ALTER PROFILE. A sketch of creating a profile follows.
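A minimal sketch (the profile name and limit values are illustrative; the available limit parameters are listed under CREATE PROFILE in the SQL Reference Manual):

=> CREATE PROFILE strict_profile LIMIT
      PASSWORD_MIN_LENGTH 10
      PASSWORD_REUSE_MAX 5
      FAILED_LOGIN_ATTEMPTS 3;
=> ALTER USER dbadmin PROFILE strict_profile;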
The following table lists the special (ASCII) characters that Vertica permits in database passwords. Special characters can appear anywhere within a password string; for example, mypas$word, $mypassword, and mypassword$ are all permitted.

Caution: Using special characters in database passwords that are not listed in the following table could cause database instability.

Character   Description
#           pound sign
!           exclamation point
+           plus sign
*           asterisk
?           question mark
,           comma
.           period
/           forward slash
=           equals sign
~           tilde
-           minus sign
$           dollar sign
_           underscore
:           colon
(space)     space
"           double quote
'           single quote
%           percent sign
&           ampersand
(           open parenthesis
)           close parenthesis
;           semicolon
<           less than sign
>           greater than sign
@           at sign
`           back quote
[           open square bracket
]           close square bracket
\           backslash
^           caret
|           vertical bar
{           open curly bracket
}           close curly bracket

See Also

• Password Guidelines
• ALTER PROFILE
• CREATE PROFILE
• DROP PROFILE

Create an Empty Database Using MC

You can create a new database on an existing Vertica cluster through the Management Console interface. Database creation can be a long-running process, lasting from minutes to hours, depending on the size of the target database. You can close the web browser during the process and sign back in to MC later; the creation process continues unless an unexpected error occurs. See the Notes section below the procedure on this page.
You currently need to use command-line scripts to define the database schema and load data. Refer to the topics in Configuration Procedure. You should also run the Database Designer, which you access through the Administration Tools, to create either a comprehensive or incremental design. Consider using the Tutorial in Getting Started to create a sample database you can start monitoring immediately.

How to Create an Empty Database on an MC-Managed Cluster

1. If you are already on the Databases and Clusters page, skip to the next step; otherwise:
   a. Connect to MC and sign in as an MC administrator.
   b. On the Home page, click Existing Infrastructure to view the Databases and Clusters page.
2. If no databases exist on the cluster, continue to the next step; otherwise:
   a. If a database is running on the cluster on which you want to add a new database, select the database and click Stop.
   b. Wait for the running database to have a status of Stopped.
3. Click the cluster on which you want to create the new database and click Create Database.
4. The Create Database wizard opens. Provide the following information:
   • Database name and password. See Creating a Database Name and Password for rules.
   • Optionally, click Advanced to open the advanced settings and change the port and the catalog, data, and temporary data paths. By default, the MC application/web server port is 5450, and the paths are /home/dbadmin, or whatever you defined for the paths when you ran the Cluster Creation wizard or the install_vertica script. Do not use the default agent port 5444 as a new setting for the MC port. See MC Settings > Configuration for port values.
5. Click Continue.
6. Select the nodes to include in the database. The Database Configuration window opens with the options you provided, and a graphical representation of the nodes appears on the page. By default, all nodes are selected to be part of this database (denoted by a green check mark). You can optionally click each node and clear Include host in new database to exclude that node from the database. Excluded nodes are gray. If you change your mind, click the node and select the Include check box.
7. Click Create in the Database Configuration window to create the database on the nodes. The creation process takes a few moments, after which the database starts and a Success message appears on the interface.
8. Click OK to close the success message. The Manage page opens and displays the database nodes. Nodes not included in the database are colored gray, which means they are standby nodes you can include later. To add to or remove from your Vertica cluster nodes that are not shown in standby mode, you must run the install_vertica script.

Notes

• If warnings occur during database creation, nodes are marked on the UI with an Alert icon and a message.
  • Warnings do not prevent the database from being created, but you should address them after the database creation process completes by viewing the database Message Center from the MC Home page.
  • Failure messages display on the database Manage page with a link to more detailed information and a hint with an actionable task that you must complete before you can continue. Problem nodes are colored red for quick identification.
  • To view more detailed information about a node in the cluster, double-click the node from the Manage page, which opens the Node Details page.
• To create MC users and grant them access to an MC-managed database, see About MC Users and Creating an MC User.
See Also

• Creating a Cluster Using MC
• Troubleshooting with MC Diagnostics
• Restarting MC

Create a Database Using Administration Tools

1. Run the Administration Tools from your Administration Host as follows:

   $ /opt/vertica/bin/admintools

   If you are using a remote terminal application, such as PuTTY or a Cygwin bash shell, see Notes for Remote Terminal Users.
2. Accept the license agreement and specify the location of your license file. For more information, see Managing Licenses. This step is necessary only the first time you run the Administration Tools.
3. On the Main Menu, click Configuration Menu, and click OK.
4. On the Configuration Menu, click Create Database, and click OK.
5. Enter the name of the database and an optional comment, and click OK. See Creating a Database Name and Password for naming guidelines and restrictions.
6. Establish the superuser password for your database.
   • To provide a password, enter the password and click OK. Confirm the password by entering it again, and then click OK.
   • If you don't want to provide a password, leave it blank and click OK. If you don't set a password, Vertica prompts you to verify that you truly do not want to establish a superuser password for this database. Click Yes to create the database without a password, or No to establish the password.

   Caution: If you do not enter a password at this point, the superuser password is set to empty. Unless the database is for evaluation or academic purposes, HPE strongly recommends that you enter a superuser password. See Creating a Database Name and Password for guidelines.
  • 370. strongly recommends that you enter a superuser password. See Creating a Database Name and Password for guidelines. 7. Select the hosts to include in the database from the list of hosts specified when Vertica was installed (install_vertica -s), and click OK. 8. Specify the directories in which to store the data and catalog files, and click OK. Note: Do not use a shared directory for more than one node. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory. 9. Catalog and data path names must contain only alphanumeric characters and cannot have leading spaces. Failure to comply with these restrictions results in database creation failure. For example: Catalog pathname: /home/dbadmin Data Pathname: /home/dbadmin 10. Review the Current Database Definition screen to verify that it represents the database you want to create, and then click Yes to proceed or No to modify the database definition. 11. If you click Yes, Vertica creates the database you defined and then displays a message to indicate that the database was successfully created. Note: : For databases created with 3 or more nodes, Vertica automatically sets K-safety to 1 to ensure that the database is fault tolerant in case a node fails. For more information, see Failure Recovery in the Administrator's Guide and MARK_DESIGN_KSAFE 12. Click OK to acknowledge the message. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 370 of 5309
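If you prefer to script database creation instead of stepping through the menus, the Administration Tools also expose their functions on the command line. The following is a minimal sketch; the database name, password, hosts, and paths are placeholders, and the exact option set can vary by version, so check /opt/vertica/bin/admintools -t create_db --help on your installation first:

   # Assumption: a three-node cluster, catalog and data under /home/dbadmin
   $ /opt/vertica/bin/admintools -t create_db \
         -d mydb \
         -p 'superuser_password' \
         -s host01,host02,host03 \
         -c /home/dbadmin \
         -D /home/dbadmin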
Create the Logical Schema

1. Connect to the database. In the Administration Tools Main Menu, click Connect to Database and click OK. See Connecting to the Database for details. The vsql welcome script appears:

   Welcome to vsql, the Vertica Analytic Database interactive terminal.
   Type:  \h or \? for help with vsql commands
          \g or terminate with semicolon to execute query
          \q to quit
   =>

2. Run the logical schema script. Use the \i meta-command in vsql to run the SQL logical schema script that you prepared earlier.
3. Disconnect from the database. Use the \q meta-command in vsql to return to the Administration Tools.
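For reference, the logical schema script that you run with \i is ordinary SQL DDL. The following is a minimal sketch; the schema, table, and column names are illustrative, not part of any shipped example:

   -- logical_schema.sql: an illustrative two-table star fragment
   CREATE SCHEMA store;
   CREATE TABLE store.dim_customer (
       customer_key  INTEGER NOT NULL PRIMARY KEY,
       customer_name VARCHAR(128),
       annual_income INTEGER
   );
   CREATE TABLE store.fact_orders (
       order_number  INTEGER NOT NULL,
       customer_key  INTEGER NOT NULL REFERENCES store.dim_customer(customer_key),
       date_ordered  DATE
   );

You would then run it from vsql with \i logical_schema.sql.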
Perform a Partial Data Load

HPE recommends that for large tables, you perform a partial data load and then test your database before completing a full data load. This load should consist of a representative amount of data.

1. Load the small tables. Load the small table data files using the SQL load scripts and data files you prepared earlier.
2. Partially load the large tables. Load 10GB to 50GB of table data for each table using the SQL load scripts and data files that you prepared earlier. For more information about projections, see Physical Schema in Vertica Concepts.

Test the Database

Test the database to verify that it is running as expected. Check queries for syntax errors and execution times.

1. Use the vsql \timing meta-command to enable the display of query execution time in milliseconds.
2. Execute the SQL sample query script that you prepared earlier.
3. Execute several ad hoc queries.

Optimize Query Performance

Optimizing the database consists of optimizing for compression and tuning for queries. (See Creating a Database Design.) To optimize the database, use the Database Designer to create and deploy a design for optimizing the database. See the Tutorial in Getting Started for an example of using the Database Designer to create a Comprehensive Design.

After you have run the Database Designer, use the techniques described in Optimizing Query Performance in Analyzing Data to improve the performance of certain types of queries.

Note: The database response time depends on factors such as the type and size of the application query, database design, data size and data types stored, available computational power, and network bandwidth. Adding nodes to a database cluster does not necessarily improve the system response time for every query, especially if the response time is already short (e.g., less than 10 seconds) or the response time is not hardware bound.

Complete the Data Load

To complete the load:

1. Monitor system resource usage. Continue to run the top, free, and df utilities and watch them while your load scripts are running (as described in Monitoring Linux Resource Usage). You can do this on any or all nodes in the cluster. Make sure that the system is not swapping excessively (watch kswapd in top) or running out of swap space (watch for a large amount of used swap space in free).
   Note: Vertica requires a dedicated server. If your loader or other processes take up significant amounts of RAM, swapping can result.
2. Complete the large table loads. Run the remainder of the large table load scripts.

Test the Optimized Database

Check query execution times to test your optimized design:

1. Use the vsql \timing meta-command to enable the display of query execution time in milliseconds. Execute a SQL sample query script to test your schema and load scripts for errors.
   Note: Include a sample of queries your users are likely to run against the database. If you don't have any real queries, just write simple SQL that collects counts on each of your tables. Alternatively, you can skip this step.
2. Execute several ad hoc queries:
   a. Run Administration Tools and select Connect to Database.
   b. Use the \i meta-command to execute the query script; for example:

   vmartdb=> \i vmart_query_03.sql
     customer_name  | annual_income
   ------------------+---------------
    James M. McNulty |        999979
    Emily G. Vogel   |        999998
   (2 rows)
   Time: First fetch (2 rows): 58.411 ms. All rows formatted: 58.448 ms

   vmartdb=> \i vmart_query_06.sql
    store_key | order_number | date_ordered
   -----------+--------------+--------------
           45 |       202416 | 2004-01-04
          113 |        66017 | 2004-01-04
          121 |       251417 | 2004-01-04
           24 |       250295 | 2004-01-04
            9 |       188567 | 2004-01-04
          166 |        36008 | 2004-01-04
           27 |       150241 | 2004-01-04
          148 |       182207 | 2004-01-04
          198 |        75716 | 2004-01-04
   (9 rows)
   Time: First fetch (9 rows): 25.342 ms. All rows formatted: 25.383 ms

Once the database is optimized, it should run queries efficiently. If you discover queries that you want to optimize, you can modify and update the design. See Incremental Design in the Administrator's Guide.

Set Up Incremental (Trickle) Loads

Once you have a working database, you can use trickle loading to load new data while concurrent queries are running. Trickle load is accomplished by using the COPY command (without the DIRECT keyword) to load 10,000 to 100,000 rows per transaction into the WOS. This allows Vertica to batch multiple loads when it writes data to disk. While the COPY command defaults to loading into the WOS, it writes to ROS if the WOS is full. See Trickle Loading Data for details.
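As a concrete illustration of the trickle pattern described above, the following sketch loads one modest batch in a single transaction; the table and file names are illustrative:

   => -- No DIRECT keyword, so rows go to the WOS and are batched to disk later
   => COPY store.fact_orders FROM '/data/orders_batch_0001.csv' DELIMITER ',';
   => COMMIT;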
See Also
- COPY
- Loading Data Through ODBC

Implement Locales for International Data Sets

The locale is a parameter that defines the user's language, country, and any special variant preferences, such as collation. Vertica uses the locale to determine the behavior of certain string functions. The locale also determines the collation for various SQL commands that require ordering and comparison, such as GROUP BY, ORDER BY, joins, and the analytic ORDER BY clause.

By default, the locale for your Vertica database is en_US@collation=binary (English US). You can define a new default locale that is used for all sessions on the database. You can also override the locale for individual sessions. However, projections are always collated using the default en_US@collation=binary collation, regardless of the session collation. Any locale-specific collation is applied at query time.

You can set the locale through ODBC, JDBC, and ADO.net.

ICU Locale Support

Vertica uses the ICU library for locale support; you must specify locale using the ICU locale syntax. The locale used by the database session is not derived from the operating system (through the LANG variable), so Hewlett Packard Enterprise recommends that you set the LANG for each node running vsql, as described in the next section.

While ICU library services can specify collation, currency, and calendar preferences, Vertica supports only the collation component. Any keywords not relating to collation are rejected. Projections are always collated using the en_US@collation=binary collation regardless of the session collation. Any locale-specific collation is applied at query time.

The SET DATESTYLE TO ... command provides some aspects of the calendar, but Vertica supports only dollars as currency.

Changing DB Locale for a Session

This example sets the session locale to Thai.

1. At the operating-system level for each node running vsql, set the LANG variable to the locale language as follows:
   export LANG=th_TH.UTF-8
   Note: If setting LANG as shown does not work, operating system support for locales may not be installed.
2. For each Vertica session (from ODBC/JDBC or vsql), set the language locale. From vsql:
   \locale th_TH
3. From ODBC/JDBC:
   "SET LOCALE TO th_TH;"
4. In PuTTY (or an ssh terminal), change the settings as follows:
   Settings > Window > Translation > UTF-8
5. Click Apply and then click Save.

All data loaded must be in UTF-8 format, not an ISO format, as described in Loading UTF-8 Format Data. Character sets like ISO 8859-1 (Latin1), which are incompatible with UTF-8, are not supported, so functions like SUBSTRING do not work correctly for multibyte characters; likewise, locale settings do not work correctly with such data. If the terminal translation setting ISO-8859-11:2001 (Latin/Thai) appears to work, the data was loaded incorrectly. To convert data correctly, use a utility program such as Linux iconv.

Note: The maximum length parameter for VARCHAR and CHAR data types refers to the number of octets (bytes) that can be stored in that field, not the number of characters. When using multibyte UTF-8 characters, make sure to size fields to accommodate from 1 to 4 bytes per character, depending on the data.
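After changing the session locale as described above, you can confirm the active setting from vsql with SHOW LOCALE; the output format shown here is indicative only:

   => SHOW LOCALE;
     name  |      setting
   --------+-------------------
    locale | th_TH (LTH)
   (1 row)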
See Also
- Supported Locales
- About Locales
- SET LOCALE
- ICU User Guide

Specify the Default Locale for the Database

After you start the database, the default locale configuration parameter, DefaultSessionLocale, sets the initial locale. You can override this value for individual sessions.

To set the locale for the database, use the configuration parameter as follows:

=> ALTER DATABASE mydb SET DefaultSessionLocale = 'ICU-locale-identifier';

For example:

=> ALTER DATABASE mydb SET DefaultSessionLocale = 'en_GB';

Override the Default Locale for a Session

To override the default locale for a specific session, use one of the following commands:

- The vsql command \locale <ICU-locale-identifier>. For example:

  => \locale en_GB
  INFO 2567: Canonical locale: 'en_GB'
  Standard collation: 'LEN'
  English (United Kingdom)

- The statement SET LOCALE TO <ICU-locale-identifier>:

  => SET LOCALE TO en_GB;
  INFO 2567: Canonical locale: 'en_GB'
  Standard collation: 'LEN'
  English (United Kingdom)

You can also use the Short Form of a locale in either of these commands:

=> SET LOCALE TO LEN;
INFO 2567: Canonical locale: 'en'
Standard collation: 'LEN'
English

=> \locale LEN
INFO 2567: Canonical locale: 'en'
Standard collation: 'LEN'
English
You can use these commands to override the locale as many times as needed during a database session. The session locale setting applies to any subsequent commands issued in the session.

See Also
- SET LOCALE

Best Practices for Working with Locales

It is important to understand the distinction between the locale settings on the database server and locale settings at the client application level. The server locale settings impact only the collation behavior for server-side query processing. The client application is responsible for verifying that the correct locale is set in order to display the characters correctly. Hewlett Packard Enterprise recommends the following best practices to ensure predictable results:

Server Locale

The server session locale should be set as described in Specify the Default Locale for the Database. If you are using different locales in different sessions, set the server locale at the start of each session from your client.

vsql Client

- If the database does not have a default session locale, set the server locale for the session to the desired locale, as described in Override the Default Locale for a Session.
- The locale setting in the terminal emulator where the vsql client runs should be set to be equivalent to the session locale setting on the server side (ICU locale). By doing so, the data is collated correctly on the server and displayed correctly on the client.
- All input data for vsql should be in UTF-8, and all output data is encoded in UTF-8.
- Vertica does not support non-UTF-8 encodings and associated locale values.
- For instructions on setting locale and encoding, refer to your terminal emulator documentation.

ODBC Clients

- ODBC applications can be either in ANSI or Unicode mode. If the user application is Unicode, the encoding used by ODBC is UCS-2. If the user application is ANSI, the data must be in single-byte ASCII, which is compatible with UTF-8 used on the database server. The ODBC driver converts UCS-2 to UTF-8 when passing data to the Vertica server and converts data sent by the Vertica server from UTF-8 to UCS-2.
- If the user application is not already in UCS-2, the application must convert the input data to UCS-2, or unexpected results could occur. For example:
  - For non-UCS-2 data passed to ODBC APIs, when it is interpreted as UCS-2, it could result in an invalid UCS-2 symbol being passed to the APIs, resulting in errors.
  - The symbol provided in the alternate encoding could be a valid UCS-2 symbol. If this occurs, incorrect data is inserted into the database.
- If the database does not have a default session locale, ODBC applications should set the desired server session locale using SQLSetConnectAttr (if different from the database-wide setting). By doing so, you get the expected collation and string functions behavior on the server.

JDBC and ADO.NET Clients

- JDBC and ADO.NET applications use a UTF-16 character set encoding and are responsible for converting any non-UTF-16 encoded data to UTF-16. The same cautions apply as for ODBC if this encoding is violated.
- The JDBC and ADO.NET drivers convert UTF-16 data to UTF-8 when passing data to the Vertica server and convert data sent by the Vertica server from UTF-8 to UTF-16.
- If there is no default session locale at the database level, JDBC and ADO.NET applications should set the correct server session locale by executing the SET LOCALE TO command in order to get the expected collation and string functions behavior on the server. For more information, see SET LOCALE.

Usage Considerations

Session related:

- The locale setting is session scoped and applies only to queries (no DML/DDL) executed in that session. You cannot specify a locale for an individual query.
- You can set the default locale for new sessions using the DefaultSessionLocale configuration parameter.

Query related:

The following restrictions apply when queries are run with a locale other than the default en_US@collation=binary:

- When one or more of the left-side NOT IN columns is CHAR or VARCHAR, multicolumn NOT IN subqueries are not supported. For example:

  => CREATE TABLE test (x VARCHAR(10), y INT);
  => SELECT ... FROM test WHERE (x,y) NOT IN (SELECT ...);
     ERROR: Multi-expression NOT IN subquery is not supported because a left
     hand expression could be NULL

  Note: Even if columns test.x and test.y have a NOT NULL constraint, an error occurs.

- If the outer query contains a GROUP BY on a CHAR or a VARCHAR column, correlated HAVING clause subqueries are not supported. In the following example, the GROUP BY x in the outer query causes the error:

  => DROP TABLE test CASCADE;
  => CREATE TABLE test (x VARCHAR(10));
  => SELECT COUNT(*) FROM test t GROUP BY x HAVING x
       IN (SELECT x FROM test WHERE t.x||'a' = test.x||'a');
     ERROR: subquery uses ungrouped column "t.x" from outer query

- Subqueries that use analytic functions in the HAVING clause are not supported. For example:

  => DROP TABLE test CASCADE;
  => CREATE TABLE test (x VARCHAR(10));
  => SELECT MAX(x) OVER (PARTITION BY 1 ORDER BY 1) FROM test
       GROUP BY x HAVING x IN (SELECT MAX(x) FROM test);
     ERROR: Analytics query with having clause expression that involves
     aggregates and subquery is not supported

DML/DDL related:

- SQL identifiers (such as table names and column names) can use UTF-8 Unicode characters. For example, the following CREATE TABLE statement uses the ß (German eszett) in the table name:

  => CREATE TABLE straße(x int, y int);
  CREATE TABLE

- Projection sort orders are made according to the default en_US@collation=binary collation. Thus, regardless of the session setting, issuing the following command creates a projection sorted by col1 according to the binary collation:

  => CREATE PROJECTION p1 AS SELECT * FROM table1 ORDER BY col1;

  In such cases, straße and strasse are not stored near each other on disk. Sorting by binary collation also means that sort optimizations do not work in locales other than binary. Vertica returns the following warning if you create tables or projections in a non-binary locale:

  WARNING: Projections are always created and persisted in the default
  Vertica locale. The current locale is de_DE

- When creating pre-join projections, the projection definition query does not respect the locale or collation setting. When you insert data into the fact table of a pre-join projection, referential integrity checks are not locale or collation aware. For example:

  \locale LDE_S1 -- German
  => CREATE TABLE dim (col1 varchar(20) primary key);
  => CREATE TABLE fact (col1 varchar(20) references dim(col1));
  => CREATE PROJECTION pj AS SELECT * FROM fact JOIN dim
       ON fact.col1 = dim.col1 UNSEGMENTED ALL NODES;
  => INSERT INTO dim VALUES('ß');
  => COMMIT;

  The following INSERT statement fails with a "nonexistent FK" error even though 'ß' is in the dim table, and in the German locale 'SS' and 'ß' refer to the same character:

  => INSERT INTO fact VALUES('SS');
     ERROR: Nonexistent foreign key value detected in FK-PK join
     (fact x dim) using subquery and dim_node0001; value SS
  => ROLLBACK;
  => DROP TABLE dim, fact CASCADE;

- When the locale is non-binary, Vertica uses the COLLATION function to transform the input to a binary string that sorts in the proper order. This transformation increases the number of bytes required for the input according to this formula:

  result_column_width = input_octet_width * CollationExpansion + 4

  The default value of the CollationExpansion configuration parameter is 5, so, for example, a 20-octet VARCHAR input requires 20 * 5 + 4 = 104 bytes after transformation.

- CHAR fields are displayed as fixed length, including any trailing spaces. When CHAR fields are processed internally, they are first stripped of trailing spaces. For VARCHAR fields, trailing spaces are usually treated as significant characters; however, trailing spaces are ignored when sorting or comparing either type of character string field using a non-BINARY locale.

Change Transaction Isolation Levels

By default, Vertica uses the READ COMMITTED isolation level for every session. If you prefer, you can change the default isolation level for the database or for a specific session.

To change the isolation level for a specific session, use the SET SESSION CHARACTERISTICS command. To change the isolation level for the database, use the TransactionIsolationLevel configuration parameter. Once modified, Vertica uses the new transaction level for every new session.

To set the isolation level for the database to SERIALIZABLE or READ COMMITTED:

=> ALTER DATABASE mydb SET TransactionIsolationLevel = 'SERIALIZABLE';
=> ALTER DATABASE mydb SET TransactionIsolationLevel = 'READ COMMITTED';

To see the value of the isolation level:

=> SHOW TRANSACTION_ISOLATION;
A change to isolation level applies only to future sessions. Existing sessions and their transactions continue to use the original isolation level. A transaction retains its isolation level until it completes, even if the session's isolation level changes during the transaction. Vertica internal processes (such as the Tuple Mover and refresh operations) and DDL operations always run at the SERIALIZABLE isolation level to ensure consistency.
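To complement the database-wide examples above, you can change the level for the current session only. The following is a sketch using SET SESSION CHARACTERISTICS; see that statement's reference page for the exact syntax in your version:

=> SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE;
=> SHOW TRANSACTION_ISOLATION;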
See Also
- Transactions
- Configuration Parameters
- SET SESSION CHARACTERISTICS
- SHOW

Configuration Parameters

Configuration parameters are settings that affect database behavior. You can use configuration parameters to enable, disable, or tune features related to different database aspects like the Tuple Mover, security, Database Designer, or projections.

Configuration parameters have default values, stored in the Vertica database. You can modify certain parameters to configure your Vertica database in two ways:

- The Management Console browser-based interface
- vsql statements

Before you modify a database parameter, review all documentation about the parameter to determine the context under which you can change it. Some parameter changes require a database restart to take effect. The CHANGE_REQUIRES_RESTART column in the system table CONFIGURATION_PARAMETERS indicates whether a parameter requires a restart.

Managing Configuration Parameters: Management Console

To change database settings for any MC-managed database, click the Settings tab at the bottom of the Overview, Activity, or Manage pages. The database must be running. The Settings page defaults to parameters in the General category. To change other parameters, click an option from the tab panel on the left.
Some settings require you to restart the database, and MC prompts you to do so. You can ignore the prompt, but those changes take effect only after the database restarts.

Some settings are specific to Management Console, such as changing MC or agent port assignments. For more information, see Managing MC Settings in Using Management Console.

Managing Configuration Parameters: vsql

You can configure all parameters at database scope. Some parameters can also be set and cleared at node and session scopes.

Caution: Vertica is designed to operate with minimal configuration changes. Be careful to set and change configuration parameters according to documented guidelines.

For detailed information about managing configuration parameters, see:

- Viewing Configuration Parameter Values
- Setting Configuration Parameter Values
- Clearing Configuration Parameters

Viewing Configuration Parameter Values

You can view active configuration parameter values in two ways:
- SHOW statements
- Query related system tables

SHOW Statements

Use the following SHOW statements to view active configuration parameters:

- SHOW CURRENT: Returns settings of active configuration parameter values. Vertica checks settings at all levels, in the following ascending order of precedence:
  - database
  - node
  - session
  If no values are set at any scope, SHOW CURRENT returns the parameter's default value.
- SHOW DATABASE: Displays configuration parameter values set for the database.
- SHOW SESSION: Displays configuration parameter values set for the current session.
- SHOW NODE: Displays configuration parameter values set for a node.

If a configuration parameter requires a restart to take effect, the values in a SHOW CURRENT statement might differ from values in other SHOW statements. To see which parameters require a restart, query the CONFIGURATION_PARAMETERS system table.

System Tables

You can query two system tables for configuration parameters:

- SESSION_PARAMETERS returns session-scope parameters.
- CONFIGURATION_PARAMETERS returns parameters for all scopes: database, node, and session.
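Before changing parameters, it can be helpful to see which ones take effect only after a restart. A query along the following lines does the job; the column names are assumed from the CONFIGURATION_PARAMETERS table described above:

=> SELECT parameter_name, current_value, default_value
   FROM configuration_parameters
   WHERE change_requires_restart;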
Setting Configuration Parameter Values

You can set configuration parameters at three scopes:

- Database
- Node
- Session

Database Scope

You can set one or more parameter values at the database scope with ALTER DATABASE..SET:

ALTER DATABASE dbname SET parameter-name = value[,...];

For example:

ALTER DATABASE mydb SET AnalyzeRowCountInterval = 3600, FailoverToStandbyAfter = '5 minutes';

Node Scope

You can set one or more parameter values at the node scope with ALTER NODE..SET:

ALTER NODE node-name SET parameter-name = value[,...];

For example, to prevent clients from connecting to node01, set the MaxClientSessions configuration parameter to 0:

=> ALTER NODE node01 SET MaxClientSessions = 0;

Session Scope

You can set one or more parameter values at the session scope with ALTER SESSION..SET:

ALTER SESSION SET parameter-name = value[,...];

For example:

=> ALTER SESSION SET ForceUDxFencedMode = 1;

Clearing Configuration Parameters

You can clear configuration parameters at three scopes:

- Database
- Node
- Session

Database Scope

You can clear one or more parameter values at the database scope with ALTER DATABASE..CLEAR. Vertica resets the parameter to its default value:
ALTER DATABASE dbname CLEAR parameter-name[,...];

For example:

ALTER DATABASE mydb CLEAR AnalyzeRowCountInterval, FailoverToStandbyAfter;

Node Scope

You can clear one or more parameter values at the node scope with ALTER NODE..CLEAR. Vertica resets the parameter to its database setting, if any. If the parameter is not set at the database scope, Vertica resets the parameter to its default value.

ALTER NODE node-name CLEAR parameter-name[,...];

The following example clears MaxClientSessions on node node01:

ALTER NODE node01 CLEAR MaxClientSessions;

Session Scope

You can clear one or more parameter values at the session scope with ALTER SESSION..CLEAR. Vertica resets the parameter to its node or database setting, if any. If the parameter is not set at either scope, Vertica resets the parameter to its default value.

ALTER SESSION CLEAR parameter-name[,...];

For example:

=> ALTER SESSION CLEAR ForceUDxFencedMode;

Configuration Parameter Categories

Vertica configuration parameters are grouped into the following categories:

- General Parameters
- Tuple Mover Parameters
- Projection Parameters
- Epoch Management Parameters
- Monitoring Parameters
- Profiling Parameters
- Security Parameters
- Database Designer Parameters
- Internationalization Parameters
- Data Collector Parameters
- Text Search Parameters
- Kerberos Authentication Parameters
- HCatalog Connector Parameters
- User-Defined Session Parameters
- Constraint Enforcement Parameters

General Parameters

You use these general parameters to configure Vertica.

AnalyzeRowCountInterval
Specifies how often Vertica checks the number of projection rows and whether the threshold set by ARCCommitPercentage has been crossed. For more information, see Collecting Statistics.
Default Value: 60 seconds

ARCCommitPercentage
Sets the threshold percentage of WOS to ROS rows, which determines when to aggregate projection row counts and commit the result to the catalog.
Default Value: 3 (percent)

CompressCatalogOnDisk
Compresses the size of the catalog on disk when enabled (value set to 1 or 2).
Default Value: 0
Valid Values:
- 1: Compress checkpoints, but not logs
- 2: Compress checkpoints and logs
Consider enabling this parameter if the catalog disk partition is small (<50 GB) and the metadata is large (hundreds of tables, partitions, or nodes).

CompressNetworkData
Compresses all data sent over the internal network when enabled (value set to 1). This compression speeds up network traffic at the expense of added CPU load. If the network is throttling database performance, enable compression to correct the issue.
Default Value: 0

DatabaseHeartBeatInterval
Determines the interval (in seconds) at which each node performs a health check and communicates a heartbeat. If a node does not receive a message within five times the specified interval, the node is evicted from the cluster. Setting the interval to 0 disables the feature. See Automatic Eviction of Unhealthy Nodes.
Default Value: 120

EnableCooperativeParse
Implements multi-threaded parsing capabilities on a node. You can use this parameter for both delimited and fixed-width loads. Enabled by default.
Default Value: 1

EnableDataTargetParallelism
Enables multiple threads for sorting and writing data to ROS, improving data loading performance. Enabled by default.
Default Value: 1
EnableForceOuter
Determines whether Vertica uses a table's force_outer value to implement a join. For more information, see Controlling Join Inputs.
Default Value: 0 (forced join inputs disabled)

EnableResourcePoolCPUAffinity
Aligns queries to the resource pool of the processing CPU. When disabled (value set to 0), queries run on any CPU, regardless of the CPU_AFFINITY_SET of the resource pool. Enabled by default.
Default Value: 1

EnableStorageBundling
Enables storing multiple storage container ROSes as a single file. Each ROS must be less than the size specified in MaxBundleableROSSizeKB. In environments with many small storage files, bundling improves the performance of any file-intensive operations, including backups, restores, mergeouts, and moveouts.
Default Value: 1

EnableUniquenessOptimization
Enables query optimization that is based on guaranteed uniqueness of column values. Columns that can be guaranteed to include unique values include:
- Columns that are defined with AUTO_INCREMENT or IDENTITY constraints
- Primary key columns where key constraints are enforced
- Columns that are constrained to unique values, either individually or as a set
Default Value: 1 (enabled)

EnableWithClauseMaterialization
Enables materialization of WITH clause results. When materialization is enabled, Vertica evaluates each WITH clause once and stores results in a temporary table. This parameter can be set only at the session level. For more information, see WITH Clauses in SELECT in Analyzing Data.
Default Value: 0 (disabled)

ExternalTablesExceptionsLimit
Determines the maximum number of COPY exceptions and rejections allowed when a SELECT statement references an external table. Set to -1 to remove any exceptions limit. See Validating External Tables.
Default Value: 100

FailoverToStandbyAfter
Specifies the length of time that an active standby node waits before taking the place of a failed node. This parameter takes Interval Values.
Default Value: None

FencedUDxMemoryLimitMB
Sets the maximum amount of memory, in megabytes (MB), that a fenced-mode UDF can use. If a UDF attempts to allocate more memory than this limit, that attempt triggers an exception. For more information, see Fenced Mode in Extending Vertica.
Default Value: -1 (no limit)

FlexTableDataTypeGuessMultiplier
Specifies the multiplier to use for a key value when creating a view for a flex keys table. See Setting Flex Table Parameters.
Default Value: 2.0

FlexTableRawSize
Defines the default size (in bytes) of the __raw__ column of a flex table. The maximum value is 32000000. See Setting Flex Table Parameters.
Default Value: 130000

JavaBinaryForUDx
Sets the full path to the Java executable that Vertica uses to run Java UDxs. See Installing Java on Hosts in Extending Vertica.

JavaClassPathForUDx
Sets the Java classpath for the JVM that executes Java UDxs.
Default Value: ${vertica_home}/packages/hcat/lib/*
Required Values: Must list all directories containing JAR files that Java UDxs import. See Handling Java UDx Dependencies in Extending Vertica.

MaxAutoSegColumns
Specifies the number of columns (0–1024) to segment automatically when creating auto-projections from COPY and INSERT INTO statements. Setting this parameter to zero (0) uses all columns in the hash segmentation expression.
Default Value: 32

MaxBundleableROSSizeKB
Specifies the minimum size, in kilobytes, of an independent ROS file. When EnableStorageBundling is true, Vertica bundles storage container ROS files below this size into a single file. Bundling improves the performance of any file-intensive operations, including backups, restores, mergeouts, and moveouts. If you enable storage bundling and specify this parameter with a value of 0, Vertica bundles .fdb and .pidx files without bundling other storage container files.
Default Value: 1024

MaxClientSessions
Determines the maximum number of client sessions that can run on a single node of the database. The default value allows for five additional administrative logins. These logins prevent DBAs from being locked out of the system if non-dbadmin users reach the login limit.
Default Value: 50 user logins and 5 additional administrative logins
Tip: Setting this parameter to 0 prevents new client sessions from being opened while you are shutting down the database. Restore the parameter to its original setting after you restart the database. See the section "Interrupting and Closing Sessions" in Managing Sessions.

PatternMatchAllocator
Overrides the heap memory allocator for the pattern-match library when set to 1. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect. For more information, see Regular Expression Functions.
Default Value: 0
PatternMatchingUseJit
Enables just-in-time compilation (to machine code) of regular expression pattern matching functions used in queries. Using this parameter can usually improve pattern matching performance on large tables. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect. For more information, see Regular Expression Functions.
Default Value: 1

PcreJitStackMaxSizeScaleFactor
Determines the maximum size of the Perl Compatible Regular Expressions (PCRE) just-in-time stack. The maximum stack size is PcreJitStackMaxSizeScaleFactor * 1024 * 1024 bytes.
Default Value: 32

PatternMatchStackAllocator
Overrides the stack memory allocator for the pattern-match library when set to 1. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect. For more information, see Regular Expression Functions.
Default Value: 1

SegmentAutoProjection
Determines whether auto-projections are segmented by default. Set to 0 to disable.
Default Value: 1

TerraceRoutingFactor
Specifies the terrace routing factor. The default value is deliberately set high enough that terrace routing is disabled by default, even for the largest clusters. Use the terrace routing equation to find the appropriate value for your cluster. For more information, see Terrace Routing.
Default Value: 1000.0

TransactionIsolationLevel
Changes the isolation level for the database. After modification, Vertica uses the new transaction level for every new session. Existing sessions and their transactions continue to use the original isolation level. See Change Transaction Isolation Levels.
Default Value: READ COMMITTED

TransactionMode
Specifies whether transactions are in read/write or read-only mode. Read/write is the default. Existing sessions and their transactions continue to use the original transaction mode.
Default Value: READ WRITE

Tuple Mover Parameters

These parameters control how the Tuple Mover operates.

ActivePartitionCount
Sets the number of partitions, called active partitions, that are currently being loaded. For information about how the Tuple Mover treats active (and inactive) partitions during a mergeout operation, see Understanding the Tuple Mover.
Default Value: 1
Example: ALTER DATABASE mydb SET ActivePartitionCount = 2;

MergeOutInterval
The number of seconds the Tuple Mover waits between checks for new ROS files to merge out. If ROS containers are added frequently, you may need to decrease this value.
Default Value: 600
Example: ALTER DATABASE mydb SET MergeOutInterval = 1200;

MoveOutInterval
The number of seconds the Tuple Mover waits between checks for new data in the WOS to move to ROS.
Default Value: 300
Example: ALTER DATABASE mydb SET MoveOutInterval = 600;

MoveOutMaxAgeTime
The interval (in seconds) after which the Tuple Mover is forced to write the WOS to disk. The default interval is 30 minutes.
Tip: If you had been running the force_moveout.sh script in previous releases, you no longer need to run it.
Default Value: 1800
Example: ALTER DATABASE mydb SET MoveOutMaxAgeTime = 1200;

MoveOutSizePct
The percentage of the WOS that can be filled with data before the Tuple Mover performs a moveout operation.
Default Value: 0
Example: ALTER DATABASE mydb SET MoveOutSizePct = 50;

Projection Parameters

The following configuration parameters help you manage projections.

AnalyzeRowCountInterval
Specifies how often Vertica checks the number of projection rows and whether the threshold set by ARCCommitPercentage has been crossed. For more information, see Collecting Statistics.
Default Value: 60 seconds

ARCCommitPercentage
Sets the threshold percentage of WOS to ROS rows, which determines when to aggregate projection row counts and commit the result to the catalog.
Default Value: 3 (percent)

ContainersPerProjectionLimit
Specifies how many ROS containers Vertica creates per projection before ROS pushback occurs.
Default Value: 1024
Caution: Increasing this parameter's value can cause serious degradation of database performance. Vertica strongly recommends that you not modify this parameter without first consulting with Customer Support professionals.

EnableGroupByProjections
When set to 1, you can create live aggregate projections. For more information, see Live Aggregate Projections.
Default Value: 1

EnableExprsInProjections
When set to 1, you can create projections that use expressions to calculate column values. For more information, see Aggregating Data Through Expressions.
Default Value: 1

EnableSingleNamedUnsegProjections
Determines whether replicas of an unsegmented projection all map to a single name, or conform to pre-Vertica 7.2 behavior. Set this parameter to one of the following values:
- 0: Projection names conform to pre-7.2 behavior, where each projection instance on a given node has a unique identifier that conforms to this convention: projection-basename_nodeID
- 1: All instances of a new unsegmented projection map to a single name.
For more information, see Creating Unsegmented Projections.
Default Value: 1

EnableTopKProjections
When set to 1, you can create Top-K projections that let you retrieve Top-K data quickly. For more information, see Top-K Projections.
Default Value: 1

MaxAutoSegColumns
Specifies the number of columns (0–1024) to segment automatically when creating auto-projections from COPY and INSERT INTO statements. Set to 0 to use all columns in the hash segmentation expression.
Default Value: 32

SegmentAutoProjection
Determines whether auto-projections are segmented by default. Set to 0 to disable.
Default Value: 1
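These projection parameters follow the same ALTER DATABASE pattern as the other categories. For example, to make new auto-projections unsegmented and to cap automatic segmentation at 16 columns (the values here are illustrative):

=> ALTER DATABASE mydb SET SegmentAutoProjection = 0, MaxAutoSegColumns = 16;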
Epoch Management Parameters

The following table describes the epoch management parameters for configuring Vertica.

AdvanceAHMInterval
Determines how frequently (in seconds) Vertica checks the history retention status.
Note: AdvanceAHMInterval cannot be set to a value that is less than the EpochMapInterval.
Default Value: 180 (3 minutes)
Example: ALTER DATABASE mydb SET AdvanceAHMInterval = '3600';

EpochMapInterval
Determines the granularity of mapping between epochs and time available to historical queries. When a historical query's AT TIME T request is issued, Vertica maps it to an epoch within a granularity of EpochMapInterval seconds. It similarly affects the time reported for Last Good Epoch during Failure Recovery. Note that it does not affect the internal precision of epochs themselves.
Tip: Decreasing this interval increases the number of epochs saved on disk. Therefore, consider reducing the HistoryRetentionTime parameter to limit the number of history epochs that Vertica retains.
Default Value: 180 (3 minutes)
Example: ALTER DATABASE mydb SET EpochMapInterval = '300';

HistoryRetentionTime
Determines how long deleted data is saved (in seconds) as a historical reference. When the specified time since the deletion has passed, you can purge the data. Use the -1 setting if you prefer to use HistoryRetentionEpochs to determine which deleted data can be purged.
Note: The default setting of 0 effectively prevents the use of the Administration Tools 'Roll Back Database to Last Good Epoch' option because the AHM remains close to the current epoch and a rollback is not permitted to an epoch prior to the AHM.
Tip: If you rely on the Roll Back option to remove recently loaded data, consider setting a day-wide window to remove loaded data. For example: ALTER DATABASE mydb SET HistoryRetentionTime = 86400;
Default Value: 0 (Data saved when nodes are down.)
Example: ALTER DATABASE mydb SET HistoryRetentionTime = '240';

HistoryRetentionEpochs
Specifies the number of historical epochs to save, and therefore, the amount of deleted data. Unless you have a reason to limit the number of epochs, HPE recommends that you specify the time over which deleted data is saved. If you specify both History parameters, HistoryRetentionTime takes precedence. Setting both parameters to -1 preserves all historical data. See Setting a Purge Policy.
Default Value: -1 (Disabled)
Example: ALTER DATABASE mydb SET HistoryRetentionEpochs = '40';

Monitoring Parameters

The following table describes the monitoring parameters for configuring Vertica.

SnmpTrapDestinationsList
Defines where Vertica sends traps for SNMP. See Configuring Reporting for SNMP.
Default Value: none
Example: ALTER DATABASE mydb SET SnmpTrapDestinationsList = 'localhost 162 public';

SnmpTrapsEnabled
Enables event trapping for SNMP. See Configuring Reporting for SNMP.
Default Value: 0
Example: ALTER DATABASE mydb SET SnmpTrapsEnabled = 1;

SnmpTrapEvents
Defines which events Vertica traps through SNMP. See Configuring Reporting for SNMP.
Default Value: Low Disk Space, Read Only File System, Loss of K Safety, Current Fault Tolerance at Critical Level, Too Many ROS Containers, WOS Over Flow, Node State Change, Recovery Failure, and Stale Checkpoint
Example: ALTER DATABASE mydb SET SnmpTrapEvents = 'Low Disk Space, Recovery Failure';

SyslogEnabled
Enables event trapping for syslog. See Configuring Reporting for Syslog.
Default Value: 0
Example: ALTER DATABASE mydb SET SyslogEnabled = 1;

SyslogEvents
Defines events that generate a syslog entry. See Configuring Reporting for Syslog.
Default Value: none
Example: ALTER DATABASE mydb SET SyslogEvents = 'Low Disk Space, Recovery Failure';

SyslogFacility
Defines which syslog facility Vertica uses. See Configuring Reporting for Syslog.
Default Value: user
Example: ALTER DATABASE mydb SET SyslogFacility = 'ftp';

Profiling Parameters

The following table describes the profiling parameters for configuring Vertica. See Profiling Database Performance for more information on profiling queries.

GlobalEEProfiling
Enables profiling for query execution runs in all sessions on all nodes.
Default Value: 0
Example: ALTER DATABASE mydb SET GlobalEEProfiling = 1;

GlobalQueryProfiling
Enables query profiling for all sessions on all nodes.
Default Value: 0
Example: ALTER DATABASE mydb SET GlobalQueryProfiling = 1;

GlobalSessionProfiling
Enables session profiling for all sessions on all nodes.
Default Value: 0
Example: ALTER DATABASE mydb SET GlobalSessionProfiling = 1;

Security Parameters

Use these client authentication configuration parameters and general security parameters to configure security.

EnableAllRolesOnLogin
Automatically enables all roles granted to a user once that user logs in. Enabling this eliminates the need for the user to run SET ROLE <rolenames>. Valid values are:
- 0: does not automatically enable roles
- 1: automatically enables roles
Default Value: 0

EnabledCipherSuites
Indicates which SSL cipher suites to use for secure client-server communication.
Default Value: ALL:!ADH:!LOW:!EXP:!MD5:!RC4:@STRENGTH
This setting excludes weaker cipher suites. Find a complete mapping of cipher suite names from JSSE to OpenSSL at openssl.org.

EnableSSL
Enables SSL for the server. See Implementing SSL.
Default Value: 0
Example: ALTER DATABASE mydb SET EnableSSL = '1';

RestrictSystemTables
Prohibits non-database administrator users from viewing sensitive information in system tables. Valid values are:
- 0: Allows all users to access system tables
- 1: Limits access to system tables to database administrator users
Default Value: 0
See System Table Restriction.

SecurityAlgorithm
Sets the algorithm that hash authentication uses: MD5 or SHA-512.
Default Value: 'NONE'
Examples:
ALTER DATABASE mydb SET SecurityAlgorithm = 'MD5';
ALTER DATABASE mydb SET SecurityAlgorithm = 'SHA512';

SSLCA
Sets the SSL certificate authority.
Default Value: No default value
Example: ALTER DATABASE mydb SET SSLCA = '<contents of certificate authority root.crt file>';
Include the contents of the certificate authority root.crt file, but do not include the file name.

SSLCertificate
Sets the SSL certificate. If your SSL certificate is a certificate chain, cut and paste only the top-most certificate of the certificate chain to set this value.
Default Value: No default value
Example: ALTER DATABASE mydb SET SSLCertificate = '<contents of server.crt file>';
Include the contents of the server.crt file, but do not include the file name.
Note: This parameter is set automatically during upgrade to 7.1 if you set EnableSSL=1 prior to the upgrade.

SSLPrivateKey
Specifies the server's private key. The value of this parameter is visible only to dbadmin users.
Default Value: No default value
Example: ALTER DATABASE mydb SET SSLPrivateKey = '<contents of server.key file>';
Include the contents of the server.key file, but do not include the file name.
Note: This parameter is set automatically during upgrade to 7.1 if you set EnableSSL=1 prior to the upgrade.

View parameter values with the SHOW DATABASE statement. You must be a database superuser to view the value:

SHOW DATABASE mydb SSLCertificate;

See Also
- Kerberos Authentication Parameters
- Configuring SSL

Database Designer Parameters

The following table describes the parameters for configuring the Vertica Database Designer.

DBDCorrelationSampleRowCount
Minimum number of table rows at which Database Designer discovers and records correlated columns.
Default Value: 4000
Example: ALTER DATABASE mydb SET DBDCorrelationSampleRowCount = 3000;

DBDLogInternalDesignProcess
Enables or disables Database Designer logging.
Default Value: False
Examples:
ALTER DATABASE mydb SET DBDLogInternalDesignProcess = '1';
ALTER DATABASE mydb SET DBDLogInternalDesignProcess = '0';

Internationalization Parameters

The following table describes the internationalization parameters for configuring Vertica.
DefaultIntervalStyle
Sets the default interval style to use. If set to 0 (default), the interval is in PLAIN style (the SQL standard), with no interval units on output. If set to 1, the interval is in UNITS style on output. This parameter does not take effect until the database is restarted.
Default Value: 0
Example: ALTER DATABASE mydb SET DefaultIntervalStyle = 1;

DefaultSessionLocale
Sets the default session startup locale for the database. This parameter does not take effect until the database is restarted.
Default Value: en_US@collation=binary
Example: ALTER DATABASE mydb SET DefaultSessionLocale = 'en_GB';

EscapeStringWarning
Issues a warning when backslashes are used in a string literal. This is provided to help locate backslashes that are being treated as escape characters so they can be fixed to follow the standard-conforming string syntax instead.
Default Value: 1
Example: ALTER DATABASE mydb SET EscapeStringWarning = '1';

StandardConformingStrings
Determines whether ordinary string literals ('...') treat backslashes (\) as string literals or escape characters. When set to '1', backslashes are treated as string literals; when set to '0', backslashes are treated as escape characters.
Tip: To treat backslashes as escape characters, use the Extended string syntax (E'...'). See String Literals (Character) in the SQL Reference Manual.
Default Value: 1
Example: ALTER DATABASE mydb SET StandardConformingStrings = '0';

Data Collector Parameters

The following table lists the Data Collector parameter for configuring Vertica.

EnableDataCollector
Enables and disables the Data Collector, which is the Workload Analyzer's internal diagnostics utility. Affects all sessions on all nodes. Use 0 to turn off data collection.
Default Value: 1 (Enabled)
Example: ALTER DATABASE mydb SET EnableDataCollector = 0;

For more information, see the following topics in the SQL Reference Manual:
- Data Collector Functions
- ANALYZE_WORKLOAD
- V_MONITOR.DATA_COLLECTOR
- V_MONITOR.TUNING_RECOMMENDATIONS

See also the following topics in the Administrator's Guide:
- Retaining Monitoring Information
- Analyzing Workloads
- Tuning Recommendations
- Analyzing Workloads Through Management Console and Through an API

Text Search Parameters

You can configure Vertica for text search using these parameters.

TextIndexMaxTokenLength
Controls the maximum size of a token in a text index. If the parameter is set to a value greater than 65000 characters, the tokenizer truncates the token at 65000 characters. Avoid setting this parameter near 65000 (the maximum value); doing so can result in a significant decrease in performance. For optimal performance, set the parameter to the maximum token value of your tokenizer.
Default Value: 128 characters
Example: ALTER DATABASE database_name SET TextIndexMaxTokenLength = 760;

Kerberos Authentication Parameters

The following parameters let you configure the Vertica principal for Kerberos authentication and specify the location of the Kerberos keytab file.

KerberosServiceName
Provides the service name portion of the Vertica Kerberos principal. By default, this parameter is 'vertica'. For example: vertica/host@EXAMPLE.COM

KerberosHostname
[Optional] Provides the instance or host name portion of the Vertica Kerberos principal. For example: vertica/host@EXAMPLE.COM
If you omit the optional KerberosHostname parameter, Vertica uses the return value from the gethostname() function. Assuming each cluster node has a different host name, those nodes will each have a different principal, which you must manage in that node's keytab file.

KerberosRealm
Provides the realm portion of the Vertica Kerberos principal. A realm is the authentication administrative domain and is usually formed in uppercase letters; for example: vertica/host@EXAMPLE.COM

KerberosKeytabFile
Provides the location of the keytab file that contains credentials for the Vertica Kerberos principal. By default, this file is located in /etc. For example: KerberosKeytabFile=/etc/krb5.keytab

Notes:
- The principal must take the form KerberosServiceName/KerberosHostName@KerberosRealm.
- The keytab file must be readable by the file owner who is running the process (typically the Linux dbadmin user, assigned file permissions 0600).

HCatalog Connector Parameters

The following table describes the parameters for configuring the HCatalog Connector. See Using the HCatalog Connector in Integrating with Hadoop for more information.

HCatConnectionTimeout
The number of seconds the HCatalog Connector waits for a successful connection to the WebHCat server before returning a timeout error.
Default Value: 0 (Wait indefinitely)
Requires Restart: No
Example: ALTER DATABASE mydb SET HCatConnectionTimeout = 30;

HCatSlowTransferLimit
The lowest transfer speed (in bytes per second) that the HCatalog Connector allows when retrieving data from the WebHCat server. In some cases, the data transfer rate from the WebHCat server to Vertica is below this threshold. In such cases, after the number of seconds specified in the HCatSlowTransferTime parameter pass, the HCatalog Connector cancels the query and closes the connection.
Default Value: 65536
Requires Restart: No
Example: ALTER DATABASE mydb SET HCatSlowTransferLimit = 32000;

HCatSlowTransferTime
The number of seconds the HCatalog Connector waits before testing whether the data transfer from the WebHCat server is too slow. See the HCatSlowTransferLimit parameter.
Default Value: 60
Requires Restart: No
Example: ALTER DATABASE mydb SET HCatSlowTransferTime = 90;

Note: You can override these configuration parameters when creating an HCatalog schema. See CREATE HCATALOG SCHEMA in the SQL Reference Manual for an explanation.

User-Defined Session Parameters

You can use the following parameter to configure the maximum length of a value in a user-defined session parameter.

MaxSessionUDParameterSize
Sets the maximum length of a value in a user-defined session parameter.
Default Value: 1000
Example: => ALTER SESSION SET MaxSessionUDParameterSize = 2000;
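Values for user-defined session parameters themselves are set with the ALTER SESSION SET UDPARAMETER form, the same form the AWS library parameters below use. A sketch with a hypothetical library name mylib and key debug_level:

=> ALTER SESSION SET UDPARAMETER FOR mylib debug_level = 2;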
  • 412. Constraint Enforcement Parameters The following configuration parameters enforce PRIMARY and UNIQUE key constraints. Use the ALTER DATABASE statement to set these parameters. You do not need to restart your database after setting them. l The parameter settings apply for any PRIMARY or UNIQUE key constraint that you have not explicitly enabled or disabled within a CREATE TABLE or ALTER TABLE statement. l Any new PRIMARY or UNIQUE key constraint that you create or alter is set according to the value of the corresponding parameter unless you specifically enabled or disabled the constraint. Important: Setting a constraint as enabled or disabled when you create or alter it using CREATE TABLE or ALTER TABLE overrides the parameter setting. Parameters Description EnableNewPrimaryKeysByDefaul t Set to 1 to automatically enable newly created PRIMARY KEY constraints that you specified through CREATE TABLE or ALTER TABLE statements. However, if you have explicitly disabled a constraint when you created or altered it, it is not enforced. Default Value: 0 (Disabled) Example: ALTER DATABASE mydb SET EnableNewPrimaryKeysByDefault = 1; EnableNewUniqueKeysByDefault Set to 1 to automatically enable newly created Unique constraints that you specified through CREATE TABLE or ALTER TABLE statements. However, if you have explicitly disabled a constraint when you created or altered it, it is not enforced. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 412 of 5309
Default Value: 0 (Disabled)
Example: ALTER DATABASE mydb SET EnableNewUniqueKeysByDefault = 1;

Note: Vertica recommends enabling primary key enforcement if you have enabled unique key enforcement.

Vertica Library for Amazon Web Services Parameters

Use these parameters to configure the Vertica Library for Amazon Web Services (AWS). All parameters listed are case sensitive.

aws_id
Your AWS access key ID.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='<YOUR AWS ID>';

aws_secret
Your AWS secret access key.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='<YOUR AWS KEY>';

aws_region
The region in which your S3 bucket is located. aws_region can be configured with only one region at a time. If you need to access buckets in multiple regions, you must reset the parameter each time you change regions.
Default Value: us-east-1
You can find more information about AWS regions in the Amazon documentation.
Example:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_region='<REGION IDENTIFIER>';

For more information, see the following topics in the Administrator's Guide:
• AWS Library
• Configuring Vertica AWS Library
• Export AWS Library
• Import AWS Library
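Taken together, a typical session sets all three parameters before calling any AWS library functions. The following is a minimal sketch, assuming the AWS library has already been installed and configured as described in Configuring Vertica AWS Library; the credential values are placeholders:

=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='<YOUR AWS ID>';
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='<YOUR AWS KEY>';
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_region='us-west-2';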
Designing a Logical Schema

Designing a logical schema for a Vertica database is the same as designing one for any other SQL database. A logical schema consists of objects such as schemas, tables, views, and referential integrity constraints that are visible to SQL users. Vertica supports any relational schema design that you choose.
Using Multiple Schemas

Using a single schema is effective if there is only one database user or if a few users cooperate in sharing the database. In many cases, however, it makes sense to use additional schemas to allow users and their applications to create and access tables in separate namespaces. For example, using additional schemas allows:

• Many users to access the database without interfering with one another. Individual schemas can be configured to grant specific users access to the schema and its tables while restricting others.
• Third-party applications to create tables that have the same name in different schemas, preventing table collisions.

Unlike other RDBMSs, a schema in a Vertica database is not a collection of objects bound to one user.

Multiple Schema Examples

This section provides examples of when and how you might want to use multiple schemas to separate database users. These examples fall into two categories: using multiple private schemas, and using a combination of private schemas (schemas limited to a single user) and shared schemas (schemas shared across multiple users).

Using Multiple Private Schemas

Using multiple private schemas is an effective way of separating database users from one another when sensitive information is involved. Typically, a user is granted access to only one schema and its contents, thus providing database security at the schema level. Database users can be running different applications, multiple copies of the same application, or even multiple instances of the same application. This enables you to consolidate applications on one database to reduce management overhead and use resources more effectively. The following examples highlight using multiple private schemas.

• Using Multiple Schemas to Separate Users and Their Unique Applications

  In this example, both database users work for the same company. One user (HRUser) uses a Human Resources (HR) application with access to sensitive personal data, such as salaries, while another user (MedUser) accesses information
regarding company healthcare costs through a healthcare management application. HRUser should not be able to access company healthcare cost information, and MedUser should not be able to view personal employee data.

  To grant these users access to the data they need while restricting them from data they should not see, two schemas are created with appropriate user access, as follows:

  - HRSchema: A schema owned by HRUser that is accessed by the HR application.
  - HealthSchema: A schema owned by MedUser that is accessed by the healthcare management application.

• Using Multiple Schemas to Support Multitenancy

  This example is similar to the previous one in that access to sensitive data is limited by separating users into different schemas. In this case, however, each user runs a virtual instance of the same application.

  An example of this is a retail marketing analytics company that provides data and software as a service (SaaS) to large retailers to help them determine which of the promotional methods they use are most effective at driving customer sales.

  In this example, each database user equates to a retailer, and each user has access only to its own schema. The retail marketing analytics company provides a virtual instance of the same application to each retail customer, and each instance points to the user's specific schema in which to create and update tables. The tables in these schemas use the same names because they are created by instances of the same application, but they do not conflict because they are in separate schemas.

  Examples of schemas in this database could be:

  - MartSchema: A schema owned by MartUser, a large department store chain.
  - PharmSchema: A schema owned by PharmUser, a large drug store chain.

• Using Multiple Schemas to Migrate to a Newer Version of an Application

  Using multiple schemas is an effective way of migrating to a new version of a software application. In this case, a new schema is created to support the new
version of the software, and the old schema is kept as long as necessary to support the original version of the software. This is called a "rolling application upgrade."

  For example, a company might use an HR application to store employee data. The following schemas could be used for the original and updated versions of the software:

  - HRSchema: A schema owned by HRUser, the schema user for the original HR application.
  - V2HRSchema: A schema owned by V2HRUser, the schema user for the new version of the HR application.

Using Combinations of Private and Shared Schemas

The previous examples illustrate cases in which all schemas in the database are private and no information is shared between users. However, users might want to share common data. In the retail case, for example, MartUser and PharmUser might want to compare their per-store sales of a particular product against the industry per-store sales average. Since this information is an industry average and is not specific to any retail chain, it can be placed in a schema on which both users are granted USAGE privileges. (For more information about schema privileges, see Schema Privileges.)

Examples of schemas in this database could be:

• MartSchema: A schema owned by MartUser, a large department store chain.
• PharmSchema: A schema owned by PharmUser, a large drug store chain.
• IndustrySchema: A schema owned by DBUser (from the retail marketing analytics company) on which both MartUser and PharmUser have USAGE privileges. It is unlikely that retailers would be given any privileges beyond USAGE on the schema
and SELECT on one or more of its tables.

Creating Schemas

You can create as many schemas as necessary for your database. For example, you could create a schema for each database user. However, schemas and users are not synonymous as they are in Oracle.

By default, only a superuser can create a schema or give a user the right to create a schema. (See GRANT (Database) in the SQL Reference Manual.)

To create a schema, use the CREATE SCHEMA statement, as described in the SQL Reference Manual.

Specifying Objects in Multiple Schemas

Once you create two or more schemas, each SQL statement or function must identify the schema associated with the object you are referencing. You can specify an object within multiple schemas by:

• Qualifying the object name, using the schema name and object name separated by a dot. For example, to specify MyTable, located in Schema1, qualify the name as Schema1.MyTable.
• Using a search path that includes the desired schemas when a referenced object is unqualified. By setting search paths (see Setting Search Paths), you direct Vertica to automatically search the specified schemas to find the object.
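For example, the following minimal sketch (the schema and table names are hypothetical) creates two schemas, creates a table of the same name in each, and then queries one of them by its qualified name:

=> CREATE SCHEMA Schema1;
=> CREATE SCHEMA Schema2;
=> CREATE TABLE Schema1.MyTable (id INT);
=> CREATE TABLE Schema2.MyTable (id INT);  -- same table name, no collision
=> SELECT COUNT(*) FROM Schema1.MyTable;   -- qualified reference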
Setting Search Paths

The search path is a list of schemas where Vertica looks for tables and user-defined functions (UDFs) that are referenced without a schema name. For example, if a statement references a table named Customers without naming the schema that contains the table, and the search path is public, Schema1, and Schema2, Vertica first searches the public schema for a table named Customers. If it does not find a table named Customers in public, it searches Schema1 and then Schema2.

Vertica uses the first table or UDF it finds that matches the unqualified reference. If the table or UDF is not found in any schema in the search path, Vertica reports an error.

Note: Vertica only searches for tables and UDFs in schemas to which the user has access privileges. If the user does not have access to a schema in the search path, Vertica silently skips the schema. It does not report an error or warning if the user's search path contains one or more schemas to which the user does not have access privileges. Any schemas in the search path that do not exist (for example, schemas that have been deleted since being added to the search path) are also silently ignored.

The first schema in the search path to which the user has access is called the current schema. This is the schema where Vertica creates tables if a CREATE TABLE statement does not specify a schema name.

The default schema search path is "$user", public, v_catalog, v_monitor, v_internal.

=> SHOW SEARCH_PATH;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

The $user entry in the search path is a placeholder that resolves to the current user name, and public references the public schema. The v_catalog and v_monitor schemas contain Vertica system tables, and the v_internal schema is for Vertica's internal use.

Note: Vertica always ensures that the v_catalog, v_monitor, and v_internal schemas are part of the schema search path.
The default search path has Vertica search for unqualified tables first in the user's schema, assuming that a separate schema exists for each user and that the schema is named after the user. If such a user schema does not exist, or if Vertica cannot find the table there, Vertica next searches the public schema, and then the v_catalog and v_monitor built-in schemas.

A database administrator can set a user's default search schema when creating the user by using the SEARCH_PATH parameter of the CREATE USER statement. An administrator or the user can change the user's default search path using the ALTER USER statement's SEARCH_PATH parameter. Changes made to the default search path using ALTER USER affect only new user sessions; they do not affect any current sessions.

A user can use the SET SEARCH_PATH statement to override the schema search path for the current session.

Tip: The SET SEARCH_PATH statement is equivalent in function to the CURRENT_SCHEMA statement found in some other databases.

To see the current search path, use the SHOW SEARCH_PATH statement. To view the current schema, use SELECT CURRENT_SCHEMA(). The CURRENT_SCHEMA() function also shows the resolved name of $user.

The following example demonstrates displaying and altering the schema search path for the current user session:

=> SHOW SEARCH_PATH;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", PUBLIC, v_catalog, v_monitor, v_internal
(1 row)

=> SET SEARCH_PATH TO SchemaA, "$user", public;
SET
=> SHOW SEARCH_PATH;
    name     |                           setting
-------------+------------------------------------------------------------
 search_path | SchemaA, "$user", public, v_catalog, v_monitor, v_internal
(1 row)

You can use the DEFAULT keyword to reset the schema search path to the default:

=> SET SEARCH_PATH TO DEFAULT;
SET
=> SHOW SEARCH_PATH;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

To view the default schema search path for a user, query the search_path column of the V_CATALOG.USERS system table:

=> SELECT search_path FROM USERS WHERE user_name = 'ExampleUser';
                    search_path
---------------------------------------------------
 "$user", public, v_catalog, v_monitor, v_internal
(1 row)

=> ALTER USER ExampleUser SEARCH_PATH SchemaA,"$user",public;
ALTER USER
=> SELECT search_path FROM USERS WHERE user_name = 'ExampleUser';
                        search_path
------------------------------------------------------------
 SchemaA, "$user", public, v_catalog, v_monitor, v_internal
(1 row)

=> SHOW SEARCH_PATH;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

Note that changing the default search path has no effect on the user's current session. Even using the SET SEARCH_PATH DEFAULT statement does not set the search path to the newly defined default value. It only has an effect in new sessions.

See Also
• Vertica System Tables

Creating Objects That Span Multiple Schemas

Vertica supports views or pre-join projections that reference tables across multiple schemas. For example, a user might need to compare employee salaries to industry averages. In this case, the application would query a shared schema (IndustrySchema) for salary averages in addition to its own private schema (HRSchema) for company-specific salary information.
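Such a comparison might be captured in a view similar to the following sketch; the table and column names here are hypothetical and serve only to illustrate cross-schema references:

=> CREATE VIEW HRSchema.salary_comparison AS
   SELECT e.employee_key, e.annual_salary, i.industry_avg_salary
   FROM HRSchema.employee_dim e
   JOIN IndustrySchema.salary_averages i
     ON e.job_title = i.job_title;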
Best Practice: When creating objects that span schemas, use qualified table names. This naming convention avoids confusion if the query path or table structure within the schemas changes at a later date.

Tables in Schemas

In Vertica, you can create base (anchor) tables and temporary tables, depending on your objective. Base tables are created in the Vertica logical schema, while temporary tables are useful for dividing complex query processing into multiple steps. For more information, see Creating Tables and Creating Temporary Tables.

About Base Tables

The CREATE TABLE statement creates a table in the Vertica logical schema. The example databases described in Getting Started include sample SQL scripts that demonstrate this procedure. For example:

CREATE TABLE vendor_dimension (
    vendor_key        INTEGER NOT NULL PRIMARY KEY,
    vendor_name       VARCHAR(64),
    vendor_address    VARCHAR(64),
    vendor_city       VARCHAR(64),
    vendor_state      CHAR(2),
    vendor_region     VARCHAR(32),
    deal_size         INTEGER,
    last_deal_update  DATE
);

Automatic Projection Creation

To get your database up and running quickly, Vertica automatically creates a default projection for each table created through the CREATE TABLE and CREATE TEMPORARY TABLE statements. Each projection created automatically (or manually)
includes a base projection name prefix. You must use the projection prefix when altering or dropping a projection (ALTER PROJECTION RENAME, DROP PROJECTION).

How you use CREATE TABLE determines when the projection is created:

• If you create a table without providing projection-related clauses, Vertica automatically creates a superprojection for the table when you load data into the table for the first time with INSERT or COPY. The projection is created in the same schema as the table. After Vertica creates the projection, it loads the data.
• If you use CREATE TABLE to create a table from the results of a query (CREATE TABLE AS SELECT), the projection is created immediately after the table, and uses some of the properties of the underlying SELECT query.
• (Advanced users only) If CREATE TABLE includes any of the following parameters, the default projection is created immediately on table creation, using the specified properties:
  - column-definition (ENCODING encoding-type and ACCESSRANK integer)
  - ORDER BY table-column
  - hash-segmentation-clause
  - UNSEGMENTED { NODE node | ALL NODES }
  - KSAFE

Note: Before you define a superprojection as described above, see Creating Custom Designs in the Administrator's Guide.

See Also
• Creating Base Tables
• CREATE TABLE

About Temporary Tables

You create temporary tables with the CREATE TEMPORARY TABLE statement. Temporary tables can be used to divide complex query processing into multiple steps. Typically, a reporting tool holds intermediate results while reports are generated; for
example, the tool first gets a result set, then queries the result set, and so on. You can also write subqueries.

Note: By default, all temporary table data is discarded when a COMMIT statement ends the current transaction. If CREATE TEMPORARY TABLE includes the parameter ON COMMIT PRESERVE ROWS, table data is retained until the current session ends.

Global Temporary Tables

Vertica creates global temporary tables in the public schema, with the data contents private to the transaction or session through which data is inserted. Global temporary table definitions are accessible to all users and sessions, so that two (or more) users can access the same global table concurrently. However, whenever a user commits or rolls back a transaction, or ends the session, Vertica removes the global temporary table data automatically, so users see only data specific to their own transactions or sessions. Global temporary table definitions persist in the database catalogs until they are removed explicitly through a DROP TABLE statement.

Local Temporary Tables

Local temporary tables are created in the V_TEMP_SCHEMA namespace and inserted into the user's search path transparently. Each local temporary table is visible only to the user who creates it, and only for the duration of the session in which the table is created. When the session ends, Vertica automatically drops the table definition from the database catalogs. You cannot preserve non-empty, session-scoped temporary tables using the ON COMMIT PRESERVE ROWS statement.

Creating local temporary tables is significantly faster than creating regular tables, so you should make use of them whenever possible.

Automatic Projection Creation and Characteristics

Vertica creates auto-projections for temporary tables when you load or insert data. The default auto-projection for a temporary table has the following characteristics:

• It is a superprojection.
• It uses the default encoding-type AUTO.
• It is automatically segmented on the table's first several columns.
• Unless the table specifies otherwise, the projection's KSAFE value is set to the current system K-safety level.

Auto-projections are defined by the table properties and creation methods, as follows:

Created from input stream (COPY or INSERT INTO)
  Sort order:   Same as input stream, if sorted.
  Segmented on: PK column (if any), all FK columns (if any), then the first 31 configurable columns of the table.

Created from CREATE TABLE AS SELECT query
  Sort order:   Same as input stream, if sorted.
  Segmented on: Same segmentation columns, if query output is segmented; same as load, if query output is unsegmented or unknown.

FK and PK constraints
  Sort order:   FK first, then PK columns.
  Segmented on: PK columns.

FK constraints only
  Sort order:   FK first, then remaining columns.
  Segmented on: Small data type (< 8 byte) columns first, then large data type columns.

PK constraints only
  Sort order:   PK columns.
  Segmented on: PK columns.

No FK or PK constraints
  Sort order:   On all columns.
  Segmented on: Small data type (< 8 byte) columns first, then large data type columns.

As an advanced user, you can modify the default projection created through the CREATE TEMPORARY TABLE statement by setting one or more of the following parameters:

• column-definition (temp table) (ENCODING encoding-type and ACCESSRANK integer)
• ORDER BY table-column
• hash-segmentation-clause
• UNSEGMENTED { NODE node | ALL NODES }
• NO PROJECTION
Note: Before you define the superprojection in this manner, read Creating Custom Designs in the Administrator's Guide.

See Also
• Creating Temporary Tables
• CREATE TEMPORARY TABLE
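As a minimal sketch of the statement itself (the table and column names are hypothetical), the following creates a global temporary table whose rows survive COMMIT for the life of the session:

=> CREATE GLOBAL TEMPORARY TABLE report_stage (
       customer_key INTEGER,
       total_sales  NUMERIC(12,2)
   )
   ON COMMIT PRESERVE ROWS;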
Creating a Database Design

A design is a physical storage plan that optimizes query performance. Data in Vertica is physically stored in projections. When you initially load data into a table using INSERT, COPY, or COPY LOCAL, Vertica creates a default superprojection for the table. This superprojection ensures that all of the data is available for queries. However, these superprojections might not optimize database performance, resulting in slow query performance and low data compression.

To improve performance, create a design for your Vertica database that optimizes query performance and data compression. You can create a design in several ways:

• Use Database Designer, a tool that recommends a design for optimal performance.
• Manually create a design.
• Use Database Designer to create an initial design and then manually modify it.

Database Designer can help you minimize how much time you spend on manual database tuning. You can also use Database Designer to redesign the database incrementally as requirements such as workloads change over time.

Database Designer runs as a background process. This is useful if you have a large design that you want to run overnight. An active SSH session is not required, so design and deploy operations continue to run uninterrupted if the session ends.

Tip: HPE recommends that you first globally optimize your database using the Comprehensive setting in Database Designer. If the performance of the comprehensive design is not adequate, you can design custom projections, using an incremental design or manually, as described in Creating Custom Designs.

About Database Designer

Vertica Database Designer uses sophisticated strategies to create a design that provides excellent performance for ad hoc queries and specific queries while using disk space efficiently. During the design process, Database Designer analyzes the logical schema definition, sample data, and sample queries, and creates a physical schema (projections) in the form of a SQL script that you deploy automatically or manually. This script creates a minimal set of superprojections to ensure K-safety.
In most cases, the projections that Database Designer creates provide excellent query performance within physical constraints while using disk space efficiently.

General Design Options

When you run Database Designer, several general options are available:

• Create a comprehensive or incremental design.
• Optimize for query execution, load, or a balance of both.
• Require K-safety.
• Recommend unsegmented projections when feasible.
• Analyze statistics before creating the design.

Design Input

Database Designer bases its design on the following information that you provide:

• Design queries that you typically run during normal database operations.
• Design tables that contain sample data.

Output

Database Designer yields the following output:

• A design script that creates the projections for the design in a way that meets the optimization objectives and distributes data uniformly across the cluster.
• A deployment script that creates and refreshes the projections for your design. For comprehensive designs, the deployment script contains commands that remove non-optimized projections. The deployment script includes the full design script.
• A backup script that contains SQL statements to deploy the design that existed on the system before deployment. This file is useful in case you need to revert to the pre-deployment design.

Design Restrictions

Database Designer-generated designs:

• Exclude live aggregate or Top-K projections. You must create these manually. See CREATE PROJECTION (Live Aggregate Projections).
• Do not sort, segment, or partition projections on LONG VARBINARY and LONG VARCHAR columns.

Post-Design Options

While running Database Designer, you can choose to deploy your design automatically after the deployment script is created, or to deploy it manually after you have reviewed and tested the design. Vertica recommends that you test the design on a non-production server before deploying the design to your production server.

How Database Designer Creates a Design

Design Recommendations

Database Designer-generated designs can include the following recommendations:

• Sort buddy projections in the same order, which can significantly improve load, recovery, and site node performance. All buddy projections have the same base name so that they can be identified as a group.

  Note: If you manually create projections, Database Designer recommends a buddy with the same sort order, if one does not already exist. By default, Database Designer recommends both super and non-super segmented projections with a buddy of the same sort order and segmentation.

• Accepts unlimited queries for a comprehensive design.
• Allows you to analyze column correlations. Correlation analysis typically only needs to be performed once, and only if the table has more than DBDCorrelationSampleRowCount (default: 4000) rows. By default, Database Designer does not analyze column correlations. To set the correlation analysis mode, use DESIGNER_SET_ANALYZE_CORRELATIONS_MODE.
• Identifies similar design queries and assigns them a signature. For queries with the same signature, Database Designer weights the queries, depending on how many queries have that signature. It then considers the weighted query when creating a design.
• Recommends and creates projections in a way that minimizes data skew by distributing data uniformly across the cluster.
• Produces higher-quality designs by considering UPDATE, DELETE, and SELECT statements.

Who Can Run Database Designer

To use Administration Tools to run Database Designer and create an optimal database design, you must be a DBADMIN user.

To run Database Designer programmatically or using Management Console, you must be one of two types of users:

• A DBADMIN user
• A user who has been assigned the DBDUSER role and owns the database tables for which you are creating a design

Granting and Enabling the DBDUSER Role

For a non-DBADMIN user to be able to run Database Designer using Management Console, follow the steps described in Allowing the DBDUSER to Run Database Designer Using Management Console.

For a non-DBADMIN user to be able to run Database Designer programmatically, follow the steps described in Allowing the DBDUSER to Run Database Designer Programmatically.

Important: When you grant the DBDUSER role, make sure to associate a resource pool with that user to manage resources during Database Designer runs. (For instructions on how to associate a resource pool with a user, see User Profiles.) Multiple users can run Database Designer concurrently without interfering with each other or using up all the cluster resources. When a user runs Database Designer, either using Management Console or programmatically, execution is mostly contained by the user's resource pool, but may spill over into system resource pools for less-intensive tasks.

Allowing the DBDUSER to Run Database Designer Using Management Console

To allow a user with the DBDUSER role to run Database Designer using Management Console, you first need to create the user on the Vertica server.
As DBADMIN, take these steps on the server:

1. Add a temporary folder to all cluster nodes.
   => CREATE LOCATION '/tmp/dbd' ALL NODES;
2. Create the user who needs access to Database Designer.
   => CREATE USER new_user;
3. Grant the user the privilege to create schemas on the database for which they want to create a design.
   => GRANT CREATE ON DATABASE new_database TO new_user;
4. Grant the DBDUSER role to the new user.
   => GRANT DBDUSER TO new_user;
5. On all nodes in the cluster, grant the user access to the temporary folder.
   => GRANT ALL ON LOCATION '/tmp/dbd' TO new_user;
6. Grant the new user access to the database schema and its tables.
   => GRANT ALL ON SCHEMA user_schema TO new_user;
   => GRANT ALL ON ALL TABLES IN SCHEMA user_schema TO new_user;

After you have completed this task, you need to do the following to map the MC user to the new_user you created in the previous steps:

1. Log in to Management Console as an MC Super user.
2. Click MC Settings.
3. Click User Management.
4. To create a new MC user, click Add. To use an existing MC user, select the user and click Edit.
5. Next to the DB access level window, click Add.
6. In the Add Permissions window, do the following:
   a. From the Choose a database drop-down list, select the database for which you want the user to be able to create a design.
   b. In the Database username field, enter the user name you created on the Vertica server, new_user in this example.
   c. In the Database password field, enter the password for the database you selected in step a.
   d. In the Restrict access drop-down list, select the level of MC user you want for this user.
7. Click OK to save your changes.
8. Log out of the MC Super user account.

The MC user is now mapped to the user that you created on the Vertica server. Log in as the MC user and use Database Designer to create an optimized design for your database.

For more information about mapping MC users, see Mapping an MC User to a Database user's Privileges.

Allowing the DBDUSER to Run Database Designer Programmatically

To allow a user with the DBDUSER role to run Database Designer programmatically, take these steps:

1. The DBADMIN user must grant the DBDUSER role:

   => GRANT DBDUSER TO <username>;

   This role persists until the DBADMIN user revokes it.

2. For the non-DBADMIN user to run Database Designer programmatically or using Management Console, one of the following must happen first:

   - If the user's default role is already DBDUSER, skip this step. Otherwise, the user must enable the DBDUSER role:
     => SET ROLE DBDUSER;

   - The DBADMIN must add DBDUSER as the default role for that user:

     => ALTER USER <username> DEFAULT ROLE DBDUSER;

DBDUSER Capabilities and Limitations

The DBDUSER role has the following capabilities and limitations:

• A DBDUSER cannot create a design with a K-safety less than the system K-safety. If a design violates the current K-safety by not having enough buddy projections for the tables, the design does not complete.
• A DBDUSER cannot explicitly change the ancient history mark (AHM), even during deployment of their design.

When you create a design, you automatically have privileges to manipulate the design. Other tasks may require that the DBDUSER have additional privileges:

To add design tables, the DBDUSER must have:
• USAGE privilege on the design table schema
• OWNER privilege on the design table

To add a single design query, the DBDUSER must have:
• Privilege to execute the design query

To add a file of design queries, the DBDUSER must have:
• Read privilege on the storage location that contains the query file
• Privilege to execute all the queries in the file

To add design queries from the result of a user query, the DBDUSER must have:
• Privilege to execute the user query
• Privilege to execute each design query retrieved from the results of the user query

To create the design and deployment scripts, the DBDUSER must have:
• WRITE privilege on the storage location of the design script
• WRITE privilege on the storage location of the
  deployment script

Workflow for Running Database Designer

Vertica provides three ways to run Database Designer:

• Using Management Console to Create a Design
• Using Administration Tools to Create a Design
• About Running Database Designer Programmatically

The following workflow is common to all these ways to run Database Designer:
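In programmatic form, the workflow reduces to a sequence of DESIGNER_* function calls. The following is a rough sketch only: the design name, table specification, and file paths are hypothetical, and the exact parameter lists for these functions should be confirmed against the SQL Reference Manual:

=> SELECT DESIGNER_CREATE_DESIGN('vmart_design');                       -- create the design
=> SELECT DESIGNER_ADD_DESIGN_TABLES('vmart_design', 'public.*');      -- add design tables
=> SELECT DESIGNER_ADD_DESIGN_QUERIES('vmart_design',
          '/tmp/dbd/queries.sql', 'true');                             -- add design queries
=> SELECT DESIGNER_SET_OPTIMIZATION_OBJECTIVE('vmart_design', 'QUERY'); -- set design options
=> SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY('vmart_design',
          '/tmp/dbd/vmart_design.sql', '/tmp/dbd/vmart_deploy.sql');   -- build and deploy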
Logging Projection Data for Database Designer

When you run Database Designer, the Optimizer proposes a set of ideal projections based on the options that you specify. When you deploy the design, Database Designer creates the design based on these projections. However, space or budget constraints
may prevent Database Designer from creating all the proposed projections. In addition, Database Designer may not be able to implement the projections using ideal criteria.

To get information about the projections, first enable the Database Designer logging capability. When enabled, Database Designer stores information about the proposed projections in two Data Collector tables. After Database Designer deploys the design, these logs contain information about which proposed projections were actually created. After deployment, the logs contain information about:

• Projections that the Optimizer proposed
• Projections that Database Designer actually created when the design was deployed
• Projections that Database Designer created, but not with the ideal criteria that the Optimizer identified
• The DDL used to create all the projections
• Column optimizations

If you do not deploy the design immediately, review the log to determine whether you want to make any changes. If the design has been deployed, you can still manually create some of the projections that Database Designer did not create.

To enable the Database Designer logging capability, see Enabling Logging for Database Designer.

To view the logged information, see Viewing Database Designer Logs.

Enabling Logging for Database Designer

By default, Database Designer does not log information about the projections that the Optimizer proposes and Database Designer deploys.

To enable Database Designer logging, enter the following command:

=> ALTER DATABASE mydb SET DBDLogInternalDesignProcess = 1;

To disable Database Designer logging, enter the following command:

=> ALTER DATABASE mydb SET DBDLogInternalDesignProcess = 0;

For more information about logging, see:
• Logging Projection Data for Database Designer
• Viewing Database Designer Logs

Viewing Database Designer Logs

You can find data about the projections that Database Designer considered and deployed in two Data Collector tables:

• DC_DESIGN_PROJECTION_CANDIDATES
• DC_DESIGN_QUERY_PROJECTION_CANDIDATES

DC_DESIGN_PROJECTION_CANDIDATES

The DC_DESIGN_PROJECTION_CANDIDATES table contains information about all the projections that the Optimizer proposed. This table also includes the DDL that creates them. The is_a_winner field indicates whether the projection was part of the actual deployed design. To view the DC_DESIGN_PROJECTION_CANDIDATES table, enter:

=> SELECT * FROM DC_DESIGN_PROJECTION_CANDIDATES;

DC_DESIGN_QUERY_PROJECTION_CANDIDATES

The DC_DESIGN_QUERY_PROJECTION_CANDIDATES table lists plan features for all design queries. Possible features are:

• FULLY DISTRIBUTED JOIN
• MERGE JOIN
• GROUPBY PIPE
• FULLY DISTRIBUTED GROUPBY
• RLE PREDICATE
• VALUE INDEX PREDICATE
• LATE MATERIALIZATION
For all design queries, the DC_DESIGN_QUERY_PROJECTION_CANDIDATES table includes the following plan feature information:

• Optimizer path cost
• Database Designer benefits
• Ideal plan feature and its description, which identifies how the referenced projection should be optimized
• If the design was deployed, the actual plan feature and its description, which identifies how the referenced projection was actually optimized

Because most projections have multiple optimizations, each projection usually has multiple rows. To view the DC_DESIGN_QUERY_PROJECTION_CANDIDATES table, enter:

=> SELECT * FROM DC_DESIGN_QUERY_PROJECTION_CANDIDATES;

To see example data from these tables, see Database Designer Logs: Example Data.

Database Designer Logs: Example Data

In the following example, Database Designer created the logs after creating a comprehensive design for the VMart sample database. The output shows two records from the DC_DESIGN_PROJECTION_CANDIDATES table.

The first record contains information about the customer_dimension_dbd_1_sort_$customer_gender$__$annual_income$ projection. The record includes the CREATE PROJECTION statement that Database Designer used to create the projection. The is_a_winner column is t, indicating that Database Designer created this projection when it deployed the design.

The second record contains information about the product_dimension_dbd_2_sort_$product_version$__$product_key$ projection. For this projection, the is_a_winner column is f. The Optimizer recommended that Database Designer create this projection as part of the design. However, Database Designer did not create the projection when it deployed the design. The log includes the DDL for the CREATE PROJECTION statement. If you want to add the projection manually, you can use that DDL. For more information, see Creating a Design Manually.
=> SELECT * FROM dc_design_projection_candidates;
-[ RECORD 1 ]--------+---------------------------------------------------------------
time                 | 2014-04-11 06:30:17.918764-07
node_name            | v_vmart_node0001
session_id           | localhost.localdoma-931:0x1b7
user_id              | 45035996273704962
user_name            | dbadmin
design_id            | 45035996273705182
design_table_id      | 45035996273720620
projection_id        | 45035996273726626
iteration_number     | 1
projection_name      | customer_dimension_dbd_1_sort_$customer_gender$__$annual_income$
projection_statement | CREATE PROJECTION v_dbd_sarahtest_sarahtest."customer_dimension_dbd_1_sort_$customer_gender$__$annual_income$"
(
 customer_key ENCODING AUTO,
 customer_type ENCODING AUTO,
 customer_name ENCODING AUTO,
 customer_gender ENCODING RLE,
 title ENCODING AUTO,
 household_id ENCODING AUTO,
 customer_address ENCODING AUTO,
 customer_city ENCODING AUTO,
 customer_state ENCODING AUTO,
 customer_region ENCODING AUTO,
 marital_status ENCODING AUTO,
 customer_age ENCODING AUTO,
 number_of_children ENCODING AUTO,
 annual_income ENCODING AUTO,
 occupation ENCODING AUTO,
 largest_bill_amount ENCODING AUTO,
 store_membership_card ENCODING AUTO,
 customer_since ENCODING AUTO,
 deal_stage ENCODING AUTO,
 deal_size ENCODING AUTO,
 last_deal_update ENCODING AUTO
)
AS
SELECT customer_key,
 customer_type,
 customer_name,
 customer_gender,
 title,
 household_id,
 customer_address,
 customer_city,
 customer_state,
 customer_region,
 marital_status,
 customer_age,
 number_of_children,
 annual_income,
 occupation,
 largest_bill_amount,
 store_membership_card,
 customer_since,
 deal_stage,
 deal_size,
 last_deal_update
FROM public.customer_dimension
ORDER BY customer_gender,
 annual_income
UNSEGMENTED ALL NODES;
is_a_winner          | t
-[ RECORD 2 ]--------+-------------------------------------------------------------
time                 | 2014-04-11 06:30:17.961324-07
node_name            | v_vmart_node0001
session_id           | localhost.localdoma-931:0x1b7
user_id              | 45035996273704962
user_name            | dbadmin
design_id            | 45035996273705182
design_table_id      | 45035996273720624
projection_id        | 45035996273726714
iteration_number     | 1
projection_name      | product_dimension_dbd_2_sort_$product_version$__$product_key$
projection_statement | CREATE PROJECTION v_dbd_sarahtest_sarahtest."product_dimension_dbd_2_sort_$product_version$__$product_key$"
(
 product_key ENCODING AUTO,
 product_version ENCODING RLE,
 product_description ENCODING AUTO,
 sku_number ENCODING AUTO,
 category_description ENCODING AUTO,
 department_description ENCODING AUTO,
 package_type_description ENCODING AUTO,
 package_size ENCODING AUTO,
 fat_content ENCODING AUTO,
 diet_type ENCODING AUTO,
 weight ENCODING AUTO,
 weight_units_of_measure ENCODING AUTO,
 shelf_width ENCODING AUTO,
 shelf_height ENCODING AUTO,
 shelf_depth ENCODING AUTO,
 product_price ENCODING AUTO,
 product_cost ENCODING AUTO,
 lowest_competitor_price ENCODING AUTO,
 highest_competitor_price ENCODING AUTO,
 average_competitor_price ENCODING AUTO,
 discontinued_flag ENCODING AUTO
)
AS
SELECT product_key,
 product_version,
 product_description,
 sku_number,
 category_description,
 department_description,
 package_type_description,
 package_size,
 fat_content,
 diet_type,
 weight,
 weight_units_of_measure,
 shelf_width,
 shelf_height,
 shelf_depth,
 product_price,
 product_cost,
 lowest_competitor_price,
 highest_competitor_price,
 average_competitor_price,
 discontinued_flag
FROM public.product_dimension
ORDER BY product_version,
 product_key
UNSEGMENTED ALL NODES;
is_a_winner          | f
.
.
.

The next example shows the contents of two records in the DC_DESIGN_QUERY_PROJECTION_CANDIDATES table. Both of these rows apply to projection id 45035996273726626.

In the first record, the Optimizer recommends that Database Designer optimize the customer_gender column for the GROUPBY PIPE algorithm.

In the second record, the Optimizer recommends that Database Designer optimize the public.customer_dimension table for late materialization. Late materialization can improve the performance of joins that might spill to disk.

=> SELECT * FROM dc_design_query_projection_candidates;
-[ RECORD 1 ]------------------+------------------------------------------------------------
time                           | 2014-04-11 06:30:17.482377-07
node_name                      | v_vmart_node0001
session_id                     | localhost.localdoma-931:0x1b7
user_id                        | 45035996273704962
user_name                      | dbadmin
design_id                      | 45035996273705182
design_query_id                | 3
iteration_number               | 1
design_table_id                | 45035996273720620
projection_id                  | 45035996273726626
ideal_plan_feature             | GROUP BY PIPE
ideal_plan_feature_description | Group-by pipelined on column(s) customer_gender
dbd_benefits                   | 5
opt_path_cost                  | 211
-[ RECORD 2 ]------------------+------------------------------------------------------------
time                           | 2014-04-11 06:30:17.48276-07
node_name                      | v_vmart_node0001
session_id                     | localhost.localdoma-931:0x1b7
user_id                        | 45035996273704962
user_name                      | dbadmin
design_id                      | 45035996273705182
design_query_id                | 3
iteration_number               | 1
design_table_id                | 45035996273720620
projection_id                  | 45035996273726626
ideal_plan_feature             | LATE MATERIALIZATION
ideal_plan_feature_description | Late materialization on table public.customer_dimension
dbd_benefits                   | 4
opt_path_cost                  | 669
.
.
.

You can view the actual plan features that Database Designer implemented for the projections it created. To do so, query the V_INTERNAL.DC_DESIGN_QUERY_PROJECTIONS table:

=> select * from v_internal.dc_design_query_projections;
-[ RECORD 1 ]-------------------+-------------------------------------------------------------
time                            | 2014-04-11 06:31:41.19199-07
node_name                       | v_vmart_node0001
session_id                      | localhost.localdoma-931:0x1b7
user_id                         | 45035996273704962
user_name                       | dbadmin
design_id                       | 45035996273705182
design_query_id                 | 1
projection_id                   | 2
design_table_id                 | 45035996273720624
actual_plan_feature             | RLE PREDICATE
actual_plan_feature_description | RLE on predicate column(s) department_description
dbd_benefits                    | 2
opt_path_cost                   | 141
-[ RECORD 2 ]-------------------+-------------------------------------------------------------
time                            | 2014-04-11 06:31:41.192292-07
node_name                       | v_vmart_node0001
session_id                      | localhost.localdoma-931:0x1b7
user_id                         | 45035996273704962
user_name                       | dbadmin
design_id                       | 45035996273705182
design_query_id                 | 1
projection_id                   | 2
design_table_id                 | 45035996273720624
actual_plan_feature             | GROUP BY PIPE
actual_plan_feature_description | Group-by pipelined on column(s) fat_content
dbd_benefits                    | 5
opt_path_cost                   | 155

Specifying Parameters for Database Designer

Before you run Database Designer to create a design, provide the information that allows Database Designer to create the optimal physical schema:

• Design Name
• Design Types
• Optimization Objectives
• Design Tables with Sample Data
• Design Queries
• K-safety
• Replicated and Segmented Projections
• Statistics Analysis

Design Name

All designs that Database Designer creates must have a name that you specify. The design name must consist of alphanumeric or underscore (_) characters and can be no more than 32 characters long. (Administrative Tools and Management Console limit the design name to 16 characters.)

The design name becomes part of the files that Database Designer generates, including the deployment script, allowing the files to be easily associated with a particular Database Designer run.

Design Types

Database Designer can create two distinct design types. The design you choose depends on what you are trying to accomplish:

• Comprehensive Design
• Incremental Design

Comprehensive Design

A comprehensive design creates an initial or replacement design for all the tables in the specified schemas. Create a comprehensive design when you are creating a new database.

To help Database Designer create an efficient design, load representative data into the tables before you begin the design process. When you load data into a table, Vertica creates an unoptimized superprojection so that Database Designer has projections to optimize. If a table has no data, Database Designer cannot optimize it.

Optionally, supply Database Designer with representative queries that you plan to use so Database Designer can optimize the design for them. If you do not supply any queries, Database Designer creates a generic optimization of the superprojections that minimizes storage, with no query-specific projections.

During a comprehensive design, Database Designer creates deployment scripts that:
• Create new projections to optimize query performance, only when they do not already exist.
• Create replacement buddy projections when Database Designer changes the encoding of pre-existing projections that it has decided to keep.

Incremental Design

An incremental design creates an enhanced design with additional projections, if required, that are optimized specifically for the queries that you provide. Create an incremental design when you have one or more queries that you want to optimize.

Optimization Objectives

When creating a design, Database Designer can optimize the design for one of three objectives:

• Load: Database Designer creates a design that is optimized for loads, minimizing database size, potentially at the expense of query performance.
• Performance: Database Designer creates a design that is optimized for fast query performance. Because it recommends a design for optimized query performance, this design might recommend more projections than the Load or Balanced objectives, potentially resulting in a larger database storage size.

  Note: A fully optimized query has an optimization ratio of 0.99. The optimization ratio is the ratio of a query's benefits achieved in the design produced by Database Designer to those achieved in the ideal plan. Check the optimization ratio with the OptRatio parameter in designer.log.

• Balanced: Database Designer creates a design whose objectives are balanced between database size and query performance.

Design Tables with Sample Data

You must specify one or more design tables for Database Designer to deploy a design. If your schema is empty, it does not appear as a design table option.

When you specify design tables, consider the following:

• To create the most efficient projections for your database, load a moderate amount of representative data into tables before running Database Designer.
Database Designer considers the data in these tables when creating the design.
• If your design tables have a large amount of data, the Database Designer run takes a long time; if your tables have too little data, the design is not optimized. Vertica recommends about 10 GB of sample data, which is sufficient for creating an optimal design.
• If you submit a design table with no data, Database Designer ignores it.
• If one of your design tables has been dropped, you will not be able to build or deploy your design.

Design Queries

If you supply representative queries that you run on your database to Database Designer, it optimizes the performance of those queries. Database Designer checks the validity of all queries when you add them to your design and again when it builds the design. If a query is invalid, Database Designer ignores it.

The query file can contain up to 100 queries. Each query can be assigned a weight that indicates its relative importance so that Database Designer can prioritize it when creating the design. Database Designer groups design queries that affect the design in the same way and considers them as one weighted query when creating a design.

The following options apply, depending on whether you create an incremental or comprehensive design:

• Design queries are required for incremental designs.
• Design queries are optional for comprehensive designs. If you do not provide design queries, Database Designer recommends a generic design that does not consider specific queries.

Query Repository

Using Management Console, you can submit design queries from the QUERY_REQUESTS system table. This is called the query repository. The QUERY_REQUESTS table contains queries that users have run recently.

For a comprehensive design, you can submit up to 200 queries from the QUERY_REQUESTS table to Database Designer to be considered when creating the design.
For an incremental design, you can submit up to 100 queries from the QUERY_REQUESTS table.

Replicated and Segmented Projections

When creating a comprehensive design, Database Designer creates projections based on data statistics and queries. It also reviews the submitted design tables to decide whether projections should be segmented (distributed across the cluster nodes) or replicated (duplicated on all cluster nodes).

For detailed information, see the following sections:

• Replicated Projections
• Segmented Projections

Replicated Projections

Replication occurs when Vertica stores identical copies of data across all nodes in a cluster. If you are running on a single-node database, all projections are replicated because segmentation is not possible in a single-node database.

Assuming that largest-row-count equals the number of rows in the design table with the largest number of rows, Database Designer recommends that a projection be replicated if any of the following conditions is true:

• largest-row-count < 1,000,000 and the number of rows in the table <= 10% of largest-row-count
• largest-row-count >= 10,000,000 and the number of rows in the table <= 1% of largest-row-count
• The number of rows in the table <= 100,000

For more information about replication, see High Availability With Projections in Vertica Concepts.

Segmented Projections

Segmentation occurs when Vertica distributes data evenly across multiple database nodes so that all nodes participate in query execution. Projection segmentation provides high availability and recovery, and optimizes query execution.

When running Database Designer programmatically or using Management Console, you can specify to allow Database Designer to recommend unsegmented projections in
the design. If you do not specify this, Database Designer recommends only segmented projections.

Database Designer recommends segmented superprojections for large tables when deploying to multiple-node clusters, and recommends replicated superprojections for smaller tables.

Database Designer does not segment projections on:

• Single-node clusters
• LONG VARCHAR and LONG VARBINARY columns

For more information about segmentation, see High Availability With Projections in Vertica Concepts.

Statistics Analysis

By default, Database Designer analyzes statistics for the design tables when adding them to the design. This step is optional, but Vertica recommends that you analyze statistics because accurate statistics help Database Designer optimize compression and query performance.

Analyzing statistics takes time and resources. If the current statistics for the design tables are up to date, you can skip this step. When in doubt, analyze the statistics to make sure they are current.

For more information, see Collecting Statistics.

Building a Design

After you have created design tables, loaded data into them, and specified the parameters you want Database Designer to use when creating the physical schema, direct Database Designer to create the scripts necessary to build the design.

Note: You cannot stop a running database if Database Designer is building a database design.

When you build a database design, Vertica generates two scripts:

• Deployment script (<design_name>_deploy.sql): Contains the SQL statements that create projections for the design you are deploying, deploy the design, and drop unused projections. When the deployment script runs, it creates the optimized
design. For details about how to run this script and deploy the design, see Deploying a Design.
• Design script (<design_name>_design.sql): Contains the CREATE PROJECTION statements that Database Designer uses to create the design. Review this script to make sure you are happy with the design. The design script is a subset of the deployment script. It serves as a backup of the DDL for the projections that the deployment script creates.

If you run Database Designer from Administrative Tools, Vertica also creates a backup script named <design_name>_projection_backup_<unique id #>.sql. This script contains SQL statements to deploy the design that existed on the system before deployment. This file is useful in case you need to revert to the pre-deployment design.

When you create a design using Management Console:

• If you submit a large number of queries to your design and build it immediately, a timing issue could cause the queries not to load before deployment starts. If this occurs, you may see one of the following errors:

  - No queries to optimize for
  - No tables to design projections for

  To accommodate this timing issue, you may need to reset the design, check the Queries tab to make sure the queries have been loaded, and then rebuild the design. Detailed instructions are in:

  - Using the Wizard to Create a Design
  - Creating a Design Manually

• The scripts are deleted when deployment completes. To save a copy of the deployment script after the design is built but before the deployment completes, go to the Output window and copy and paste the SQL statements to a file.

Resetting a Design

You must reset a design when:
• You build a design and the output scripts described in Building a Design are not created.
• You build a design, but Database Designer cannot complete the design because the queries it expects are not loaded.

Resetting a design discards all the run-specific information of the previous Database Designer build but retains its configuration (design type, optimization objectives, K-safety, and so on) and its tables and queries.

After you reset a design, review the design to see what changes you need to make. For example, you can fix errors, change parameters, or check for and add additional tables or queries. Then you can rebuild the design.

You can only reset a design in Management Console or by using the DESIGNER_RESET_DESIGN function.
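For example, assuming a design named vmart_design already exists in the design workspace, the following call discards that design's run-specific data so the design can be rebuilt:

=> SELECT DESIGNER_RESET_DESIGN('vmart_design');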
Deploying a Design

After running Database Designer to generate a deployment script, Vertica recommends that you test your design on a non-production server before you deploy it to your production server.

Both the design and deployment processes run in the background. This is useful if you have a large design that you want to run overnight. Because an active SSH session is not required, the design and deploy operations continue to run uninterrupted, even if the session is terminated.

Note: You cannot stop a running database if Database Designer is building or deploying a database design.

Database Designer runs as a background process. Multiple users can run Database Designer concurrently without interfering with each other or using up all the cluster resources. However, if multiple users are deploying a design on the same tables at the same time, Database Designer may not be able to complete the deployment. To avoid problems, consider the following:

- Schedule potentially conflicting Database Designer processes to run sequentially overnight so that there are no concurrency problems.
- Avoid scheduling Database Designer runs on the same set of tables at the same time.

There are two ways to deploy your design:

- Deploying Designs Using Database Designer
- Deploying Designs Manually

Deploying Designs Using Database Designer

HPE recommends that you run Database Designer and deploy optimized projections right after loading your tables with sample data, because Database Designer provides projections optimized for the current state of your database.

If you choose to allow Database Designer to automatically deploy your script during a comprehensive design and are running Administrative Tools, Database Designer creates a backup script of your database's current design. This script helps you
re-create the design of projections that may have been dropped by the new design. The backup script is located in the output directory you specified during the design process.

If you choose not to have Database Designer automatically run the deployment script (for example, if you want to maintain projections from a pre-existing deployment), you can manually run the deployment script later. See Deploying Designs Manually.

To deploy a design while running Database Designer, do one of the following:

- In Management Console, select the design and click Deploy Design.
- In the Administration Tools, select Deploy design in the Design Options window.

If you are running Database Designer programmatically, use DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY and set the deploy parameter to 'true'.

Once you have deployed your design, query the DEPLOY_STATUS system table to see the steps that the deployment took:

   vmartdb=> SELECT * FROM V_MONITOR.DEPLOY_STATUS;

Deploying Designs Manually

If you chose not to have Database Designer deploy your design at design time, you can deploy the design later using the deployment script:

1. Make sure that you have a database that contains the same tables and projections as the database on which you ran Database Designer. The database should also contain sample data.

2. To deploy the projections to a test or production environment, use the following vsql command to execute the deployment script, where <design_name> is the name of the database design:

   => \i <design_name>_deploy.sql

How to Create a Design

There are three ways to create a design using Database Designer:

- From Management Console, open a database and select the Design page at the bottom of the window.
  For details about using Management Console to create a design, see Using Management Console to Create a Design.

- Programmatically, using the techniques described in About Running Database Designer Programmatically in Analyzing Data. To run Database Designer programmatically, you must be a DBADMIN or have been granted the DBDUSER role and enabled that role.

- From the Administration Tools menu, by selecting Configuration Menu > Run Database Designer. You must be a DBADMIN user to run Database Designer from the Administration Tools.

  For details about using Administration Tools to create a design, see Using Administration Tools to Create a Design.

The following table shows which Database Designer capabilities are available in each tool:

Database Designer Capability                        | Management Console | Running Programmatically | Administrative Tools
----------------------------------------------------|--------------------|--------------------------|---------------------
Create design                                       | Yes                | Yes                      | Yes
Design name length (# of characters)                | 16                 | 32                       | 16
Build design (create design and deployment scripts) | Yes                | Yes                      | Yes
Create backup script                                |                    |                          | Yes
Set design type (comprehensive or incremental)      | Yes                | Yes                      | Yes
Set optimization objective                          | Yes                | Yes                      | Yes
Add design tables                                   | Yes                | Yes                      | Yes
Add design queries file                             | Yes                | Yes                      | Yes
Add single design query                             |                    | Yes                      |
Use query repository                                | Yes                | Yes                      |
Set K-safety                                        | Yes                | Yes                      | Yes
Analyze statistics                                  | Yes                | Yes                      | Yes
Require all unsegmented projections                 | Yes                | Yes                      |
View event history                                  | Yes                | Yes                      |
Set correlation analysis mode (Default = 0)         | Yes                |                          |

Using Management Console to Create a Design

To use Management Console to create an optimized design for your database, you must be a DBADMIN user or have been assigned the DBDUSER role.

Management Console provides two ways to create a design:

- Wizard—This option walks you through the process of configuring a new design. Click Back and Next to navigate through the Wizard steps, or Cancel to cancel creating a new design.

  To learn how to use the Wizard to create a design, see Using the Wizard to Create a Design.
- Manual—This option creates and saves a design with the default parameters.

  To learn how to create a design manually, see Creating a Design Manually.

Tip: If you have many design tables that you want Database Designer to consider, it might be easier to use the Wizard to create your design. In the Wizard, you can submit all the tables in a schema at once; creating a design manually requires that you submit the design tables one at a time.

Using the Wizard to Create a Design

Take these steps to create a design using the Management Console's Wizard:

1. Log in to Management Console, select and start your database, and click Design at the bottom of the window. The Database Designer window appears. If there are no existing designs, the New Design window appears.

   The left side of the Database Designer window lists the database designs you own, with the most recent design you worked on selected. That pane also lists the current status of the design. The main pane contains details about the selected design.
2. To create a new design, click New Design.

3. Enter a name for your design, and click Wizard. For more information, see Design Name.

4. Navigate through the Wizard using the Back and Next buttons.

5. To build the design immediately after exiting the Wizard, on the Execution Options window, select Auto-build.

   Important: Hewlett Packard Enterprise does not recommend that you auto-deploy the design from the Wizard. There may be a delay in adding the queries
   to the design, so if the design is deployed but the queries have not yet loaded, deployment may fail. If this happens, reset the design, check the Queries tab to make sure the queries have been loaded, and deploy the design.

6. When you have entered all the information, the Wizard displays a summary of your choices. Click Submit Design to build your design.

Creating a Design Manually

To create a design using Management Console and specify the configuration, take these steps:

1. Log in to Management Console, select and start your database, and click Design at the bottom of the window. The Database Designer window appears.
   The left side of the Database Designer window lists the database designs you own, with the most recent design you worked on highlighted. That pane also lists the current status of the design. The main pane contains details about the selected design.

2. To create a new design, click New Design.

3. Enter a name for your design and select Manual.
   After a few seconds, the main Database Design window opens, displaying the default design parameters. Vertica has created and saved a design with the name you specified and assigned it the default parameters. For more information, see Design Name.

4. On the General window, modify the design type, optimization objectives, K-safety, Analyze Correlations Mode, and the setting that allows Database Designer to create unsegmented projections.

   If you choose Incremental, the design automatically optimizes for the desired queries, and the K-safety defaults to the value of the cluster K-safety; you cannot change these values for an incremental design.

   Analyze Correlations Mode determines whether Database Designer analyzes and considers column correlations when creating the design. For more information, see DESIGNER_SET_ANALYZE_CORRELATIONS_MODE.

5. Click the Tables tab. You must submit tables to your design.

6. To add tables of sample data to your design, click Add Tables. A list of available tables appears; select the tables you want and click Save. If you want to remove tables from your design, click the tables you want to remove, and click Remove Selected.

   If a design table has been dropped from the database, a red circle with a white exclamation point appears next to the table name. Before you can build or deploy the design, you must remove any dropped tables from the design. To do this, select the dropped tables and click Remove Selected. You cannot build or deploy a design if any of the design tables have been dropped.

7. Click the Queries tab. To add queries to your design, do one of the following:

   - To add queries from the QUERY_REQUESTS system table, click Query Repository, select the desired queries, and click Save. All valid queries that you selected appear in the Queries window.
   - To add queries from a file, select Choose File. All valid queries in the file that you select are added to the design and appear in the Queries window.

   Database Designer checks the validity of the queries when you add them to the design and again when you build the design. If it finds invalid queries, it ignores them. If you have a large number of queries, it may take time to load them. Make sure that all the queries you want Database Designer to consider when creating the design are listed in the Queries window.

8. Once you have specified all the parameters for your design, build the design. To do this, select your design and click Build Design.

9. Select Analyze Statistics if you want Database Designer to analyze the statistics before building the design. For more information, see Statistics Analysis.

10. If you do not need to review the design before deploying it, select Deploy Immediately. Otherwise, leave that option unselected.

11. Click Start. In the left-hand pane, the status of your design displays as Building until it is complete.

12. To follow the progress of a build, click Event History. Status messages appear in this window, and you can see the current phase of the build operation. The information in the Event History tab comes from the OUTPUT_EVENT_HISTORY system table.

13. When the build completes, the left-hand pane displays Built. To view the deployment script, select your design and click Output.

14. After you deploy the design using Management Console, the deployment script is deleted. To keep a permanent copy of the deployment script, copy and paste the SQL commands from the Output window to a file.
15. Once you have reviewed your design and are ready to deploy it, select the design and click Deploy Design.

16. To follow the progress of the deployment, click Event History. Status messages appear in this window, and you can see the current phase of the deployment operation.

    While the design is running, you can do one of the following in the Event History window:

    - Click the blue button next to the design name to refresh the event history listing.
    - Click Cancel Design Run to cancel the design in progress.
    - Click Force Delete Design to cancel and delete the design in progress.

17. When the deployment completes, the left-hand pane displays Deployment Completed. To view the deployment script, select your design and click Output.

Your database is now optimized according to the parameters you set.

Using Administration Tools to Create a Design

To use the Administration Tools interface to create an optimized design for your database, you must be a DBADMIN user. Follow these steps:

1. Log in as the dbadmin user and start Administration Tools.

2. From the main menu, start the database for which you want to create a design. The database must be running before you can create a design for it.

3. On the main menu, select Configuration Menu and click OK.

4. On the Configuration Menu, select Run Database Designer and click OK.

5. On the Select a database to design window, enter the name of the database for which you are creating a design and click OK.

6. On the Enter the directory for Database Designer output window, enter the full path to the directory to contain the design script, deployment script, backup script,
   and log files, and click OK. For information about the scripts, see Building a Design.

7. On the Database Designer window, enter a name for the design and click OK. For more information about design names, see Design Name.

8. On the Design Type window, choose which type of design to create and click OK. For a description of the design types, see Design Types.

9. The Select schema(s) to add to query search path window lists all the schemas in the database that you selected. Select the schemas that contain representative data that you want Database Designer to consider when creating the design and click OK. For more information about choosing schemas and tables to submit to Database Designer, see Design Tables with Sample Data.

10. On the Optimization Objectives window, select the objectives you want for the database optimization:

    - Optimize with queries. For more information, see Design Queries.
    - Update statistics. For more information, see Statistics Analysis.
    - Deploy design. For more information, see Deploying a Design.

    For details about these objectives, see Optimization Objectives.

11. The final window summarizes the choices you have made and offers you two choices:

    - Proceed with building the design, and deploying it if you specified to deploy it immediately. If you did not specify to deploy, you can review the design and
      deployment scripts and deploy them manually, as described in Deploying Designs Manually.

    - Cancel the design and go back to change some of the parameters as needed.

12. Creating a design can take a long time. To cancel a running design from the Administration Tools window, enter Ctrl+C.

To create a design for the VMart example database, see Using Database Designer to Create a Comprehensive Design in Getting Started.
About Running Database Designer Programmatically

If you have been granted the DBDUSER role and have enabled it, you can access Database Designer functionality programmatically. Using the DESIGNER_* functions, you can perform the following Database Designer tasks:

- Create a comprehensive or incremental design.
- Add tables and queries to the design.
- Set the optimization objective to prioritize query performance or storage footprint.
- Weight queries.
- Set K-safety for a design.
- Analyze statistics on the design tables.
- Create the script with DDL statements that create design projections.
- Deploy the database design.
- Specify that all projections in the design be segmented.
- Populate the design.
- Cancel a running design.
- Wait for a running design to complete.
- Deploy a design automatically.
- Drop database objects from one or more completed or terminated designs.

For information about each function, see Database Designer Functions in the SQL Reference Manual.

DBDUSER Resource Pool

When you grant the DBDUSER role, you must associate a resource pool with that user to manage resources during Database Designer runs. Multiple users can run Database Designer concurrently without interfering with each other or using up all cluster resources. When a user runs Database Designer, with Administration Tools or
programmatically, execution is mostly contained by the user's resource pool, but might spill over into some system resource pools for less-intensive tasks.

When to Run Database Designer Programmatically

Run Database Designer programmatically when you want to:

- Optimize performance on tables you own.
- Create or update a design without the involvement of the superuser.
- Add individual queries and tables, or add data to your design, and then rerun Database Designer to update the design based on this new information.
- Customize the design.
- Use recently executed queries to set up your database to run Database Designer automatically on a regular basis.
- Assign each design query a query weight that indicates the importance of that query in creating the design. Assign a higher weight to queries that you run frequently so that Database Designer prioritizes those queries in creating the design.

Categories of Database Designer Functions

You can run the following Database Designer functions in vsql.

Setup Functions

This function directs Database Designer to create a new design:

- DESIGNER_CREATE_DESIGN

Configuration Functions

The following functions allow you to specify properties of a particular design:

- DESIGNER_DESIGN_PROJECTION_ENCODINGS
- DESIGNER_SET_DESIGN_KSAFETY
- DESIGNER_SET_OPTIMIZATION_OBJECTIVE
- DESIGNER_SET_DESIGN_TYPE
- DESIGNER_SET_PROPOSED_UNSEGMENTED_PROJECTIONS
- DESIGNER_SET_ANALYZE_CORRELATIONS_MODE

Input Functions

The following functions allow you to add tables and queries to your Database Designer design:

- DESIGNER_ADD_DESIGN_QUERIES
- DESIGNER_ADD_DESIGN_QUERIES_FROM_RESULTS
- DESIGNER_ADD_DESIGN_QUERY
- DESIGNER_ADD_DESIGN_TABLES

Invocation Functions

These functions populate the Database Designer workspace and create design and deployment scripts. You can also analyze statistics, deploy the design automatically, and drop the workspace after the deployment:

- DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY
- DESIGNER_WAIT_FOR_DESIGN

Output Functions

The following functions display information about projections and scripts that Database Designer created:

- DESIGNER_OUTPUT_ALL_DESIGN_PROJECTIONS
- DESIGNER_OUTPUT_DEPLOYMENT_SCRIPT

Cleanup Functions

The following functions cancel any running Database Designer operation or drop a Database Designer design and all its contents:

- DESIGNER_CANCEL_POPULATE_DESIGN
- DESIGNER_DROP_DESIGN
- DESIGNER_DROP_ALL_DESIGNS
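For example, a minimal sketch of the cleanup functions in vsql, assuming a design named my_design exists:

   => SELECT DESIGNER_CANCEL_POPULATE_DESIGN('my_design'); -- cancel a population run in progress
   => SELECT DESIGNER_DROP_DESIGN('my_design');            -- drop one design and all its contents
   => SELECT DESIGNER_DROP_ALL_DESIGNS();                  -- drop all designs owned by the current user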
Privileges for Running Database Designer Functions

If they have been granted the DBDUSER role, non-DBADMIN users can run Database Designer using the functions described in Categories of Database Designer Functions. Non-DBADMIN users cannot run Database Designer using Administration Tools, even if they have been assigned the DBDUSER role.

To grant the DBDUSER role:

1. The DBADMIN user must grant the DBDUSER role:

   => GRANT DBDUSER TO <username>;

   This role persists until the DBADMIN revokes it.

   Important: When you grant the DBDUSER role, make sure to associate a resource pool with that user to manage resources during Database Designer runs. Multiple users can run Database Designer concurrently without interfering with each other or using up all the cluster resources. When a user runs Database Designer, either using the Administration Tools or programmatically, its execution is mostly contained by the user's resource pool, but may spill over into some system resource pools for less-intensive tasks.

2. For a user to run the Database Designer functions, one of the following must happen first:

   - The user must enable the DBDUSER role:

     => SET ROLE DBDUSER;

   - The superuser must add DBDUSER as the user's default role:

     => ALTER USER <username> DEFAULT ROLE DBDUSER;

DBDUSER Capabilities and Limitations

The DBDUSER role has the following capabilities and limitations:

- A DBDUSER can change K-safety for their own designs, but cannot change the system K-safety value. The DBDUSER can set the design K-safety to a value less than or equal to the system K-safety value, but is limited to a value of 0, 1, or 2.
- A DBDUSER cannot explicitly change the ancient history mark (AHM), even during deployment of their design.

DBDUSER Privileges

When you create a design, you automatically have privileges to manipulate it. Other tasks may require that the DBDUSER have additional privileges:

To...                                               | DBDUSER must have...
----------------------------------------------------|--------------------------------------------------------------------------
Add tables to a design                              | USAGE privilege on the design table schema; OWNER privilege on the design table
Add a single design query to the design             | Privilege to execute the design query
Add a query file to the design                      | Read privilege on the storage location that contains the query file; privilege to execute all the queries in the file
Add queries from the result of a user query         | Privilege to execute the user query; privilege to execute each design query retrieved from the results of the user query
Create the design and deployment scripts            | WRITE privilege on the storage location of the design script; WRITE privilege on the storage location of the deployment script

Workflow for Running Database Designer Programmatically

The following example shows the steps you take to create a design by running Database Designer programmatically.

Note: Be sure to back up the existing design using the EXPORT_CATALOG function before running the Database Designer functions on an existing schema. You must explicitly back up the current design when using Database Designer to create a new comprehensive design.
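As the Important note and the Note above suggest, a DBADMIN typically provisions a resource pool for the DBDUSER and backs up the current catalog before a redesign. A hedged sketch; the pool name, memory size, user name, and output path are illustrative assumptions:

   => CREATE RESOURCE POOL design_pool MEMORYSIZE '1G'; -- pool to contain Database Designer work (size is an assumption)
   => ALTER USER dbd_user RESOURCE POOL design_pool;    -- dbd_user is a hypothetical account holding the DBDUSER role
   => SELECT EXPORT_CATALOG('/tmp/catalog_backup.sql', 'DESIGN'); -- save existing DDL, including projections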
Before you run this example, you should have the DBDUSER role and have enabled that role using the SET ROLE DBDUSER command:

1. Create a table in the public schema:

   => CREATE TABLE T(
      x INT,
      y INT,
      z INT,
      u INT,
      v INT,
      w INT PRIMARY KEY
      );

2. Add data to the table:

   \! perl -e 'for ($i=0; $i<100000; ++$i) {printf("%d, %d, %d, %d, %d, %d\n", $i/10000, $i/100, $i/10, $i/2, $i, $i);}' | vsql -c "COPY T FROM STDIN DELIMITER ',' DIRECT;"

3. Create a second table in the public schema:

   => CREATE TABLE T2(
      x INT,
      y INT,
      z INT,
      u INT,
      v INT,
      w INT PRIMARY KEY
      );

4. Copy the data from table T to table T2 and commit the changes:

   => INSERT /*+DIRECT*/ INTO T2 SELECT * FROM T;
   => COMMIT;

5. Create a new design:

   => SELECT DESIGNER_CREATE_DESIGN('my_design');

   This command adds information to the DESIGNS system table in the V_MONITOR schema.

6. Add tables from the public schema to the design:
   => SELECT DESIGNER_ADD_DESIGN_TABLES('my_design', 'public.t');
   => SELECT DESIGNER_ADD_DESIGN_TABLES('my_design', 'public.t2');

   These commands add information to the DESIGN_TABLES system table.

7. Create a file named queries.txt in /tmp/examples, or another directory where you have READ and WRITE privileges. Add the following two queries to that file and save it. Database Designer uses these queries to create the design:

   SELECT DISTINCT T2.u
   FROM T JOIN T2 ON T.z=T2.z-1
   WHERE T2.u > 0;
   SELECT DISTINCT w FROM T;

8. Add the queries file to the design and display the results—the numbers of accepted queries, non-design queries, and unoptimizable queries:

   => SELECT DESIGNER_ADD_DESIGN_QUERIES('my_design', '/tmp/examples/queries.txt', 'true');

   The results show that both queries were accepted:

   Number of accepted queries                      =2
   Number of queries referencing non-design tables =0
   Number of unsupported queries                   =0
   Number of illegal queries                       =0

   The DESIGNER_ADD_DESIGN_QUERIES function populates the DESIGN_QUERIES system table.

9. Set the design type to comprehensive. (This is the default.) A comprehensive design creates an initial or replacement design for all the design tables:

   => SELECT DESIGNER_SET_DESIGN_TYPE('my_design', 'comprehensive');

10. Set the optimization objective to query. This setting creates a design that focuses on faster query performance, which might recommend additional projections. These projections could result in a larger database storage footprint:

    => SELECT DESIGNER_SET_OPTIMIZATION_OBJECTIVE('my_design', 'query');
11. Create the design and save the design and deployment scripts in /tmp/examples, or another directory where you have READ and WRITE privileges. The following command:

    - Analyzes statistics.
    - Doesn't deploy the design.
    - Doesn't drop the design after deployment.
    - Stops if it encounters an error.

    => SELECT DESIGNER_RUN_POPULATE_DESIGN_AND_DEPLOY(
       'my_design',
       '/tmp/examples/my_design_projections.sql',
       '/tmp/examples/my_design_deploy.sql',
       'True',
       'False',
       'False',
       'False'
       );

    This command adds information to the following system tables:

    - DEPLOYMENT_PROJECTION_STATEMENTS
    - DEPLOYMENT_PROJECTIONS
    - OUTPUT_DEPLOYMENT_STATUS

12. Examine the status of the Database Designer run to see what projections Database Designer recommends. In the deployment_projection_name column:

    - rep indicates a replicated projection
    - super indicates a superprojection

    The deployment_status column is pending because the design has not yet been deployed. For this example, Database Designer recommends four projections:

    => \x
    Expanded display is on.

    => SELECT * FROM OUTPUT_DEPLOYMENT_STATUS;
    -[ RECORD 1 ]--------------+-----------------------------
    deployment_id              | 45035996273795970
    deployment_projection_id   | 1
    deployment_projection_name | T_DBD_1_rep_my_design
    deployment_status          | pending
    error_message              | N/A
    -[ RECORD 2 ]--------------+-----------------------------
    deployment_id              | 45035996273795970
    deployment_projection_id   | 2
    deployment_projection_name | T2_DBD_2_rep_my_design
    deployment_status          | pending
    error_message              | N/A
    -[ RECORD 3 ]--------------+-----------------------------
    deployment_id              | 45035996273795970
    deployment_projection_id   | 3
    deployment_projection_name | T_super
    deployment_status          | pending
    error_message              | N/A
    -[ RECORD 4 ]--------------+-----------------------------
    deployment_id              | 45035996273795970
    deployment_projection_id   | 4
    deployment_projection_name | T2_super
    deployment_status          | pending
    error_message              | N/A

13. View the script /tmp/examples/my_design_deploy.sql to see how these projections are created when you run the deployment script. In this example, the script also assigns the encoding schemes RLE and COMMONDELTA_COMP to columns where appropriate.

14. Deploy the design from the directory where you saved it:

    => \i /tmp/examples/my_design_deploy.sql

15. Now that the design is deployed, delete the design:

    => SELECT DESIGNER_DROP_DESIGN('my_design');
Creating Custom Designs

HPE strongly recommends that you use the physical schema design produced by Database Designer, which provides K-safety, excellent query performance, and efficient use of storage space.

If you find that any of your queries are not running as efficiently as you would like, you can use the Database Designer incremental design process to optimize the database design for the query.

If the projections created by Database Designer still do not meet your needs, you can write custom projections, from scratch or based on projection designs created by Database Designer.

If you are unfamiliar with writing custom projections, start by modifying an existing design generated by Database Designer.

Custom Design Process

To create a custom design or customize an existing one:

1. Plan the new design or modifications to an existing one. See Planning Your Design.
2. Create or modify projections. See Design Fundamentals and CREATE PROJECTION for more detail.
3. Deploy projections to a test environment. See Writing and Deploying Custom Projections.
4. Test and modify projections as needed.
5. After you finalize the design, deploy projections to the production environment.
Planning Your Design

The syntax for creating a design is easy for anyone who is familiar with SQL. As with any successful project, however, a successful design requires some initial planning. Before you create your first design:

- Become familiar with standard design requirements and plan your design to include them. See Design Requirements.
- Determine how many projections you need to include in the design. See Determining the Number of Projections to Use.
- Determine the type of compression and encoding to use for columns. See Data Encoding and Compression.
- Determine whether or not you want the database to be K-safe. Vertica recommends that all production databases have a minimum K-safety of one (K=1). Valid K-safety values are 0, 1, and 2. See Designing for K-Safety.

Design Requirements

A physical schema design is a script that contains CREATE PROJECTION statements. These statements determine which columns are included in projections and how they are optimized.

If you use Database Designer as a starting point, it automatically creates designs that meet all fundamental design requirements. If you intend to create or modify designs manually, be aware that all designs must meet the following requirements:

- Every design must create at least one superprojection for every table in the database that is used by the client application. These projections provide complete coverage that enables users to perform ad-hoc queries as needed. They can contain joins, and they are usually configured to maximize performance through sort order, compression, and encoding.
- Query-specific projections are optional. If you are satisfied with the performance provided through superprojections, you do not need to create additional projections. However, you can maximize performance by tuning for specific query workloads.
- HPE recommends that all production databases have a minimum K-safety of one (K=1) to support high availability and recovery. (K-safety can be set to 0, 1, or 2.) See High Availability With Projections in Vertica Concepts and Designing for K-Safety.
- If you have more than 20 nodes but small tables, Vertica recommends that you do not create replicated projections for them. If you create replicated projections, the catalog becomes very large and performance may degrade. Instead, consider segmenting those projections.

Determining the Number of Projections to Use

In many cases, a design that consists of a set of superprojections (and their buddies) provides satisfactory performance through compression and encoding. This is especially true if the sort orders for the projections have been used to maximize performance for one or more query predicates (WHERE clauses).

However, you might want to add additional query-specific projections to increase the performance of queries that run slowly, are used frequently, or are run as part of business-critical reporting. The number of additional projections (and their buddies) that you create should be determined by:

- Your organization's needs
- The amount of disk space you have available on each node in the cluster
- The amount of time available for loading data into the database

As the number of projections that are tuned for specific queries increases, the performance of these queries improves. However, the amount of disk space used and the amount of time required to load data increase as well. Therefore, you should create and test designs to determine the optimum number of projections for your database configuration. On average, organizations that choose to implement query-specific projections achieve optimal performance through the addition of a few query-specific projections.

Designing for K-Safety

HPE recommends that all production databases have a minimum K-safety of one (K=1). Valid K-safety values for production databases are 1 and 2. Non-production databases do not have to be K-safe and can be set to 0. A K-safe database must have at least three nodes, as shown in the following table:
K-safety level | Number of required nodes
---------------|-------------------------
1              | 3+
2              | 5+

Note: Vertica only supports K-safety levels 1 and 2.

You can set K-safety to 1 or 2 only when the physical schema design meets certain redundancy requirements. See Requirements for a K-Safe Physical Schema Design.

Using Database Designer

To create designs that are K-safe, HPE recommends that you use Database Designer. When creating projections with Database Designer, projection definitions that meet K-safe design requirements are recommended and marked with a K-safety level. Database Designer creates a script that uses the MARK_DESIGN_KSAFE function to set the K-safety of the physical schema to 1. For example:

=> \i VMart_Schema_design_opt_1.sql
CREATE PROJECTION
CREATE PROJECTION
 mark_design_ksafe
----------------------
 Marked design 1-safe
(1 row)

By default, Vertica creates K-safe superprojections when database K-safety is greater than 0.

Monitoring K-Safety

Monitoring tables can be accessed programmatically to enable external actions, such as alerts. You monitor the K-safety level by polling the SYSTEM table and checking the value of its fault-tolerance column. See SYSTEM in the SQL Reference Manual.

Loss of K-Safety

When K nodes in your cluster fail, your database continues to run, although performance is affected. Further node failures could potentially cause the database to shut down if the failed nodes' data is not available from another functioning node in the cluster.
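To act on the Monitoring K-Safety guidance above, an external script could poll the SYSTEM table and alert when fault tolerance degrades. A minimal sketch; the fault-tolerance column names below are assumptions based on common Vertica releases, so confirm them against SYSTEM in the SQL Reference Manual for your version:

   => SELECT designed_fault_tolerance, current_fault_tolerance
      FROM v_monitor.system;

An alerting job could run this query periodically and raise an alert when current_fault_tolerance drops below the designed value.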
See Also

- K-Safety in Vertica Concepts provides a high-level view of K-safety.
- High Availability and Recovery and High Availability With Projections in Vertica Concepts describe how Vertica implements high availability and recovery through replication and segmentation.
Requirements for a K-Safe Physical Schema Design

Database Designer automatically generates designs with a K-safety of 1 for clusters that contain at least three nodes. (If your cluster has one or two nodes, it generates designs with a K-safety of 0.) You can modify a design created for a three-node (or greater) cluster, and the K-safe requirements are already set.

If you create custom projections, your physical schema design must meet the following requirements to be able to successfully recover the database in the event of a failure:

- Segmented projections must be segmented across all nodes. Refer to Designing for Segmentation and Designing Segmented Projections for K-Safety.
- Replicated projections must be replicated on all nodes. See Designing Unsegmented Projections for K-Safety.
- Segmented projections must have K buddy projections (projections that have identical columns and segmentation criteria, except that corresponding segments are placed on different nodes).

You can use the MARK_DESIGN_KSAFE function to find out whether your schema design meets the requirements for K-safety.
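For example, to check a custom design against the K-safety requirements above (a minimal sketch; the function confirms the level when the schema qualifies and raises an error when it does not):

   => SELECT MARK_DESIGN_KSAFE(1);
    mark_design_ksafe
   ----------------------
    Marked design 1-safe
   (1 row)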
Requirements for a Physical Schema Design with No K-Safety

If you use Database Designer to generate a comprehensive design that you can modify, and you do not want the design to be K-safe, set the K-safety level to 0 (zero).

If you want to start from scratch, do the following to establish minimal projection requirements for a functioning database with no K-safety (K=0):

1. Define at least one superprojection for each table in the logical schema.
2. Replicate (define an exact copy of) each dimension table superprojection on each node.
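A minimal sketch of these two steps, assuming a hypothetical fact table sales (with key column sale_id) and dimension table dates:

   => CREATE PROJECTION sales_super AS SELECT * FROM sales
      SEGMENTED BY HASH(sale_id) ALL NODES KSAFE 0;  -- step 1: superprojection, no buddies (K=0)
   => CREATE PROJECTION dates_super AS SELECT * FROM dates
      UNSEGMENTED ALL NODES;                         -- step 2: dimension superprojection replicated on each node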
Designing Segmented Projections for K-Safety

Projections must comply with database K-safety requirements. In general, you must create buddy projections for each segmented projection, so that the total number of copies of each projection is K+1. Thus, if system K-safety is set to 1, each projection segment must be duplicated by one buddy; if K-safety is set to 2, each segment must be duplicated by two buddies.

Automatic Creation of Buddy Projections

You can use CREATE PROJECTION so that it automatically creates the number of buddy projections required to satisfy K-safety, by including SEGMENTED BY ... ALL NODES. If CREATE PROJECTION specifies K-safety (KSAFE=n), Vertica uses that setting; if the statement omits KSAFE, Vertica uses system K-safety.

In the following example, CREATE PROJECTION creates segmented projection ttt_p1 for table ttt. Because system K-safety is set to 1, Vertica requires a buddy projection for each segmented projection. The CREATE PROJECTION statement omits KSAFE, so Vertica uses system K-safety and creates two buddy projections: ttt_p1_b0 and ttt_p1_b1:

=> SELECT mark_design_ksafe(1);
 mark_design_ksafe
----------------------
 Marked design 1-safe
(1 row)

=> CREATE TABLE ttt (a int, b int);
WARNING 6978: Table "ttt" will include privileges from schema "public"
CREATE TABLE

=> CREATE PROJECTION ttt_p1 AS SELECT * FROM ttt SEGMENTED BY HASH(a) ALL NODES;
CREATE PROJECTION

=> SELECT projection_name FROM projections WHERE anchor_table_name='ttt';
 projection_name
-----------------
 ttt_p1_b0
 ttt_p1_b1
(2 rows)

Vertica automatically names buddy projections by appending the suffix _bn to the projection base name—for example, ttt_p1_b0.

Manual Creation of Buddy Projections

If you create a projection on a single node and system K-safety is greater than 0, you must manually create the number of buddies required for K-safety. For example, you can create projection xxx_p1 for table xxx on a single node, as follows:
=> CREATE TABLE xxx (a int, b int);
WARNING 6978: Table "xxx" will include privileges from schema "public"
CREATE TABLE

=> CREATE PROJECTION xxx_p1 AS SELECT * FROM xxx SEGMENTED BY HASH(a) NODES v_vmart_node0001;
CREATE PROJECTION

Because K-safety is set to 1, a single instance of this projection is not K-safe. Attempts to insert data into its anchor table xxx return an error like this:

=> INSERT INTO xxx VALUES (1, 2);
ERROR 3586: Insufficient projections to answer query
DETAIL: No projections that satisfy K-safety found for table xxx
HINT: Define buddy projections for table xxx

In order to comply with K-safety, you must create a buddy projection for projection xxx_p1. For example:

=> CREATE PROJECTION xxx_p1_buddy AS SELECT * FROM xxx SEGMENTED BY HASH(a) NODES v_vmart_node0002;
CREATE PROJECTION

Table xxx now complies with K-safety and accepts DML statements such as INSERT:

VMart=> INSERT INTO xxx VALUES (1, 2);
 OUTPUT
--------
      1
(1 row)

See Also

For general information about segmented projections and buddies, see Projection Segmentation in Vertica Concepts. For information about designing for K-safety, see Designing for K-Safety and Designing for Segmentation.
Designing Unsegmented Projections for K-Safety

In many cases, dimension tables are relatively small, so you do not need to segment them. Accordingly, you should design a K-safe database so that projections for its dimension tables are replicated without segmentation on all cluster nodes. You create these projections with a CREATE PROJECTION statement that includes the keywords UNSEGMENTED ALL NODES. These keywords specify to create identical instances of the projection on all cluster nodes.

The following example shows how to create an unsegmented projection for the table store.store_dimension:

VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
        AS SELECT store_key, store_name, store_city, store_state
        FROM store.store_dimension
        UNSEGMENTED ALL NODES;
CREATE PROJECTION

Vertica uses the same name to identify all instances of the unsegmented projection—in this example, store.store_dimension_proj. For more information about projection name conventions, see Projection Naming.

Designing for Segmentation

You segment projections using hash segmentation. Hash segmentation allows you to segment a projection based on a built-in hash function that provides even distribution of data across multiple nodes, resulting in optimal query execution. In a projection, the data to be hashed consists of one or more column values, each having a large number of unique values and an acceptable amount of skew in the value distribution.

Note: For detailed information about using hash segmentation in a projection, see CREATE PROJECTION in the SQL Reference Manual.

When segmenting projections, choose one or more columns that have a large number of unique data values and acceptable skew in their data distribution. Primary key columns that meet these criteria are an excellent choice for hash segmentation. The columns must be unique across all the tables being used in a query.
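For example, a hedged sketch of hash segmentation on a primary key column; the table orders and column order_id are illustrative assumptions:

   => CREATE PROJECTION orders_p1 AS SELECT * FROM orders
      SEGMENTED BY HASH(order_id) ALL NODES KSAFE 1;  -- a high-cardinality key spreads rows evenly across nodes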
Design Fundamentals

Although you can write custom projections from scratch, Vertica recommends that you use Database Designer to create a design to use as a starting point. This ensures that you have projections that meet basic requirements.

Writing and Deploying Custom Projections

Before you write custom projections, review the topics in Planning Your Design carefully. Failure to follow these considerations can result in non-functional projections.

To manually modify or create a projection:

1. Write a script to create the projection, using the CREATE PROJECTION statement.

2. Use the \i meta-command in vsql to run the script.

   Note: You must have a database loaded with a logical schema.

3. For a K-safe database, call SELECT get_projections('table_name') to verify that the projections were properly created. Good projections are noted as being "safe." This means that the projection has enough buddies to be K-safe.

4. If you added the new projection to a database that already has projections that contain data, you need to update the newly created projection to work with the existing projections. By default, the new projection is out-of-date (not available for query processing) until you refresh it.

5. Use the MAKE_AHM_NOW function to set the Ancient History Mark (AHM) to the greatest allowable epoch (now).

6. Use DROP PROJECTION to drop any previous projections that are no longer needed. These projections can waste disk space and reduce load speed if they remain in the database.

7. Run the ANALYZE_STATISTICS function on all projections in the database. This function collects and aggregates data samples and storage information from all nodes on which a projection is stored, and then writes statistics into the catalog. For example:
   => SELECT ANALYZE_STATISTICS('');

Designing Superprojections

Superprojections have the following requirements:

- They must contain every column within the table.
- For a K-safe design, superprojections must either be replicated on all nodes within the database cluster (for dimension tables) or paired with buddies and segmented across all nodes (for very large and medium-large tables). See Physical Schema and High Availability With Projections in Vertica Concepts for an overview of projections and how they are stored. See Designing for K-Safety for design specifics.

To provide maximum usability, superprojections need to minimize storage requirements while maximizing query performance. To achieve this, the sort order for columns in superprojections is based on storage requirements and commonly used queries.

Sort Order Benefits

Column sort order is an important factor in minimizing storage requirements and maximizing query performance.

Minimize Storage Requirements

Minimizing storage saves on physical resources and increases performance by reducing disk I/O. You can minimize projection storage by prioritizing low-cardinality columns in its sort order. This reduces the number of rows Vertica stores and accesses to retrieve query results.

After identifying projection sort columns, analyze their data and choose the most effective encoding method. The Vertica optimizer gives preference to columns with run-length encoding (RLE), so be sure to use it whenever appropriate. Run-length encoding replaces sequences (runs) of identical values with a single pair that contains the value and the number of occurrences. Therefore, it is especially appropriate for low-cardinality columns whose run length is large.

Maximize Query Performance

You can facilitate query performance through column sort order as follows:
- Where possible, sort order should prioritize columns with the lowest cardinality.
- Do not sort projections on columns of type LONG VARBINARY and LONG VARCHAR.

For more information, see Choosing Sort Order: Best Practices for examples that address storage and query requirements.

Choosing Sort Order: Best Practices

When choosing sort orders for your projections, Vertica has several recommendations that can help you achieve maximum query performance, as illustrated in the following examples.

Combine RLE and Sort Order

When dealing with predicates on low-cardinality columns, use a combination of RLE and sorting to minimize storage requirements and maximize query performance.

Suppose you have a students table containing the following values and encoding types:

Column    | # of Distinct Values                       | Encoded With
----------|--------------------------------------------|-------------
gender    | 2 (M or F)                                 | RLE
pass_fail | 2 (P or F)                                 | RLE
class     | 4 (freshman, sophomore, junior, or senior) | RLE
name      | 10000 (too many to list)                   | Auto

You might have queries similar to this one:

SELECT name FROM students
WHERE gender = 'M' AND pass_fail = 'P' AND class = 'senior';

The fastest way to access the data is to work through the low-cardinality columns with the smallest number of distinct values before the high-cardinality columns. The following sort order minimizes storage and maximizes query performance for queries that have equality restrictions on gender, class, pass_fail, and name. Specify the ORDER BY clause of the projection as follows:

ORDER BY students.gender, students.pass_fail, students.class, students.name
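Pulling the table and sort order together, a hedged sketch of the corresponding projection DDL (the students table definition and its column types are assumptions); the discussion that follows refers to the ORDER BY above:

   => CREATE PROJECTION students_p1
      (
        gender    ENCODING RLE,   -- 2 distinct values: long runs compress well
        pass_fail ENCODING RLE,
        class     ENCODING RLE,
        name      ENCODING AUTO   -- high cardinality: let Vertica choose
      )
      AS SELECT gender, pass_fail, class, name FROM students
      ORDER BY gender, pass_fail, class, name
      SEGMENTED BY HASH(name) ALL NODES;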
In this example, the gender column is represented by two RLE entries, the pass_fail column is represented by four entries, and the class column is represented by 16 entries, regardless of the cardinality of the students table. Vertica efficiently finds the set of rows that satisfy all the predicates, resulting in a huge reduction of search effort for RLE-encoded columns that occur early in the sort order. Consequently, if you use low-cardinality columns in local predicates, as in the previous example, put those columns early in the projection sort order, in increasing order of distinct cardinality (that is, in increasing order of the number of distinct values in each column).

If you sort this table with students.class first, you improve the performance of queries that restrict only on the students.class column, and you improve the compression of the students.class column (which, of the three RLE-encoded columns, has the most distinct values), but the other columns do not compress as well. Determining which projection is better depends on the specific queries in your workload and their relative importance.

Storage savings with compression decrease as the cardinality of the column increases; however, storage savings with compression increase as the number of bytes required to store values in that column increases.

Maximize the Advantages of RLE

To maximize the advantages of RLE encoding, use it only when the average run length of a column is greater than 10 when sorted. For example, suppose you have a table with the following columns, sorted in order of cardinality from low to high:

address.country, address.region, address.state, address.city, address.zipcode

The zipcode column might not have 10 sorted entries in a row with the same zip code, so there is probably no advantage to run-length encoding that column, and it could
make compression worse. But runs in the country column are likely to exceed 10 rows, so applying RLE to the country column can improve performance.

Put Lower-Cardinality Columns First for Functional Dependencies

In general, put columns that you use for local predicates (as in the previous example) earlier in the sort order to make predicate evaluation more efficient. In addition, if a lower-cardinality column is uniquely determined by a higher-cardinality column (like city_id uniquely determining a state_id), it is always better to put the lower-cardinality, functionally determined column earlier in the sort order than the higher-cardinality column.

For example, in the following sort order, the Area_Code column is sorted before the Number column in the customer_info table:

ORDER BY customer_info.Area_Code, customer_info.Number, customer_info.Address

Because Area_Code appears first in the sort order, the following query scans only the values in the Number column that fall within area code 978:

=> SELECT Address FROM customer_info
   WHERE Area_Code='978' AND Number='9780123457';

Sort for Merge Joins

When processing a join, the Vertica optimizer chooses from two algorithms:

- Merge join—If both inputs are pre-sorted on the join column, the optimizer chooses a merge join, which is faster and uses less memory.
- Hash join—Using the hash join algorithm, Vertica uses the smaller (inner) joined table to build an in-memory hash table on the join column. A hash join has no sort
  requirement, but it consumes more memory because Vertica builds a hash table with the values in the inner table. The optimizer chooses a hash join when projections are not sorted on the join columns.

If both inputs are pre-sorted, merge joins do not have to do any pre-processing, making the join perform faster. Vertica uses the term sort-merge join to refer to the case when at least one of the inputs must be sorted prior to the merge join. Vertica sorts the inner input side, but only if the outer input side is already sorted on the join columns.

To give the Vertica query optimizer the option to use an efficient merge join for a particular join, create projections on both sides of the join that put the join column first in their respective projections. This is primarily important to do if both tables are so large that neither table fits into memory. If all tables that a table will be joined to can be expected to fit into memory simultaneously, the benefits of merge join over hash join are sufficiently small that it probably isn't worth creating a projection for any one join column.

Sort on Columns in Important Queries

If you have an important query, one that you run on a regular basis, you can save time by putting the columns specified in the WHERE clause or the GROUP BY clause of that query early in the sort order. If that query uses a high-cardinality column such as Social Security number, you may sacrifice storage by placing this column early in the sort order of a projection, but your most important query will be optimized.

Sort Columns of Equal Cardinality by Size

If you have two columns of equal cardinality, put the column that is larger first in the sort order. For example, a CHAR(20) column takes up 20 bytes, but an INTEGER column takes up 8 bytes. By putting the CHAR(20) column ahead of the INTEGER column, your projection compresses better.

Sort Foreign Key Columns First, from Low to High Distinct Cardinality

Suppose you have a fact table where the first four columns in the sort order make up a foreign key to another table. For best compression, choose a sort order for the fact table such that the foreign keys appear first, and in increasing order of distinct cardinality. Other factors also apply to the design of projections for fact tables, such as partitioning by a time dimension, if any.
In the following example, the table inventory stores inventory data, and product_key and warehouse_key are foreign keys to the product_dimension and warehouse_dimension tables:

=> CREATE TABLE inventory (
   date_key INTEGER NOT NULL,
   product_key INTEGER NOT NULL,
   warehouse_key INTEGER NOT NULL,
   ...
   );
=> ALTER TABLE inventory
   ADD CONSTRAINT fk_inventory_warehouse
   FOREIGN KEY(warehouse_key)
   REFERENCES warehouse_dimension(warehouse_key);
   ALTER TABLE inventory
   ADD CONSTRAINT fk_inventory_product
   FOREIGN KEY(product_key)
   REFERENCES product_dimension(product_key);

The inventory table should be sorted by warehouse_key and then product_key, since the cardinality of the warehouse_key column is probably lower than the cardinality of the product_key column.

Projection Design for Merge Operations

The Vertica query optimizer automatically picks the best projections to use for queries, but you can help improve the performance of MERGE operations by ensuring projections are designed for optimal use. Good projection design lets Vertica choose the faster merge join between the target and source tables without having to perform additional sort and data transfer operations.

HPE recommends that you first use Database Designer to generate a comprehensive design and then customize projections as needed. Be sure to first review the topics in Planning Your Design. Failure to follow those considerations could result in non-functioning projections.

In the following MERGE statement, Vertica inserts and/or updates records from the source table's column b into the target table's column a:

=> MERGE INTO target t USING source s ON t.a = s.b WHEN ....

Vertica can use a local merge join if tables target and source use one of the following projection designs, where their inputs are pre-sorted through the CREATE PROJECTION ORDER BY clause:
- Replicated projections that are sorted on:
  - Column a for target
  - Column b for source

- Segmented projections that are identically segmented on:
  - Column a for target
  - Column b for source
  - Corresponding segmented columns

Tip: For best merge performance, the source table should be smaller than the target table.

See Also

- Optimized Versus Non-Optimized MERGE

Prioritizing Column Access Speed

If you measure and set the performance of storage locations within your cluster, Vertica uses this information to determine where to store columns based on their rank. For more information, see Setting Storage Performance.

How Columns Are Ranked

Vertica stores columns included in the projection sort order on the fastest available storage locations. Columns not included in the projection sort order are stored on slower disks. Columns for each projection are ranked as follows:

- Columns in the sort order are given the highest priority (numbers greater than 1000).
- The last column in the sort order is given the rank number 1001.
- The next-to-last column in the sort order is given the rank number 1002, and so on, until the first column in the sort order is given 1000 plus the number of sort columns.
- The remaining columns are given numbers from 1000 down to 1, starting with 1000 and decrementing by one per column.
Prioritizing Column Access Speed

If you measure and set the performance of storage locations within your cluster, Vertica uses this information to determine where to store columns based on their rank. For more information, see Setting Storage Performance.

How Columns are Ranked

Vertica stores columns included in the projection sort order on the fastest available storage locations. Columns not included in the projection sort order are stored on slower disks. Columns for each projection are ranked as follows:

l Columns in the sort order are given the highest priority (numbers > 1000).
l The last column in the sort order is given the rank number 1001.
l The next-to-last column in the sort order is given the rank number 1002, and so on, until the first column in the sort order is given 1000 + the number of sort columns.
l The remaining columns are given numbers from 1000 down to 1, starting with 1000 and decrementing by one per column.

Vertica then stores columns on disk from the highest ranking to the lowest ranking. It places highest-ranking columns on the fastest disks and the lowest-ranking columns on the slowest disks.

Overriding Default Column Ranking

You can modify which columns are stored on fast disks by manually overriding the default ranks for these columns. To accomplish this, set the ACCESSRANK keyword in the column list. Make sure to use an integer that is not already being used for another column. For example, if you want to give a column the fastest access rank, use a number that is significantly higher than 1000 + the number of sort columns. This allows you to add more columns over time without bumping into the access rank you set.

The following example sets the access rank to 1500 for the column C1_retail_sales_fact_store_key:

CREATE PROJECTION retail_sales_fact_P1 (
   C1_retail_sales_fact_store_key ENCODING RLE ACCESSRANK 1500,
   C2_retail_sales_fact_pos_transaction_number,
   C3_retail_sales_fact_sales_dollar_amount,
   C4_retail_sales_fact_cost_dollar_amount
)
Managing Users and Privileges

Database users should have access to only the database resources they need to perform their tasks. For example, most users should be able to read data but not modify or insert new data. Other users might need more permissive access, such as the right to create and modify schemas, tables, and views, or to rebalance nodes on a cluster and start or stop a database. It is also possible to allow certain users to grant other users access to the appropriate database resources.

Client authentication controls what database objects users can access and change in the database. To prevent unauthorized access, a superuser limits access to what is needed, granting privileges directly to users or to roles through a series of GRANT statements. Roles can then be granted to users, as well as to other roles.

This section introduces the privilege and role model in Vertica and describes how to create and manage users.

See Also
l About Database Privileges
l About Database Roles
l GRANT Statements
l REVOKE Statements
About Database Users

Every Vertica database has one or more users. When users connect to a database, they must log on with valid credentials (username and password) that a superuser defined in the database.

Database users own the objects they create in a database, such as tables, procedures, and storage locations.

Note: By default, users have the right to create temporary tables in a database.

See Also
l Creating a Database User
l CREATE USER
l About MC Users
Types of Database Users

In a Vertica database, there are three types of users:

l Database administrator (DBADMIN)
l Object owner
l Everyone else (PUBLIC)

Note: External to a Vertica database, an MC administrator can create users through the Management Console and grant them database access. See About MC Users for details.

Database Administration User

When you install a new Vertica Analytics Platform, a database administration user with access to the following roles is created:

l DBADMIN Role
l DBDUSER Role
l PSEUDOSUPERUSER Role

Access to these roles allows this user to perform all database operations. Assign a name to this user during installation using the --dba-user option (use -u for upgrades). For example:

--dba-user mydba

This example creates a database administration user called mydba. The username you use here must already exist on your operating system. See Installing Vertica with the install_vertica Script.

If you do not use --dba-user during installation, the database administration user is named DBADMIN by default.

Note: Do not confuse the DBADMIN user with the DBADMIN role. The DBADMIN role is a set of privileges you assign to a specific user based on the user's position in your organization.
The Vertica Analytics Platform database administration user is also called a superuser throughout the Vertica Analytics Platform documentation. Do not confuse this superuser with the Linux superuser that manages the Linux operating system.

Create a Database Administration User in the Vertica Analytics Platform

As the database administration user, you can create other users with the same privileges:

1. Create a user:

=> CREATE USER DataBaseAdmin2;
CREATE USER

2. Grant the appropriate roles to the new user DataBaseAdmin2:

=> GRANT dbduser, dbadmin, pseudosuperuser to DataBaseAdmin2;
GRANT ROLE

The user DataBaseAdmin2 now has the same privileges granted to the original database administration user.

3. As the DataBaseAdmin2 user, enable the roles using SET ROLE:

=> \c VMart DataBaseAdmin2
You are now connected to database "VMart" as user "DataBaseAdmin2".
=> SET ROLE dbadmin, dbduser, pseudosuperuser;
SET ROLE

4. Confirm the roles are enabled:

=> SHOW ENABLED ROLES;
     name      |              setting
---------------+-----------------------------------
 enabled roles | dbduser, dbadmin, pseudosuperuser

See Also
l DBADMIN Role
l PSEUDOSUPERUSER Role
l PUBLIC Role
Object Owner

An object owner is the user who creates a particular database object and can perform any operation on that object. By default, only an owner (or a superuser) can act on a database object. In order to allow other users to use an object, the owner or superuser must grant privileges to those users using one of the GRANT Statements.

Note: Object owners are PUBLIC users for objects that other users own.

See About Database Privileges for more information.

PUBLIC User

All users who are not superusers or object owners are PUBLIC users.

Note: Object owners are PUBLIC users for objects that other users own.

Newly-created users do not have access to schema PUBLIC by default. Make sure to GRANT USAGE ON SCHEMA PUBLIC to all users you create.

See Also
l PUBLIC Role

Creating a Database User

This procedure describes how to create a new user on the database.

1. From vsql, connect to the database as a superuser.
2. Issue the CREATE USER statement with optional parameters.
3. Run a series of GRANT Statements to grant the new user privileges.

Notes

l Newly-created users do not have access to schema PUBLIC by default. Make sure to GRANT USAGE ON SCHEMA PUBLIC to all users you create.
l By default, database users have the right to create temporary tables in the database.
l If you plan to create users on Management Console, the database user account needs to exist before you can associate an MC user with the database.
l You can change information about a user, such as his or her password, by using the ALTER USER statement. If you want to configure a user to not have any password authentication, you can set the empty password '' in CREATE USER or ALTER USER statements, or omit the IDENTIFIED BY parameter in CREATE USER.
Example

The following series of commands adds user Fred to a database with the password 'password'. The second command grants USAGE privileges to Fred on the public schema:

=> CREATE USER Fred IDENTIFIED BY 'password';
=> GRANT USAGE ON SCHEMA PUBLIC to Fred;

User names created with double-quotes are case sensitive. For example:

=> CREATE USER "FrEd1";

In the above example, the logon name must be an exact match. If the user name was created without double-quotes (for example, FRED1), then the user can log on as FRED1, FrEd1, fred1, and so on.

ALTER USER and DROP USER syntax is not case sensitive.

See Also
l Granting and Revoking Privileges
l Granting Access to Database Roles
l Creating an MC User

Locking/unlocking a user's Database Access

A superuser can manually lock an existing database user's account with the ALTER USER statement. For example, the following command prevents user Fred from logging in to the database:

=> ALTER USER Fred ACCOUNT LOCK;
=> \c - Fred
FATAL 4974: The user account "Fred" is locked
HINT: Please contact the database administrator

To restore Fred's database access, use the UNLOCK syntax with the ALTER USER command:

=> ALTER USER Fred ACCOUNT UNLOCK;
=> \c - Fred
You are now connected as user "Fred".
Using CREATE USER to lock an account

Although not as common, you can create a new user with a locked account; for example, you might want to set up an account for a user who doesn't need immediate database access, as in the case of an employee who will join the company at a future date.

=> CREATE USER Bob ACCOUNT LOCK;
CREATE USER

CREATE USER also supports UNLOCK syntax; however, UNLOCK is the default, so you don't need to specify the keyword when you create a new user to whom you want to grant immediate database access.

Locking an account automatically

Instead of manually locking an account, a superuser can automate account locking by setting a maximum number of failed login attempts through the CREATE PROFILE statement. See Profiles.
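For illustration, a minimal sketch of automated locking; the profile name is hypothetical:

-- Lock any account assigned this profile after three failed logins.
=> CREATE PROFILE secure_profile LIMIT FAILED_LOGIN_ATTEMPTS 3;
=> ALTER USER Fred PROFILE secure_profile;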
Changing a User's Password

A superuser can change another user's database account, including resetting its password, with the ALTER USER statement. Making changes to a database user account with ALTER USER does not affect current sessions.

=> ALTER USER Fred IDENTIFIED BY 'newpassword';

In the above command, Fred's password is now newpassword.

Note: Non-DBA users can change their own passwords using the IDENTIFIED BY 'new-password' option along with the REPLACE 'old-password' clause. See ALTER USER for details.

Changing a User's MC Password

On MC, users with ADMIN or IT privileges can reset a user's non-LDAP password from the MC interface. Non-LDAP passwords on MC are for MC access only and are not related to a user's logon credentials on the Vertica database.

1. Sign in to Management Console and navigate to MC Settings > User management.
2. Click to select the user to modify and click Edit.
3. Click Edit password and enter the new password twice.
4. Click OK and then click Save.
About Database Privileges

When a database object is created, such as a schema, table, or view, that object is assigned an owner: the person who executed the CREATE statement. By default, database administrators (superusers) or object owners are the only users who can do anything with the object.

In order to allow other users to use an object, or to remove a user's right to use an object, the authorized user must grant privileges on the object. Privileges are granted (or revoked) through a collection of GRANT/REVOKE statements that assign the privilege, a type of permission that lets users perform an action on a database object, such as:

l Create a schema
l Create a table (in a schema)
l Create a view
l View (select) data
l Insert, update, or delete table data
l Drop tables or schemas
l Run procedures

Before Vertica executes a statement, it determines whether the requesting user has the necessary privileges to perform the operation. For more information about the privileges associated with these resources, see Privileges That Can Be Granted on Objects.

Note: Vertica logs information about each grant (grantor, grantee, privilege, and so on) in the V_CATALOG.GRANTS system table.

See Also
l GRANT Statements
l REVOKE Statements
Inherited Privileges Overview

Inherited privileges allow you to grant privileges at the schema level. Privileges granted to a schema are then granted automatically to new tables and views in the schema. Existing tables and views are unchanged when you alter the schema to include or exclude inherited privileges. Using inherited privileges eliminates the need to apply the same privileges to each individual table or view in the schema.

To assign inherited privileges, you must be an owner of the schema or a superuser. Assign inherited privileges using the following SQL statements:

l GRANT Statements
l CREATE SCHEMA
l ALTER SCHEMA
l CREATE TABLE
l ALTER TABLE
l CREATE VIEW
l ALTER VIEW

Granting Inherited Privileges from One User to Another

The following steps describe a process by which user1 grants inherited privileges to user2.

1. The database user, user1, creates a schema (schema1), and a table (table1) in schema1:

user1=> CREATE SCHEMA schema1;
user1=> CREATE TABLE schema1.table1 (id int);

2. User user1 grants USAGE and CREATE privileges on schema1 to user2:

user1=> GRANT USAGE ON SCHEMA schema1 to user2;
user1=> GRANT CREATE ON SCHEMA schema1 to user2;

3. The user2 user queries schema1.table1, but the query fails:

user2=> SELECT * FROM schema1.table1;
ERROR 4367: Permission denied for relation table1

4. The user user1 grants SELECT ON SCHEMA privilege on schema1 to user2:

user1=> GRANT SELECT ON SCHEMA schema1 to user2;

5. Next, user1 uses ALTER TABLE to include SCHEMA privileges for table1:

user1=> ALTER TABLE schema1.table1 INCLUDE SCHEMA PRIVILEGES;

6. The user2 query now succeeds:

user2=> SELECT * FROM schema1.table1;
 id
----
(0 rows)

7. User user1 now uses ALTER SCHEMA to include privileges so that all tables created in schema1 inherit schema privileges:

user1=> ALTER SCHEMA schema1 DEFAULT INCLUDE PRIVILEGES;
user1=> CREATE TABLE schema1.table2 (id int);

8. With inherited privileges enabled, user2 can query table2 without user1 having to specifically grant privileges on table2:

user2=> SELECT * FROM schema1.table2;
 id
----
(0 rows)

Enable or Disable Inherited Privileges at the Database Level

Use the disableinheritedprivileges configuration parameter to enable (0) inherited privileges:

=> ALTER DATABASE [database name] SET disableinheritedprivileges = 0;

Use the same configuration parameter to disable (1) inherited privileges:

=> ALTER DATABASE [database name] SET disableinheritedprivileges = 1;
Grant Inherited Privileges

Grant inherited privileges at the schema level. When inherited privileges are enabled, all privileges granted to the schema are automatically granted to all newly created tables and views in the schema. Existing tables and views remain unchanged when you alter the schema to include or exclude inherited privileges.

By default, inherited privileges are enabled at the database level and disabled at the schema level (unless you indicate otherwise while running CREATE SCHEMA). See Enable or Disable Inherited Privileges at the Database Level for more information.

To apply inherited privileges, you must meet one of the following conditions:

l Be the owner of the object
l Be a superuser

Inherit Privileges on a Schema

Use the CREATE SCHEMA or ALTER SCHEMA SQL statements to apply inherited privileges to a schema. The tables and views in that schema then inherit any privileges granted to the schema by default.

This example shows how to create a new schema with inherited privileges. The DEFAULT parameter sets the default behavior so that all new tables and views created in this schema automatically inherit the schema's privileges:

=> CREATE SCHEMA s1 DEFAULT INCLUDE PRIVILEGES;

This example shows how to modify an existing schema to enable inherited privileges:

=> ALTER SCHEMA s1 DEFAULT INCLUDE PRIVILEGES;

The following message appears when you specify INCLUDE PRIVILEGES while inherited privileges are disabled at the database level:

Inherited privileges are globally disabled; schema parameter is set but has no effect.

See Enable or Disable Inherited Privileges at the Database Level to enable inherited privileges at the database level.

Inherit Privileges on a Table or Flex Table

You can specify an individual table or flex table to inherit privileges from the schema. Use the CREATE TABLE or ALTER TABLE SQL statements to enable inherited privileges for a table.
The table-level flag takes priority over the schema flag, while the database-level setting takes priority over both.

This example shows creating a new table with inherited privileges:

=> CREATE TABLE s1.t1 ( x int) INCLUDE SCHEMA PRIVILEGES;

This example shows how to modify an existing table to enable inherited privileges:

=> ALTER TABLE s1.t1 INCLUDE SCHEMA PRIVILEGES;

If you run CREATE TABLE, CREATE TABLE LIKE, or CREATE TABLE AS SELECT in a schema with inherited privileges set, the following informational warning appears:

=> CREATE TABLE s1.t1 ( x int);
WARNING: Table <table_name> will include privileges from schema <schema_name>

Note that this message does not appear when you add the INCLUDE SCHEMA PRIVILEGES clause.

Exclude Privileges on a Table

You can exclude a table in a schema with inherited privileges so that the table does not inherit the schema's privileges. Use the CREATE TABLE or ALTER TABLE SQL statements to exclude inherited privileges for a table.

This example shows creating a new table and excluding schema privileges:

=> CREATE TABLE s1.t1 ( x int) EXCLUDE SCHEMA PRIVILEGES;

This example shows how to modify an existing table to exclude inherited privileges:

=> ALTER TABLE s1.t1 EXCLUDE SCHEMA PRIVILEGES;

Include Privileges on a View

You can specify a view to inherit privileges from the schema. Use the CREATE VIEW or ALTER VIEW SQL statements to enable inherited privileges for a view.

This example shows creating a view with inherited privileges enabled:

=> CREATE VIEW view1 INCLUDE SCHEMA PRIVILEGES;

This example shows how to modify an existing view to enable inherited privileges:

=> ALTER VIEW view1 INCLUDE SCHEMA PRIVILEGES;
Exclude Privileges on a View

You can exclude a view in a schema with inherited privileges so that the view does not inherit the schema's privileges. Use the CREATE VIEW or ALTER VIEW SQL statements to exclude inherited privileges for a view.

This example shows creating a new view and excluding schema privileges:

=> CREATE VIEW view1 EXCLUDE SCHEMA PRIVILEGES;

This example shows how to modify an existing view to exclude inherited privileges:

=> ALTER VIEW view1 EXCLUDE SCHEMA PRIVILEGES;

Default Privileges for All Users

To set the minimum level of privilege for all users, Vertica has the special PUBLIC Role, which it grants to each user automatically. This role is automatically enabled, but the database administrator or a superuser can also grant higher privileges to users separately using GRANT statements. The following topics discuss those higher privileges.

Default Privileges for MC Users

Privileges on Management Console (MC) are managed through roles, which determine a user's access to MC and to MC-managed Vertica databases through the MC interface. MC privileges do not alter or override Vertica privileges or roles. See About MC Privileges and Roles for details.

Privileges Required for Common Database Operations

This topic lists the required privileges for database objects in Vertica. Unless otherwise noted, superusers can perform all of the operations shown in the following tables without any additional privilege requirements. Object owners have the necessary rights to perform operations on their own objects by default.

Schemas

The PUBLIC schema is present in any newly-created Vertica database, and newly-created users have only USAGE privilege on PUBLIC. A database superuser must explicitly grant new users CREATE privileges, as well as grant them individual object privileges, so the new users can create or look up objects in the PUBLIC schema.
CREATE SCHEMA: CREATE privilege on database
DROP SCHEMA: Schema owner
ALTER SCHEMA RENAME: CREATE privilege on database

Tables

CREATE TABLE: CREATE privilege on schema
Note: Referencing sequences in the CREATE TABLE statement requires the following privileges:
l SELECT privilege on sequence object
l USAGE privilege on sequence schema
DROP TABLE: USAGE privilege on the schema that contains the table, or schema owner
TRUNCATE TABLE: USAGE privilege on the schema that contains the table, or schema owner
ALTER TABLE ADD/DROP/RENAME/ALTER-TYPE COLUMN: USAGE privilege on the schema that contains the table
ALTER TABLE ADD/DROP CONSTRAINT: USAGE privilege on the schema that contains the table
ALTER TABLE PARTITION (REORGANIZE): USAGE privilege on the schema that contains the table
ALTER TABLE RENAME: USAGE and CREATE privilege on the schema that contains the table
ALTER TABLE SET SCHEMA:
l CREATE privilege on new schema
l USAGE privilege on the old schema
SELECT:
l SELECT privilege on table
l USAGE privilege on schema that contains the table
INSERT:
l INSERT privilege on table
l USAGE privilege on schema that contains the table
DELETE:
l DELETE privilege on table
l USAGE privilege on schema that contains the table
l SELECT privilege on the referenced table when executing a DELETE statement that references table column values in a WHERE or SET clause
UPDATE:
l UPDATE privilege on table
l USAGE privilege on schema that contains the table
l SELECT privilege on the table when executing an UPDATE statement that references table column values in a WHERE or SET clause
REFERENCES:
l REFERENCES privilege on table to create foreign key constraints that reference this table
l USAGE privileges on schema that contains the constrained table and the source of the foreign key
ANALYZE_STATISTICS():
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
ANALYZE_HISTOGRAM():
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
DROP_STATISTICS():
l INSERT/UPDATE/DELETE privilege on table
l USAGE privilege on schema that contains the table
DROP_PARTITION(): USAGE privilege on schema that contains the table
Views

CREATE VIEW:
l CREATE privilege on the schema to contain a view
l SELECT privileges on base objects (tables/views)
l USAGE privileges on schema that contains the base objects
DROP VIEW: USAGE privilege on schema that contains the view, or schema owner
SELECT ... FROM VIEW:
l SELECT privilege on view
l USAGE privilege on the schema that contains the view

Note: Privileges required on base objects for the view owner must be directly granted, not through roles:
l The view owner must have SELECT ... WITH GRANT OPTION privileges on the view's anchor tables or views if a non-owner runs a SELECT query on the view. This privilege must be directly granted to the owner, not through a role.
l The view owner must have SELECT privilege directly granted (not through a role) on a view's base objects (table or view) if the owner runs a SELECT query on the view.

Projections

CREATE PROJECTION:
l SELECT privilege on anchor tables
l USAGE privilege on schema that contains anchor tables, or schema owner
l CREATE privilege on schema to contain the projection

Note: If a projection is implicitly created with the table, no additional privilege is needed other than privileges for table creation.
AUTO/DELAYED PROJECTION: On projections created during INSERT..SELECT or COPY operations:
l SELECT privilege on anchor tables
l USAGE privilege on schema that contains anchor tables
ALTER PROJECTION RENAME: USAGE and CREATE privilege on schema that contains the projection
DROP PROJECTION: USAGE privilege on schema that contains the projection, or schema owner

External Procedures

CREATE PROCEDURE: Superuser
DROP PROCEDURE: Superuser
EXECUTE:
l EXECUTE privilege on procedure
l USAGE privilege on schema that contains the procedure

Libraries

CREATE LIBRARY: Superuser
DROP LIBRARY: Superuser

User-Defined Functions

The following abbreviations are used in the UDF table:

l UDF = Scalar
l UDT = Transform
l UDAnF = Analytic
l UDAF = Aggregate

CREATE FUNCTION (SQL)
CREATE FUNCTION (UDF)
CREATE TRANSFORM FUNCTION (UDT)
CREATE ANALYTIC FUNCTION (UDAnF)
CREATE AGGREGATE FUNCTION (UDAF):
l CREATE privilege on schema to contain the function
l USAGE privilege on base library (if applicable)
DROP FUNCTION
DROP TRANSFORM FUNCTION
DROP ANALYTIC FUNCTION
DROP AGGREGATE FUNCTION:
l Superuser or function owner
l USAGE privilege on schema that contains the function
ALTER FUNCTION RENAME TO: USAGE and CREATE privilege on schema that contains the function
ALTER FUNCTION SET SCHEMA:
l USAGE privilege on schema that currently contains the function (old schema)
l CREATE privilege on the schema to which the function will be moved (new schema)
EXECUTE (SQL/UDF/UDT/UDAF/UDAnF) function:
l EXECUTE privilege on function
l USAGE privilege on schema that contains the function

Sequences

CREATE SEQUENCE: CREATE privilege on schema to contain the sequence
Note: Referencing a sequence in the CREATE TABLE statement requires SELECT privilege on the sequence object and USAGE privilege on the sequence schema.
CREATE TABLE with SEQUENCE:
l SELECT privilege on sequence
l USAGE privilege on sequence schema
DROP SEQUENCE: USAGE privilege on schema containing the sequence, or schema owner
ALTER SEQUENCE RENAME TO: USAGE and CREATE privileges on schema
ALTER SEQUENCE SET SCHEMA:
l USAGE privilege on the schema that currently contains the sequence (old schema)
l CREATE privilege on new schema to contain the sequence
CURRVAL() / NEXTVAL():
l SELECT privilege on sequence
l USAGE privilege on sequence schema

Resource Pools

CREATE RESOURCE POOL: Superuser
ALTER RESOURCE POOL:
Superuser, to alter the following parameters of the resource pool:
l MAXMEMORYSIZE
l PRIORITY
l QUEUETIMEOUT
UPDATE privilege on the resource pool, to alter:
l PLANNEDCONCURRENCY
l SINGLEINITIATOR
l MAXCONCURRENCY
SET SESSION RESOURCE_POOL:
l USAGE privilege on the resource pool
l Users can only change their own resource pool setting
using ALTER USER syntax
DROP RESOURCE POOL: Superuser

Users/Profiles/Roles

CREATE USER, CREATE PROFILE, CREATE ROLE: Superuser
ALTER USER, ALTER PROFILE, ALTER ROLE RENAME: Superuser
DROP USER, DROP PROFILE, DROP ROLE: Superuser

Object Visibility

You can use one or a combination of vsql \d [pattern] meta-commands and SQL system tables to view objects on which you have privileges to view:

l Use \dn [pattern] to view schema names and owners
l Use \dt [pattern] to view all tables in the database, as well as the system table V_CATALOG.TABLES
l Use \dj [pattern] to view projections showing the schema, projection name, owner, and node, as well as the system table V_CATALOG.PROJECTIONS

Look up schema: At least one privilege on the schema that contains the object
Look up object in schema or in system tables:
l USAGE privilege on schema
l At least one privilege on any of the following objects: TABLE, VIEW, FUNCTION, PROCEDURE, SEQUENCE
Look up projection:
l At least one privilege on all anchor tables
l USAGE privilege on the schemas of all anchor tables
Look up resource pool: SELECT privilege on the resource pool
Existence of object: USAGE privilege on the schema that contains the object
I/O Operations

CONNECT / DISCONNECT: None
EXPORT TO Vertica:
l SELECT privileges on the source table
l USAGE privilege on source table schema
l INSERT privileges for the destination table in target database
l USAGE privilege on destination table schema
COPY FROM Vertica:
l SELECT privileges on the source table
l USAGE privilege on source table schema
l INSERT privileges for the destination table in target database
l USAGE privilege on destination table schema
COPY FROM file: Superuser
COPY FROM STDIN:
l INSERT privilege on table
l USAGE privilege on schema
COPY LOCAL:
l INSERT privilege on table
l USAGE privilege on schema
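For illustration, a minimal sketch of a client-side load with COPY LOCAL; the table name and file path are hypothetical, and the statement assumes the user holds INSERT on the table and USAGE on its schema:

=> COPY store.sales FROM LOCAL '/home/user/sales.csv' DELIMITER ',';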
Comments

COMMENT ON any of the following objects: Object owner or superuser
l AGGREGATE FUNCTION
l ANALYTIC FUNCTION
l COLUMN
l CONSTRAINT
l FUNCTION
l LIBRARY
l NODE
l PROJECTION
l SCHEMA
l SEQUENCE
l TABLE
l TRANSFORM FUNCTION
l VIEW

Transactions

COMMIT: None
ROLLBACK: None
RELEASE SAVEPOINT: None
SAVEPOINT: None

Sessions

SET any of the following parameters: None
l DATESTYLE
l ESCAPE_STRING_WARNING
l INTERVALSTYLE
l LOCALE
l ROLE
l SEARCH_PATH
l SESSION AUTOCOMMIT
l SESSION CHARACTERISTICS
l SESSION MEMORYCAP
l SESSION RESOURCE POOL
l SESSION RUNTIMECAP
l SESSION TEMPSPACE
l STANDARD_CONFORMING_STRINGS
l TIMEZONE

SHOW { name | ALL }: None

Tuning Operations

PROFILE: Same privileges required to run the query being profiled
EXPLAIN: Same privileges required to run the query for which you use the EXPLAIN keyword
Privileges That Can Be Granted on Objects

The topics that follow describe the privileges that can be granted on (or revoked from) each type of database object in Vertica.

See Also
l GRANT Statements
l REVOKE Statements

Database Privileges

Only a database superuser can create a database. In a new database, the PUBLIC Role is granted USAGE on the automatically-created PUBLIC schema. It is up to the superuser to grant further privileges to users and roles.

The only privilege a superuser can grant on the database itself is CREATE, which allows the user to create a new schema in the database. For details on granting and revoking privileges on a database, see the GRANT (Database) and REVOKE (Database) topics in the SQL Reference Manual.

CREATE (granted by a superuser): Allows a user to create a schema.
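For illustration, a minimal sketch of granting that privilege; the database and user names are hypothetical:

=> GRANT CREATE ON DATABASE exampledb TO Fred;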
Schema Privileges

By default, only a superuser and the schema owner have privileges to create objects within a schema. Additionally, only the schema owner or a superuser can drop or alter a schema. See DROP SCHEMA and ALTER SCHEMA.

You must grant all new users access to the PUBLIC schema by running GRANT USAGE ON SCHEMA PUBLIC. Then grant new users CREATE privileges and privileges to individual objects in the schema. This enables new users to create or locate objects in the PUBLIC schema. Without USAGE privilege, objects in the schema cannot be used or altered, even by the object owner.

CREATE gives the schema owner, or a user granted it WITH GRANT OPTION, permission to create new objects in the schema, including renaming an object in the schema or moving an object into this schema.

Note: The schema owner is typically the user who creates the schema. However, a superuser can create a schema and assign ownership of the schema to a different user at creation.

All other access to the schema and its objects must be explicitly granted to users or roles by the superuser or schema owner. This prevents unauthorized users from accessing the schema and its objects. A user can be granted one of the following privileges through the GRANT statement:

CREATE: Allows the user to create new objects within the schema. This includes the ability to create a new object, rename existing objects, and move objects into the schema from other schemas.
USAGE: Permission to select, access, alter, and drop objects in the schema. The user must also be granted access to the individual objects in order to alter them. For example, a user would need to be granted USAGE on the schema and SELECT on a table to be able to select data from the table. You receive an error message if you attempt to query a table that you have SELECT privileges on but do not have USAGE privileges for the schema that contains the table.
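For illustration, a minimal sketch granting both schema privileges; the schema and user names are hypothetical:

=> GRANT USAGE ON SCHEMA online_sales TO Fred;
=> GRANT CREATE ON SCHEMA online_sales TO Fred;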
Note the following error messages related to granting privileges on a schema or an object:

l You attempt to grant a privilege on a schema, but you do not have USAGE privilege for the schema. In this case, you receive an error message that the schema does not exist.
l You attempt to grant a privilege on an object within a schema, and you have USAGE privilege on the schema but no privilege on the individual object. In this case, you receive an error denying permission for that object.

Schema Privileges and the Search Path

The search path determines the schema to which unqualified objects in SQL statements belong. When a user specifies an object name in a statement without supplying the schema in which the object exists (called an unqualified object name), Vertica behaves in one of two ways, depending on whether the object is being accessed or created.

Creating an object

When a user creates an object (such as a table, view, sequence, procedure, or function) with an unqualified name, Vertica tries to create the object in the current schema (the first schema in the schema search path), returning an error if the schema does not exist or if the user does not have CREATE privileges in that schema.

Use the SHOW search_path command to view the current search path:

=> SHOW search_path;
    name     |                      setting
-------------+---------------------------------------------------
 search_path | "$user", public, v_catalog, v_monitor, v_internal
(1 row)

Note: The first schema in the search path is the current schema, and the $user setting is a placeholder that resolves to the current user's name.

Accessing or altering an object

When a user accesses or alters an object with an unqualified name, Vertica searches through all schemas for a matching object, starting with the current schema, where:

l The object name in the schema matches the object name in the statement.
l The user has USAGE privileges on the schema in order to access objects in it.
l The user has at least one privilege on the object.
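For illustration, a minimal sketch of changing the search path so that unqualified names resolve to a hypothetical schema first:

=> SET SEARCH_PATH TO online_sales, public;
=> SHOW search_path;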
See Also
l Setting Search Paths
l GRANT (Schema)
l REVOKE (Schema)

Table Privileges

By default, only a superuser and the table owner (typically the person who creates a table) have access to a table. The ability to drop or alter a table is also reserved for a superuser or table owner; this privilege cannot be granted to other users. All other users or roles (including the user who owns the schema, if he or she does not also own the table) must be explicitly granted access to the table, optionally with the WITH GRANT OPTION syntax.

These are the table privileges a superuser or table owner can grant:

SELECT: Permission to run SELECT queries on the table.
INSERT: Permission to INSERT data into the table.
DELETE: Permission to DELETE data from the table, as well as SELECT privilege on the table when executing a DELETE statement that references table column values in a WHERE or SET clause.
UPDATE: Permission to UPDATE and change data in the table, as well as SELECT privilege on the table when executing an UPDATE statement that references table column values in a WHERE or SET clause.
REFERENCES: Permission to CREATE foreign key constraints that reference this table.

To use any of the above privileges, the user must also have USAGE privileges on the schema that contains the table. See Schema Privileges for details.
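For illustration, a minimal sketch granting a user query and load access to a table; the schema, table, and user names are hypothetical:

=> GRANT USAGE ON SCHEMA store TO Fred;
=> GRANT SELECT, INSERT ON store.sales TO Fred;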
Referencing a sequence in the CREATE TABLE statement requires the following privileges:

l SELECT privilege on the sequence object
l USAGE privilege on the sequence schema

For details on granting and revoking table privileges, see GRANT (Table) and REVOKE (Table) in the SQL Reference Manual.

Projection Privileges

Because projections are the underlying storage construct for tables, they are atypical in that they do not have an owner or privileges associated with them directly. Instead, the privileges to create, access, or alter a projection are based on the anchor tables that the projection references, as well as the schemas that contain them.

All queries in Vertica obtain data from projections, directly or indirectly. In both cases, to run a query, you must have SELECT privileges on the table or tables that the projections reference, and USAGE privileges on all schemas that contain those tables.

You can create projections in two ways: explicitly and implicitly.

Explicit Projection Creation and Privileges

To explicitly create a projection using the CREATE PROJECTION statement, you must be a superuser or the owner of the anchor table, or have the following privileges:

l CREATE privilege on the schema in which the projection is created
l SELECT privilege on all the anchor tables referenced by the projection
l USAGE privilege on all the schemas that contain the anchor tables referenced by the projection

Only the anchor table owner can drop explicitly created projections or pre-join projections. Explicitly created projections can be live aggregate projections, including Top-K projections and projections with expressions.

Implicit Projection Creation and Privileges

When you insert data into a table, Vertica automatically creates a superprojection for the table. Superprojections do not require any additional privileges to create or drop, other than privileges for table creation. Users who can create or drop a table can also create and drop the associated superprojection.
Selecting From Projections

Vertica does not associate privileges directly with projections. Privileges can only be granted on logical storage containers: tables and views. The following privileges are required to select from a projection:

l SELECT privilege on each of the anchor tables referenced by the projection
l USAGE privilege on the corresponding containing schemas

View Privileges

A view is a stored query that dynamically accesses and computes data from the database at execution time. Use \dv in vsql to display available views.

By default, only the following users have privileges to access a view's base object:

l Superuser
l View owner (typically, the view creator)

To execute a query that contains a view, you must have:

l SELECT privileges on the view, assigned with GRANT (View)
l USAGE privileges on the view's schema, assigned with GRANT (Schema)

You can assign view privileges to other users and roles using GRANT (View). For example:

l Assign GRANT ALL privileges to a user or role:

=> GRANT all privileges on view1 to role1 with grant option;

l Assign GRANT ROLE privileges to a specific role to provide view privileges. In the following example, privileges that are assigned to role1 are assigned to role2:

=> CREATE ROLE role1;
=> CREATE ROLE role2;
=> GRANT role1 to role2;

See Also
l GRANT (View)
l REVOKE (View)
Sequence Privileges

To create a sequence, a user must have CREATE privileges on the schema that contains the sequence. Only the owner and superusers can initially access the sequence. All other users must be granted access to the sequence by a superuser or the owner.

Only the sequence owner (typically the person who creates the sequence) or a superuser can drop or rename a sequence, or change the schema in which the sequence resides:

l DROP SEQUENCE: Only a sequence owner or schema owner can drop a sequence.
l ALTER SEQUENCE RENAME TO: A sequence owner must have USAGE and CREATE privileges on the schema that contains the sequence to be renamed.
l ALTER SEQUENCE SET SCHEMA: A sequence owner must have USAGE privilege on the schema that currently contains the sequence (old schema), as well as CREATE privilege on the schema where the sequence will be moved (new schema).

The following privileges apply to sequences. The only privilege that can be granted to a user or role is SELECT, which allows the user to use CURRVAL() and NEXTVAL() on the sequence and to reference it in a table. The user or role also needs USAGE privilege on the schema containing the sequence.

SELECT: Permission to use CURRVAL() and NEXTVAL() on the sequence and to reference it in a table.
USAGE: Permission on the schema that contains the sequence.

Note: Referencing a sequence in the CREATE TABLE statement requires SELECT privilege on the sequence object and USAGE privilege on the sequence schema.
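For illustration, a minimal sketch of granting sequence access; the schema, sequence, and user names are hypothetical:

=> GRANT USAGE ON SCHEMA store TO Fred;
=> GRANT SELECT ON SEQUENCE store.order_seq TO Fred;
-- Fred can now call NEXTVAL('store.order_seq') and reference
-- the sequence in CREATE TABLE statements.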
For details on granting and revoking sequence privileges, see GRANT (Sequence) and REVOKE (Sequence) in the SQL Reference Manual.

See Also
l Using Named Sequences

External Procedure Privileges

Only a superuser is allowed to create or drop an external procedure. By default, users cannot execute external procedures. A superuser must grant users and roles this right, using the GRANT (Procedure) EXECUTE statement. Additionally, users must have USAGE privileges on the schema that contains the procedure in order to call it.

EXECUTE: Permission to run an external procedure.
USAGE: Permission on the schema that contains the procedure.

For details on granting and revoking external procedure privileges, see GRANT (Procedure) and REVOKE (Procedure) in the SQL Reference Manual.
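For illustration, a minimal sketch of granting those rights; the procedure name, its argument list, and the user name are hypothetical:

=> GRANT EXECUTE ON PROCEDURE helloplanet(arg VARCHAR) TO Fred;
=> GRANT USAGE ON SCHEMA public TO Fred;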
User-Defined Function Privileges

User-defined functions (described in CREATE FUNCTION Statements) can be created by superusers or by users with CREATE privileges on the schema that will contain the function, as well as USAGE privileges on the base library (if applicable).

Users or roles other than the function owner can use a function only if they have been granted EXECUTE privileges on it. They must also have USAGE privileges on the schema that contains the function to be able to call it.

EXECUTE: Permission to call a user-defined function.
USAGE: Permission on the schema that contains the function.

l DROP FUNCTION: Only a superuser or the function owner can drop the function.
l ALTER FUNCTION RENAME TO: A superuser or function owner must have USAGE and CREATE privileges on the schema that contains the function to be renamed.
l ALTER FUNCTION SET SCHEMA: A superuser or function owner must have USAGE privilege on the schema that currently contains the function (old schema), as well as CREATE privilege on the schema where the function will be moved (new schema).

For details on granting and revoking user-defined function privileges, see the following topics in the SQL Reference Manual:

l GRANT (User Defined Extension)
l REVOKE (User Defined Extension)

Library Privileges

Only a superuser can load an external library using the CREATE LIBRARY statement. By default, only a superuser can create user-defined functions (UDFs) based on a loaded library. A superuser can use the GRANT USAGE ON LIBRARY statement to allow users to create UDFs based on classes in the library. The user must also have CREATE privileges on the schema that will contain the UDF.

USAGE: Permission to create UDFs based on classes in the library.

Once created, only a superuser or the user who created a UDF can use it by default. Either of them can grant other users or roles the ability to call the function using the GRANT EXECUTE ON FUNCTION statement. See the GRANT (User Defined Extension) and REVOKE (User Defined Extension) topics in the SQL Reference Manual for more information on granting and revoking privileges on functions. In addition to EXECUTE privilege, users and roles also require USAGE privilege on the schema in which the function resides in order to execute the function.

For more information about libraries and UDFs, see Developing User-Defined Extensions (UDxs) in Extending Vertica.
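For illustration, a minimal sketch in which a superuser lets one user build UDFs from a loaded library and a second user call one; the library, function, and user names are hypothetical:

=> GRANT USAGE ON LIBRARY MyFunctions TO Fred;
=> GRANT CREATE ON SCHEMA public TO Fred;
-- After Fred creates a function from the library, its use can be shared:
=> GRANT EXECUTE ON FUNCTION public.add2ints(int, int) TO Bob;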
Resource Pool Privileges

Only a superuser can create, alter, or drop a resource pool.

By default, users are granted USAGE rights to the GENERAL pool, from which their queries and other statements allocate memory and get their priorities. A superuser must grant users USAGE rights to any additional resource pools by using the GRANT USAGE ON RESOURCE POOL statement. Once granted access to the resource pool, users can use the SET SESSION RESOURCE_POOL statement and the RESOURCE POOL clause of the ALTER USER statement to have their queries draw their resources from the new pool.

USAGE: Permission to use a resource pool.
SELECT: Permission to look up resource pool information/status in system tables.
UPDATE: Permission to adjust the tuning parameters of the pool.

For details on granting and revoking resource pool privileges, see GRANT (Resource Pool) and REVOKE (Resource Pool) in the SQL Reference Manual.
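For illustration, a minimal sketch of granting pool access and switching to the pool; the pool and user names are hypothetical:

=> GRANT USAGE ON RESOURCE POOL batch_pool TO Fred;
=> ALTER USER Fred RESOURCE POOL batch_pool;
-- As Fred, within a session:
=> SET SESSION RESOURCE_POOL = batch_pool;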
Storage Location Privileges

Users and roles without superuser privileges can copy data to and from storage locations, as long as the following conditions are met, where a superuser:

1. Creates a special class of storage location (CREATE LOCATION), specifying the USAGE argument set to 'USER', which indicates that the specified area is accessible to non-superuser users.
2. Grants users or roles READ and/or WRITE access to the specified location using the GRANT (Storage Location) statement.

Note: GRANT/REVOKE (Storage Location) statements are applicable only to 'USER' storage locations.

Once such storage locations exist and the appropriate privileges are granted, users and roles granted READ privileges can copy data from files in the storage location into a table. Those granted WRITE privileges can export data from a table to the storage location on which they have been granted access. WRITE privileges also let users save COPY statement exceptions and rejected data files from Vertica to the specified storage location.

Only a superuser can add, alter, retire, drop, and restore a location, as well as set and measure location performance. All non-dbadmin users or roles require READ and/or WRITE permissions on the location.

READ: Allows the user to copy data from files in the storage location into a table.
WRITE: Allows the user to copy data to the specific storage location. Users with WRITE privileges can also save COPY statement exceptions and rejected data files to the specified storage location.

See Also
l GRANT (Storage Location)
l Storage Management Functions
l CREATE LOCATION
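For illustration, a minimal sketch of the two steps above; the path, node name, and user name are hypothetical:

=> CREATE LOCATION '/home/dbadmin/UserStorage' NODE 'v_vmart_node0001' USAGE 'USER';
=> GRANT READ ON LOCATION '/home/dbadmin/UserStorage' TO Fred;
=> GRANT WRITE ON LOCATION '/home/dbadmin/UserStorage' TO Fred;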
Role, Profile, and User Privileges

Only a superuser can create, alter, or drop a:

l role
l profile
l user

By default, only the superuser can grant or revoke a role to another user or role. A user or role can be given the privilege to grant and revoke a role by using the WITH ADMIN OPTION clause of the GRANT statement.

For details on granting and revoking role privileges, see GRANT (Role) and REVOKE (Role) in the SQL Reference Manual.

See Also
l CREATE USER
l ALTER USER
l DROP USER
l CREATE PROFILE
l ALTER PROFILE
l DROP PROFILE
l CREATE ROLE
l ALTER ROLE RENAME
l DROP ROLE

Metadata Privileges

A superuser has unrestricted access to all database metadata. Other users have significantly reduced access to metadata based on their privileges, as follows:

Catalog objects (tables, columns, constraints, sequences, external procedures, projections, ROS containers, and the WOS): Users must possess USAGE privilege on the schema and any type of access (SELECT) or modify privilege on the object to see catalog metadata about the object. See also Schema Privileges.

For internal objects like projections and WOS and ROS containers that don't have access privileges directly associated with them, the user must possess the requisite privileges on the associated schema and table objects instead. For example, to see whether a table has any data in the WOS, you need to have USAGE on the table schema and at least SELECT on the table itself. See also Table Privileges and Projection Privileges.

User sessions and functions, and system tables related to these sessions: Users can only access information about their own, current sessions.
The following functions provide restricted functionality to users:

l CURRENT_DATABASE
l CURRENT_SCHEMA
l CURRENT_USER
l HAS_TABLE_PRIVILEGE
l SESSION_USER (same as CURRENT_USER)

The SESSIONS system table also provides restricted functionality to users.

Storage locations: Users require READ permissions to copy data from storage locations. Only a superuser can add or retire storage locations.

I/O Privileges

Users need no special permissions to connect to and disconnect from a Vertica database.

To EXPORT TO and COPY FROM Vertica, the user must have:

l SELECT privileges on the source table
l USAGE privilege on the source table schema
l INSERT privileges for the destination table in the target database
l USAGE privilege on the destination table schema

To COPY FROM STDIN and use COPY LOCAL, a user must have INSERT privileges on the table and USAGE privilege on the schema.
Note: Only a superuser can COPY from file.

Comment Privileges

A comment lets you add, revise, or remove a textual message on a database object. You must be an object owner or superuser in order to COMMENT ON one of the following objects:

l COLUMN
l CONSTRAINT
l FUNCTION (including AGGREGATE and ANALYTIC)
l LIBRARY
l NODE
l PROJECTION
l SCHEMA
l SEQUENCE
l TABLE
l TRANSFORM FUNCTION
l VIEW

Other users must have VIEW privileges on an object to view its comments.

Transaction Privileges

No special permissions are required for the following database operations:

l COMMIT
l ROLLBACK
l RELEASE SAVEPOINT
l SAVEPOINT
Session Privileges

No special permissions are required for users to use the SHOW statement or any of the SET statements.

Tuning Privileges

To PROFILE a single SQL statement, or to return a query plan's execution strategy to standard output with the EXPLAIN command, users must have the same privileges that are required to run the same query without the PROFILE or EXPLAIN keyword.

Granting and Revoking Privileges

To grant or revoke a privilege using one of the SQL GRANT or REVOKE statements, the user must have the following permissions for the GRANT/REVOKE statement to succeed:

l Superuser or privilege WITH GRANT OPTION
l USAGE privilege on the schema
l Appropriate privileges on the object

The syntax for granting and revoking privileges is different for each database object, such as schema, database, table, view, sequence, procedure, function, resource pool, and so on.

Normally, a superuser first creates a user and then uses GRANT syntax to define the user's privileges or roles or both. For example, the following series of statements creates user Carol and grants Carol access to the apps database in the PUBLIC schema, and also lets Carol grant SELECT privileges to other users on the applog table:

=> CREATE USER Carol;
=> GRANT USAGE ON SCHEMA PUBLIC to Carol;
=> GRANT ALL ON DATABASE apps TO Carol;
=> GRANT SELECT ON applog TO Carol WITH GRANT OPTION;

See GRANT Statements and REVOKE Statements in the SQL Reference Manual.

About Superuser Privileges

A superuser (DBADMIN) is the automatically-created database user who has the same name as the Linux database administrator account and who can bypass all GRANT/REVOKE authorization, as well as supersede any user that has been granted the PSEUDOSUPERUSER role.
Note: Database superusers are not the same as a Linux superuser with (root) privilege and cannot have Linux superuser privilege.

A superuser can grant privileges on all database object types to other users, as well as grant privileges to roles. Users who have been granted the role will then gain the privilege as soon as they enable it.

Superusers may grant or revoke any object privilege on behalf of the object owner, which means a superuser can grant or revoke an object privilege if the object owner could have granted or revoked the same object privilege. A superuser may revoke a privilege that an object owner granted, as well as the reverse.

Since a superuser is acting on behalf of the object owner, the GRANTOR column of the V_CATALOG.GRANTS table displays the object owner rather than the superuser who issued the GRANT statement.

A superuser can also alter ownership of table and sequence objects.

See Also
l DBADMIN Role

About Schema Owner Privileges

By default, the schema owner has privileges to create objects within a schema. Additionally, the schema owner can drop any object in the schema, requiring no additional privilege on the object. The schema owner is typically the user who creates the schema.

Schema owners cannot access objects in the schema merely by owning it. Access to objects requires the appropriate privilege at the object level.

All other access to the schema and its objects must be explicitly granted to users or roles by a superuser or the schema owner, to prevent unauthorized users from accessing the schema and its objects.

See Schema Privileges.

About Object Owner Privileges

The database, along with every object in it, has an owner. The object owner is usually the person who created the object, although a superuser can alter ownership of objects such as tables and sequences.

Object owners must have the appropriate schema privilege to access, alter, rename, move, or drop any object they own, without any additional privileges.
An object owner can also:

l Grant privileges on their own objects to other users. The WITH GRANT OPTION clause specifies that a user can grant the permission to other users. For example, if user Bob creates a table, Bob can grant privileges on that table to users Ted, Alice, and so on.
l Grant privileges to roles. Users who are granted the role gain the privilege.

How to Grant Privileges

As described in Granting and Revoking Privileges, specific users grant privileges using the GRANT statement with or without the optional WITH GRANT OPTION, which allows the user to grant the same privileges to other users.

l A superuser can grant privileges on all object types to other users.
l A superuser or object owner can grant privileges to roles. Users who have been granted the role then gain the privilege.
l An object owner can grant privileges on the object to other users using the optional WITH GRANT OPTION clause.
l The user granting the privilege needs to have USAGE privilege on the schema and appropriate privileges on the object.

When a user grants an explicit list of privileges, such as GRANT INSERT, DELETE, REFERENCES ON applog TO Bob:

l The GRANT statement succeeds only if all the privileges are granted successfully. If any grant operation fails, the entire statement rolls back.
l Vertica returns an ERROR if the user does not have grant options for the privileges listed.

When a user grants ALL privileges, such as GRANT ALL ON applog TO Bob, the statement always succeeds. Vertica grants all the privileges on which the grantor has the WITH GRANT OPTION and skips those privileges without the optional WITH GRANT OPTION.
For example, if the user Bob has DELETE privileges with the optional grant option on the applog table, only DELETE privileges are granted to Bob, and the statement succeeds:

=> GRANT DELETE ON applog TO Bob WITH GRANT OPTION;
GRANT PRIVILEGE

For details, see the GRANT Statements in the SQL Reference Manual.

How to Revoke Privileges

In general, only the user who originally granted a privilege can revoke it using a REVOKE statement. That user must have superuser privilege or have the optional WITH GRANT OPTION on the privilege. The user also must have USAGE privilege on the schema and appropriate privileges on the object for the REVOKE statement to succeed.

In order to revoke a privilege, this privilege must previously have been granted to the specified grantee by this grantor. If Vertica finds that to be the case, the REVOKE statement removes the privilege (and the WITH GRANT OPTION privilege, if supplied) from the grantee. Otherwise, Vertica prints a NOTICE that the operation failed, as in the following example:

=> REVOKE SELECT ON applog FROM Bob;
NOTICE 0: Cannot revoke "SELECT" privilege(s) for relation "applog" that you did not grant to "Bob"
REVOKE PRIVILEGE

In order to revoke the grant option for a privilege, the grantor must have previously granted the grant option for the privilege to the specified grantee. Otherwise, Vertica prints a NOTICE. The following REVOKE statement removes the GRANT option only, but leaves the privilege intact:

=> GRANT INSERT on applog TO Bob WITH GRANT OPTION;
GRANT PRIVILEGE
=> REVOKE GRANT OPTION FOR INSERT ON applog FROM Bob;
REVOKE PRIVILEGE

When a user revokes an explicit list of privileges, such as REVOKE INSERT, DELETE, REFERENCES ON applog FROM Bob:
• The REVOKE statement succeeds only if all the privileges are revoked successfully. If any revoke operation fails, the entire statement rolls back.
• Vertica returns an error if the user does not have grant options for the privileges listed.
• Vertica returns a NOTICE when revoking privileges that this user had not previously been granted.

When a user revokes ALL privileges, such as REVOKE ALL ON applog FROM Bob, the statement always succeeds. Vertica revokes all the privileges on which the grantor has the optional WITH GRANT OPTION and skips those privileges for which the grantor does not.

For example, if the user Bob has DELETE privileges with the optional grant option on the applog table, only the grant option is revoked from Bob, and the statement succeeds without a NOTICE:

=> REVOKE GRANT OPTION FOR DELETE ON applog FROM Bob;

For details, see the REVOKE Statements in the SQL Reference Manual.

Privilege Ownership Chains

The ability to revoke privileges on objects can cascade throughout an organization. If the grant option is revoked from a user, the privileges that this user granted to other users are also revoked. If a privilege was granted to a user or role by multiple grantors, each original grantor must revoke the privilege for it to be completely revoked from the grantee. The only exception: a superuser can revoke privileges granted by an object owner, and vice versa.

In the following example, the SELECT privilege on table t1 is granted through a chain of users, from a superuser through User3.

• A superuser grants User1 CREATE privileges on the schema s1:

=> \c - dbadmin
You are now connected as user "dbadmin".
=> CREATE USER User1;
CREATE USER
=> CREATE USER User2;
CREATE USER
=> CREATE USER User3;
CREATE USER
=> CREATE SCHEMA s1;
CREATE SCHEMA
=> GRANT USAGE on SCHEMA s1 TO User1, User2, User3;
GRANT PRIVILEGE
=> CREATE ROLE reviewer;
CREATE ROLE
=> GRANT CREATE ON SCHEMA s1 TO User1;
GRANT PRIVILEGE

• User1 creates new table t1 within schema s1 and then grants SELECT WITH GRANT OPTION privilege on s1.t1 to User2:

=> \c - User1
You are now connected as user "User1".
=> CREATE TABLE s1.t1(id int, sourceID VARCHAR(8));
CREATE TABLE
=> GRANT SELECT on s1.t1 to User2 WITH GRANT OPTION;
GRANT PRIVILEGE

• User2 grants SELECT WITH GRANT OPTION privilege on s1.t1 to User3:

=> \c - User2
You are now connected as user "User2".
=> GRANT SELECT on s1.t1 to User3 WITH GRANT OPTION;
GRANT PRIVILEGE

• User3 grants SELECT privilege on s1.t1 to the reviewer role:

=> \c - User3
You are now connected as user "User3".
=> GRANT SELECT on s1.t1 to reviewer;
GRANT PRIVILEGE

Users cannot revoke privileges upstream in the chain. For example, User2 did not grant the CREATE privilege on schema s1 to User1, so when User2 runs the following REVOKE command, Vertica rolls back the command:

=> \c - User2
You are now connected as user "User2".
=> REVOKE CREATE ON SCHEMA s1 FROM User1;
ROLLBACK 0: "CREATE" privilege(s) for schema "s1" could not be revoked from "User1"

Users can revoke privileges indirectly from users who received privileges through a cascading chain, like the one shown in the example above. Here, users can use the CASCADE option to revoke privileges from all users "downstream" in the chain.
A superuser or User1 can use the CASCADE option to revoke the SELECT privilege on table s1.t1 from all users. For example, a superuser or User1 can execute the following statement to revoke the SELECT privilege from all users and roles within the chain:

=> \c - User1
You are now connected as user "User1".
=> REVOKE SELECT ON s1.t1 FROM User2 CASCADE;
REVOKE PRIVILEGE

When a superuser or User1 executes the above statement, the SELECT privilege on table s1.t1 is revoked from User2, User3, and the reviewer role. The GRANT privilege is also revoked from User2 and User3, which a superuser can verify by querying the V_CATALOG.GRANTS system table.

=> SELECT * FROM grants WHERE object_name = 's1' AND grantee ILIKE 'User%';
 grantor | privileges_description | object_schema | object_name | grantee
---------+------------------------+---------------+-------------+---------
 dbadmin | USAGE                  |               | s1          | User1
 dbadmin | USAGE                  |               | s1          | User2
 dbadmin | USAGE                  |               | s1          | User3
(3 rows)
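Before running a CASCADE revoke, it can also help to audit the full chain of grants on the object itself. The following query is a minimal sketch using the V_CATALOG.GRANTS columns shown above; the exact rows returned depend on the grants in your database:

=> SELECT grantor, grantee, privileges_description
   FROM grants
   WHERE object_schema = 's1' AND object_name = 't1';

Each row shows who granted which privilege to whom, so you can identify every "downstream" grantee that a CASCADE revoke would affect.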
Modifying Privileges

A superuser or object owner can use one of the ALTER statements to modify a privilege, such as changing a sequence owner or table owner. Reassignment to the new owner does not transfer grants from the original owner to the new owner; grants made by the original owner are dropped.

Changing Table Ownership

The ability to change table ownership is useful when moving a table from one schema to another. Ownership reassignment is also useful when a table owner leaves the company or changes job responsibilities. Because you can change the table owner, the table does not have to be completely rewritten, so you avoid a loss in productivity.

The syntax is:

ALTER TABLE [[db-name.]schema.]table-name OWNER TO new-owner-name

To alter table ownership, you must be either the table owner or a superuser.

A change in table ownership transfers just the ownership and not privileges; grants made by the original owner are dropped and all existing privileges on the table are revoked from the previous owner. However, altering the table owner transfers ownership of dependent sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer ownership of other referenced sequences. See ALTER SEQUENCE for details on transferring sequence ownership.

Notes

• Table privileges are separate from schema privileges; therefore, a table privilege change or table owner change does not result in any schema privilege change.
• Because projections define the physical representation of the table, Vertica does not require separate projection owners. The ability to create or drop projections is based on the privileges on the table on which the projection is anchored.
• During the alter operation, Vertica updates projections anchored on the table owned by the old owner to reflect the new owner. For pre-join projection operations, Vertica checks for privileges on the referenced table.

Example

In this example, user Bob connects to the database, looks up the tables, and transfers ownership of table t33 from himself to user Alice.
=> \c - Bob
You are now connected as user "Bob".
=> \d
 Schema |  Name  | Kind  |  Owner  | Comment
--------+--------+-------+---------+---------
 public | applog | table | dbadmin |
 public | t33    | table | Bob     |
(2 rows)
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE

Notice that when Bob looks up database tables again, he no longer sees table t33.

=> \d
        List of tables
 Schema |  Name  | Kind  |  Owner  | Comment
--------+--------+-------+---------+---------
 public | applog | table | dbadmin |
(1 row)

When user Alice connects to the database and looks up tables, she sees she is the owner of table t33.

=> \c - Alice
You are now connected as user "Alice".
=> \d
       List of tables
 Schema | Name | Kind  | Owner | Comment
--------+------+-------+-------+---------
 public | t33  | table | Alice |
(1 row)

Either Alice or a superuser can transfer table ownership back to Bob. In the following case a superuser performs the transfer.

=> \c - dbadmin
You are now connected as user "dbadmin".
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> \d
         List of tables
 Schema |   Name   | Kind  |  Owner  | Comment
--------+----------+-------+---------+---------
 public | applog   | table | dbadmin |
 public | comments | table | dbadmin |
 public | t33      | table | Bob     |
 s1     | t1       | table | User1   |
(4 rows)

You can also query the V_CATALOG.TABLES system table to view table and owner information. Note that a change in ownership does not change the table ID. In the following series of commands, the superuser changes table ownership back to Alice and queries the TABLES system table.
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273724576 | Alice
(5 rows)

Now the superuser changes table ownership back to Bob and queries the TABLES table again. Nothing changes but the owner_name value, from Alice to Bob.

=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273714428 | Bob
(5 rows)

Table Reassignment with Sequences

Altering the table owner transfers ownership of only associated IDENTITY/AUTO-INCREMENT sequences, but not other referenced sequences. For example, in the following series of commands, ownership of sequence s1 does not change:

=> CREATE USER u1;
CREATE USER
=> CREATE USER u2;
CREATE USER
=> CREATE SEQUENCE s1 MINVALUE 10 INCREMENT BY 2;
CREATE SEQUENCE
=> CREATE TABLE t1 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> CREATE TABLE t2 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)
=> ALTER TABLE t1 OWNER TO u1;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)
=> ALTER TABLE t2 OWNER TO u2;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)

See Also

• Changing Sequence Ownership

Changing Sequence Ownership

The ALTER SEQUENCE command lets you change the attributes of an existing sequence. All changes take effect immediately, within the same session. Any parameters not set during an ALTER SEQUENCE statement retain their prior settings.

If you need to change sequence ownership, such as when an employee who owns a sequence leaves the company, you can do so with the following ALTER SEQUENCE syntax:

=> ALTER SEQUENCE sequence-name OWNER TO new-owner-name;

This operation immediately reassigns the sequence from the current owner to the specified new owner. Only the sequence owner or a superuser can change ownership, and reassignment does not transfer grants from the original owner to the new owner; grants made by the original owner are dropped.

Note: Changing a table's owner transfers ownership of dependent sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer ownership of other referenced sequences. See Changing Table Ownership.

Example

The following example reassigns sequence ownership from the current owner to user Bob:

=> ALTER SEQUENCE sequential OWNER TO Bob;

See ALTER SEQUENCE in the SQL Reference Manual for details.
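To confirm that the reassignment took effect, you can query the SEQUENCES system table, as the earlier examples in this section do. A minimal sketch, assuming the sequence named sequential from the example above; the output should resemble the following:

=> SELECT sequence_name, owner_name FROM sequences WHERE sequence_name = 'sequential';
 sequence_name | owner_name
---------------+------------
 sequential    | Bob
(1 row)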
Viewing Privileges Granted on Objects

Vertica logs information about privileges granted on various objects, including the grantor and grantee, in the V_CATALOG.GRANTS system table. The order of columns in the table corresponds to the order in which they appear in the GRANT command. An asterisk in the output means the privilege was granted WITH GRANT OPTION.

The following command queries the GRANTS system table:

=> SELECT * FROM grants ORDER BY grantor, grantee;
 grantor |             privileges_description              | object_schema | object_name | grantee
---------+-------------------------------------------------+---------------+-------------+-----------
 Bob     |                                                 |               | commentor   | Alice
 dbadmin | CREATE                                          |               | schema2     | Bob
 dbadmin |                                                 |               | commentor   | Bob
 dbadmin |                                                 |               | commentor   | Bob
 dbadmin |                                                 |               | logadmin    | Bob
 dbadmin | USAGE                                           |               | general     | Bob
 dbadmin | INSERT, UPDATE, DELETE, REFERENCES              | public        | applog      | Bob
 dbadmin |                                                 |               | logadmin    | Ted
 dbadmin | USAGE                                           |               | general     | Ted
 dbadmin | USAGE                                           |               | general     | Sue
 dbadmin | CREATE, CREATE TEMP                             |               | vmart       | Sue
 dbadmin | USAGE                                           |               | public      | Sue
 dbadmin | SELECT*                                         | public        | applog      | Sue
 dbadmin | USAGE                                           |               | general     | Alice
 dbadmin | INSERT, SELECT                                  | public        | comments    | commentor
 dbadmin | INSERT, SELECT                                  | public        | applog      | commentor
 dbadmin |                                                 |               | logwriter   | logadmin
 dbadmin |                                                 |               | logreader   | logadmin
 dbadmin | DELETE                                          | public        | applog      | logadmin
 dbadmin | SELECT                                          | public        | applog      | logreader
 dbadmin | INSERT                                          | public        | applog      | logwriter
 dbadmin | USAGE                                           |               | v_internal  | public
 dbadmin | CREATE TEMP                                     |               | vmart       | public
 dbadmin | USAGE                                           |               | public      | public
 dbadmin | USAGE                                           |               | v_catalog   | public
 dbadmin | USAGE                                           |               | v_monitor   | public
 dbadmin | CREATE*, CREATE TEMP*                           |               | vmart       | dbadmin
 dbadmin | USAGE*, CREATE*                                 |               | schema2     | dbadmin
 dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public        | comments    | dbadmin
 dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public        | applog      | dbadmin
(30 rows)

To quickly find all of the privileges that have been granted to all users on the schema named myschema, run the following statement:

=> SELECT grantee, privileges_description FROM GRANTS WHERE object_name='myschema';
 grantee | privileges_description
---------+------------------------
 Bob     | USAGE, CREATE
 Alice   | CREATE
(2 rows)

Note that the vsql commands \dp and \z both return information similar to GRANTS:

=> \dp
                          Access privileges for database "apps"
  Grantee  | Grantor |                   Privileges                    | Schema |    Name
-----------+---------+-------------------------------------------------+--------+------------
 public    | dbadmin | USAGE                                           |        | v_internal
 public    | dbadmin | USAGE                                           |        | v_catalog
 public    | dbadmin | USAGE                                           |        | v_monitor
 logadmin  | dbadmin |                                                 |        | logreader
 logadmin  | dbadmin |                                                 |        | logwriter
 Fred      | dbadmin | USAGE                                           |        | general
 Fred      | dbadmin |                                                 |        | logadmin
 Bob       | dbadmin | USAGE                                           |        | general
 dbadmin   | dbadmin | USAGE*, CREATE*                                 |        | schema2
 Bob       | dbadmin | CREATE                                          |        | schema2
 Sue       | dbadmin | USAGE                                           |        | general
 public    | dbadmin | USAGE                                           |        | public
 Sue       | dbadmin | USAGE                                           |        | public
 public    | dbadmin | CREATE TEMP                                     |        | appdat
 dbadmin   | dbadmin | CREATE*, CREATE TEMP*                           |        | appdat
 Sue       | dbadmin | CREATE, CREATE TEMP                             |        | appdat
 dbadmin   | dbadmin | INSERT*, SELECT*, UPDATE*, DELETE*, REFERENCES* | public | applog
 logreader | dbadmin | SELECT                                          | public | applog
 logwriter | dbadmin | INSERT                                          | public | applog
 logadmin  | dbadmin | DELETE                                          | public | applog
 Sue       | dbadmin | SELECT*                                         | public | applog
(22 rows)

See GRANT Statements in the SQL Reference Manual.

Access Policies

You can create the following access policy types to restrict access to sensitive information to only those users authorized to view it:

• Column Access Policy
• Row Access Policy

Important: If you have a table with both a row-level access policy and a column-level access policy, Vertica filters with the row-level access policy first. Then Vertica uses the column-level access policy to filter the columns.

Use Cases

Column Access Policy Use Case

Base a column access policy on a user's role and the privileges granted to that role. For example, in a healthcare organization, customer support representatives and account managers have access to the same customer table. The table contains the column SSN for storing customer Social Security numbers, to which customer support representatives have only partial access: they can view only the last four digits. The account manager, however, must be able to view entire Social Security numbers. Therefore, the manager role has privileges to view all nine digits of a Social Security number.

When creating a column access policy, use expressions to specify exactly what different users or roles can access within the column. In this case, a manager can access the entire SSN column, while customer support representatives can only access the last four digits:

=> CREATE ACCESS POLICY ON schema.customers_table FOR COLUMN SSN
   CASE WHEN ENABLED_ROLE('manager') THEN SSN
   ELSE SUBSTR(SSN, 8, 4)
   END ENABLE;

Row Access Policy Use Case

You can also create a row access policy on the same table. For example, you can modify access to a customer table so a manager can view data in all rows, while a broker can see a row only if the customer is associated with that broker:

=> SELECT * FROM customers_table;
 custID | password |     ssn
--------+----------+-------------
 1      | secret   | 12345678901
 2      | secret   | 12345678902
 3      | secret   | 12345678903
(3 rows)

Each customer in the customers_table has an assigned broker:
  • 545. Use Cases Column Access Policy Use Case Base a column access policy on a user's role and the privileges granted to that role. For example, in a healthcare organization, customer support representatives and account managers have access to the same customer table. The table contains the column SSN for storing customer Social Security numbers. to which customer support representatives have only partial access,to view the last four digits. The account manager, however, must be able to view entire Social Security numbers. Therefore, the manager role has privileges to view all nine digits of the social security numbers. When creating a column access policy, use expressions to specify exactly what different users or roles can access within the column. In this case, a manager can access the entire SSN column, while customer support representatives can only access the last four digits: => CREATE ACCESS POLICY ON schema.customers_table FOR COLUMN SSN CASE WHEN ENABLED_ROLE('manager') THEN SSN else substr(SSN, 8, 4) END ENABLE; Row Access Policy Use Case You can also create a row access policy on the same table. For example, you can modify access to a customer table so a manager can view data in all rows. However, a broker can see a row only if the customer is associated with that broker: => select * from customers_table; custID | password | ssn -------+----------+--------- 1 | secret | 12345678901 2 | secret | 12345678902 3 | secret | 12345678903 (3 rows) Each customer in the customers_table has an assigned broker: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 545 of 5309
  • 546. => select * from broker_info; broker | custID --------+--------- u1 | 1 u2 | 2 u3 | 3 Create the access policy to allow a manager to see all data in all rows. Limit a broker's view to only those customers to which the broker is assigned: => CREATE ACCESS POLICY ON schema.customers_table FOR rows WHERE ENABLED_ROLE('manager') or (ENABLED_ROLE('broker') AND customers_table.custID in (SELECT broker_info.custID FROM broker_ info WHERE broker = CURRENT_USER())) ENABLE; Access Policy Creation Workflow You can create access policies for any table type, columnar, external, or flex. You can also create access policies on any column type, including joins. If no users or roles are already created, you must create them before creating an access policy: l Create a User l Create a Role l GRANT (Schema) l GRANT (Table) l Grant a user access to the role l The user enables the role with the SET ROLE statement (unless the administration user assigned a default role to the user) l Create the access policy with the CREATE ACCESS POLICY statement. Working With Access Policies This section describes areas that may affect how you use access policies. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 546 of 5309
Working With Access Policies

This section describes areas that may affect how you use access policies.

Performing Operations

Having row and column access policies enabled on a table may affect the behavior when you attempt to perform the following DML operations:

• Insert
• Update
• Delete
• Merge
• Copy

Row-Level Access Behavior

On tables where a row access policy is enabled, you can only perform DML operations when the condition in the row access policy evaluates to TRUE. For example:

Table1 appears as follows:

 A | B
---+---
 1 | 1
 2 | 2
 3 | 3

Create the following row access policy on Table1:

=> CREATE ACCESS POLICY on table1 for ROWS
   WHERE enabled_role('manager')
   OR A<2
   ENABLE;

With this policy enabled, the following behavior exists for users who want to perform DML operations:

• A user with the manager role can perform DML on all rows in the table, because the WHERE clause in the policy evaluates to TRUE.
• Users with non-manager roles can only perform a SELECT that returns data in column A with a value of less than two. If the access policy has to read the data in the table to confirm a condition, it does not allow DML operations, as the sketch below illustrates.
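A minimal session sketch of this behavior, assuming Table1 and the policy above; the result descriptions are paraphrased rather than verbatim server messages:

-- As a user who has enabled the manager role:
=> SET ROLE manager;
=> UPDATE table1 SET B = 0 WHERE A = 3;
-- succeeds: the policy condition is TRUE for a manager

-- As a user without the manager role:
=> SELECT * FROM table1;
 A | B
---+---
 1 | 1
(1 row)
=> UPDATE table1 SET B = 0 WHERE A = 1;
-- rolled back: the policy must read column A to confirm the A<2 condition,
-- so DML is not allowed for this user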
Column-Level Access Behavior

On tables where a column access policy is enabled, you can perform DML operations only if you can view the entire column. For example:

Table1 appears as follows:

 A | B
---+---
 1 | 1
 2 | 2
 3 | 3

Create the following column access policy on Table1:

=> CREATE ACCESS POLICY on Table1 FOR column A NULL::int enable;

In this case, users cannot perform DML operations on column A.

Important: Users who can access all the rows and columns in a table with an access policy enabled can perform DML operations. Therefore, when you create an access policy, construct it so that all row and column data is accessible by at least one user. This allows at least one user to perform any DML that may be required. Otherwise, you can temporarily disable the access policy to perform DML.

Schema and Table Privileges

Only dbadmin users can create access policies. If you want a user to be able to use access policies, you must first assign that user the appropriate privileges:

• Grant schema or table privileges to a table non-owner to allow that user to use the access policy.
• Revoke schema or table privileges to prohibit the user from using the access policy.

This example shows how you can create an access policy without the user being granted privileges on the public schema:

=> CREATE ACCESS POLICY ON public.customers_table FOR COLUMN SSN
   CASE WHEN ENABLED_ROLE('operator') THEN SUBSTR(SSN, 8, 4)
   END ENABLE;
Enable and Disable Access Policy Creation

Access policies are enabled by default for all tables in the database. To disable and enable the creation of new access policies at the database level, use the ALTER DATABASE statement.

Disable creation of new access policies:

=> ALTER DATABASE dbname SET EnableAccessPolicy=0;

Enable creation of new access policies:

=> ALTER DATABASE dbname SET EnableAccessPolicy=1;

Limitations on Creating Access Policies with Projections

You can create access policies on columns in tables that are part of a projection. However, you cannot create an access policy on an input table for the following projections:

• Top-K projections
• Aggregate projections
• Projections with expressions
• Pre-join projections

Sometimes, a table already has an access policy and is part of a projection. In such cases, if the Vertica optimizer cannot fold (or compress) the query, the query is blocked.

Query Optimization Considerations

When using access policies, be aware of the following potential behaviors, and design tables optimally.

Design Tables That All Authorized Users Can Access

When Database Designer creates projections for a given table, it takes into account the access policies that apply to the current user. The set of projections that Database Designer produces for the table is optimized for that user's access privileges, and for other users with similar access privileges. However, these projections might be less than optimal for users with different access privileges. These differences might have some effect on how efficiently Vertica processes queries from those users.
Therefore, when you evaluate projection designs for a table using Database Designer, design the table so that all authorized users have optimal access.

Avoid Performance Issues Caused by Dynamic Rewrite

To enforce row-level access policies, the system dynamically rewrites user queries. Therefore, query performance may be affected by how row-level access policies are written. For example, referring to the preceding access policy use cases, enable both the row and column access policies on customers_table and run the following query:

=> SELECT * from customers_table;

Vertica rewrites this query plan to:

=> SELECT * from (
     SELECT custID, password,
            CASE WHEN enabled_role('manager') THEN SSN
                 ELSE substr(SSN, 8, 4)
            END AS SSN
     FROM customers_table
     WHERE enabled_role('broker')
       AND customers_table.custID IN
           (SELECT broker_info.custID FROM broker_info WHERE broker = current_user())
   ) customers_table;

Column Access Policy

Use the CREATE ACCESS POLICY statement to create a column access policy for a specific column or columns in a table. Creating an access policy depends on the expressions specified when creating the policy, and also on the following:

• Viewing a User's Role
• Granting Privileges to Roles

Example

Run the following SQL command:

=> SELECT * FROM Table1;

Table1 appears as follows:
 A |   B
---+-------
 1 | one
 2 | two
 3 | three
 4 | four

Create the following column access policy:

=> CREATE ACCESS POLICY on Table1 FOR column A NULL::int enable;

Re-run the SQL command:

=> SELECT * FROM Table1;

The following is returned:

 A |   B
---+-------
   | one
   | two
   | three
   | four

Note that no values appear in column A because the access policy prevents the return of this data (NULL::int).

Creating Column Access Policies

Creating a column access policy allows different users to run the same query and receive different results. For example, you can create an access policy restricting access to a column of bank account numbers: you can specify that a user with the employee role cannot access this information, while a user with the manager role can. Conditions specified in the access policy determine whether the user can see data restricted by the policy.

This example shows how you can specify that the manager role can view the entire Social Security number, while the operator role can only view the last four digits. The first five digits are masked for the operator role (THEN SUBSTR(SSN, 8, 4)). The 8 indicates that the operator sees data starting at the eighth character (such as 123-45-6789).
=> CREATE ACCESS POLICY ON customers_table FOR COLUMN SSN
   CASE
   WHEN ENABLED_ROLE('manager') THEN SSN
   WHEN ENABLED_ROLE('operator') THEN SUBSTR(SSN, 8, 4)
   ELSE NULL
   END ENABLE;

Access Policy Limitations

When you use column access policies, be aware of the following limitations:

• You cannot use any of the following in an access policy expression:
  • Aggregate functions
  • Subqueries
  • Analytic functions
  • UDTs
• If the query cannot be folded by the Vertica optimizer, all operations other than SELECT are blocked, and the following error message appears:

  ERROR 0: Unable to INSERT: "Access denied due to active access policy on table <tablename> for column <columnname>"

  Note: Folding a query refers to the act of replacing deterministic expressions, involving only constants, with their computed values.

• You cannot create a column access policy on temporary tables.
• Avoid using a column access policy on a flex table. If you create a column access policy on a flex table, the following warning appears:

  WARNING 0: Column Access Policies on flex tables may not be completely secure

Examples

The following examples show how to create a column access policy for various situations.

Create Access Policy in Public Schema for Column in Customer Table

=> CREATE ACCESS POLICY on public.customer FOR COLUMN cid length('xxxxx') enable;
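With this policy enabled, every row's cid value is replaced by the constant expression length('xxxxx'), which evaluates to 5. A hypothetical sketch, assuming the same two-row customer table used in the substitution example below:

=> SELECT * FROM public.customer;
 cid | dist_code
-----+-----------
   5 | 2
   5 | 10
(2 rows)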
Use Expression to Further Specify Data Access and Restrictions

In this example, a user with a supervisor role can see data from the deal_size column in the vendor_dimension table. However, a user assigned an employee role cannot.

=> CREATE ACCESS POLICY ON vendor_dimension FOR COLUMN deal_size
   CASE
   WHEN ENABLED_ROLE('supervisor') THEN deal_size
   WHEN ENABLED_ROLE('employee') THEN NULL
   END ENABLE;

Substitute Specific Data for Actual Data in Column

In this example, the value 1000 appears rather than the actual column data:

=> CREATE ACCESS POLICY on public.customer FOR COLUMN cid 1000 enable;
=> SELECT * FROM customer;
 cid  | dist_code
------+-----------
 1000 | 2
 1000 | 10
(2 rows)

See Also

• CREATE ACCESS POLICY
• ALTER ACCESS POLICY
• DROP ACCESS POLICY

Enable or Disable Column Access Policy

If you have dbadmin privileges, you can enable and disable an individual access policy in a table, as the following examples show.

Enable column access policy:

=> ALTER ACCESS POLICY on customer FOR column customer_key enable;

Disable column access policy:

=> ALTER ACCESS POLICY on customer FOR column customer_key disable;

Row Access Policy

Use the CREATE ACCESS POLICY statement to create a row access policy for rows in a table. You must use a WHERE clause to set the access policy's condition.

Example

Run the following SQL statement:
=> SELECT * FROM customers_table;

The customers_table appears as follows:

 custID | password |    SSN
--------+----------+-----------
 1      | secret   | 123456789
 2      | secret   | 123456780
 3      | secret   | 123456781
(3 rows)

Run the following SQL statement:

=> SELECT * FROM broker_info;

The broker_info table shows that each customer has an assigned broker:

 broker | custID
--------+--------
 user1  | 1
 user2  | 2
 user3  | 3
(3 rows)

Create the following access policy that only allows brokers to see the customers with which they are associated:

=> CREATE ACCESS POLICY on customers_table for rows
   WHERE ENABLED_ROLE('manager')
   OR (ENABLED_ROLE('broker')
       AND customers_table.custID IN
           (SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER()))
   ENABLE;

As user1, run the following SQL command:

user1=> SELECT * FROM customers_table;

The following is returned, because user1 is associated with custID 1:

 custID | password |    SSN
--------+----------+-----------
 1      | secret   | 123456789
(1 row)
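By the same logic, a user who has enabled the manager role satisfies the first condition in the WHERE clause and sees every row. A minimal sketch, assuming a user granted the manager role:

=> SET ROLE manager;
=> SELECT * FROM customers_table;
 custID | password |    SSN
--------+----------+-----------
 1      | secret   | 123456789
 2      | secret   | 123456780
 3      | secret   | 123456781
(3 rows)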
Creating Row Access Policies

Creating a row access policy determines what rows a user can access during a query. Row access policies include a WHERE clause that prompts the query to return only those rows where the condition is true. For example, a user with a broker role should only be able to access customer information for which the user is a broker. You can write a predicate for this situation as follows:

WHERE ENABLED_ROLE('broker')
  AND customers_table.custID IN
      (SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER())

You can use a row access policy to enforce this restriction. The following example shows how you can create a row access policy. This policy limits a user with a broker role to accessing information for customers whose custID in the customers_table matches a custID in the broker_info table.

=> CREATE ACCESS POLICY on customers_table for rows
   WHERE ENABLED_ROLE('broker')
   AND customers_table.custID IN
       (SELECT broker_info.custID FROM broker_info WHERE broker = CURRENT_USER())
   enable;

Row Access Policy Limitations

Be aware of the following limitations when using row access policies:

• You can only have one row access policy per table. If you need to add more conditions later, place them in a single WHERE predicate and use ALTER ACCESS POLICY to enable the new condition.
• You cannot use row access policies on:
  • Tables with pre-join projections
  • Tables with aggregate projections
  • Temporary tables
  • System tables
    If you attempt to create a row access policy on a system table, the following message appears:

    => ROLLBACK 0: Access policy cannot be created on system table <system table name>

  • Views
• When a row access policy exists on a table, you cannot create directed queries on that table.

Examples

The following example shows how you can create a row access policy.

Create access policy for specific rows in the customer table:

=> CREATE ACCESS POLICY on customer FOR ROWS where cust_id > 3 enable;

See Also

• CREATE ACCESS POLICY
• ALTER ACCESS POLICY
• DROP ACCESS POLICY

Enable or Disable Row Access Policy

If you have dbadmin privileges, you can enable and disable individual row access policies in a table, as the following examples show.

Enable row access policy:

=> ALTER ACCESS POLICY on customer FOR rows enable;

Disable row access policy:

=> ALTER ACCESS POLICY on customer FOR rows disable;
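To remove a policy entirely rather than disable it, use DROP ACCESS POLICY, referenced in the See Also list above. A minimal sketch, assuming the same customer table; consult the DROP ACCESS POLICY reference page for the full syntax:

=> DROP ACCESS POLICY ON customer FOR ROWS;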
About Database Roles

To make managing permissions easier, use roles. A role is a collection of privileges that a superuser can grant to (or revoke from) one or more users or other roles. Using roles avoids having to manually grant sets of privileges user by user. For example, several users might be assigned to the administrator role. You can grant or revoke privileges to or from the administrator role, and all users with access to that role are affected by the change.

Note: Users must first enable a role before they gain all of the privileges that have been granted to it. See Enabling Roles.

Role Hierarchies

You can also use roles to build hierarchies of roles. For example, you can create an administrator role that has the privileges granted to non-administrator roles, as well as privileges granted directly to it. See also Role Hierarchy.

Roles do not supersede manually-granted privileges, so privileges directly assigned to a user are not altered by roles. Roles just give additional privileges to the user.

Creating and Using a Role

Using a role follows this general flow:

1. A superuser creates a role using the CREATE ROLE statement.
2. A superuser or object owner grants privileges to the role using one of the GRANT statements.
3. A superuser or users with administrator access to the role grant users and other roles access to the role.
4. Users granted access to the role use the SET ROLE command to enable that role and gain the role's privileges.

You can do steps 2 and 3 in any order. However, granting access to a role means little until the role has privileges granted to it.

Tip: You can query the V_CATALOG system tables ROLES, GRANTS, and USERS to see any directly-assigned roles; however, these tables do not indicate whether a role is available to a user when roles could be available through other roles (indirectly). See the HAS_ROLE() function for additional information.
Roles on Management Console

When users sign in to the Management Console (MC), what they can view or do is governed by MC roles. For details, see About MC Users and About MC Privileges and Roles.
Types of Database Roles

Vertica has the following predefined roles:

• PUBLIC
• PSEUDOSUPERUSER
• DBADMIN
• DBDUSER
• SYSMONITOR

Predefined roles cannot be dropped or renamed. Other roles may not be granted to (or revoked from) predefined roles, except to/from PUBLIC; however, predefined roles may be granted to other roles, users, or both. Individual privileges may be granted to, and revoked from, predefined roles. See the SQL Reference Manual for all of the GRANT and REVOKE statements.

DBADMIN Role

Every database has the special DBADMIN role. A superuser (or someone with the PSEUDOSUPERUSER role) can grant this role to, or revoke this role from, any user or role.

Users who enable the DBADMIN role gain these privileges:

• Create or drop users
• Create or drop schemas
• Create or drop roles
• Grant roles to other users
• View all system tables
• View and terminate user sessions
• Access all data created by any user

The DBADMIN role does NOT allow users to:
• Start and stop a database
• Change DBADMIN privileges
• Set configuration parameters

Note: A user with the DBADMIN role must have the ADMIN OPTION enabled to be able to grant the DBADMIN role to another user. A DBADMIN user cannot grant the PSEUDOSUPERUSER role to anyone. For more information, see GRANT (Role).

You can assign additional privileges to the DBADMIN role, but you cannot assign any additional roles; for example, the following is not allowed:

=> CREATE ROLE appviewer;
CREATE ROLE
=> GRANT appviewer TO dbadmin;
ROLLBACK 2347: Cannot alter predefined role "dbadmin"

You can, however, grant the DBADMIN role to other roles to augment a set of privileges. See Role Hierarchy for more information.

View a List of Database Superusers

To see who is a superuser, run the vsql \du meta-command. In this example, only dbadmin is a superuser.

=> \du
      List of users
 User name | Is Superuser
-----------+--------------
 dbadmin   | t
 Fred      | f
 Bob       | f
 Sue       | f
 Alice     | f
 User1     | f
 User2     | f
 User3     | f
 u1        | f
 u2        | f
(10 rows)

See Also

Database Administration User

DBDUSER Role

The special DBDUSER role is a predefined role that must be explicitly granted by a superuser. It allows non-DBADMIN users to access Database Designer using command-line functions.
Users with the DBDUSER role cannot access Database Designer using the Administration Tools; only DBADMIN users can run the Administration Tools. You cannot assign any additional privileges to the DBDUSER role, but you can grant the DBDUSER role to other roles to augment a set of privileges.

Once you have been granted the DBDUSER role, you must enable it before you can run Database Designer using command-line functions. For more information, see About Running Database Designer Programmatically.

Important: When you create a DBADMIN user or grant the DBDUSER role, make sure to associate a resource pool with that user to manage resources during Database Designer runs. Multiple users can run Database Designer concurrently without interfering with each other or using up all the cluster resources. When a user runs Database Designer, either using the Administration Tools or programmatically, execution is mostly contained by the user's resource pool, but may spill over into some system resource pools for less-intensive tasks.

PSEUDOSUPERUSER Role

The special PSEUDOSUPERUSER role is automatically created in each database. A superuser (or someone with the PSEUDOSUPERUSER role) can grant and revoke this role. The PSEUDOSUPERUSER cannot revoke or change any superuser privileges.

Users with the PSEUDOSUPERUSER role are entitled to complete administrative privileges, including the ability to:

• Create schemas
• Create and grant privileges to roles
• Bypass all GRANT/REVOKE authorization
• Set user accounts' passwords
• Lock and unlock user accounts
• Create or drop a UDF library
• Create or drop a UDF function
• Create or drop an external procedure
• Add or edit comments on nodes
• Create or drop password profiles

You cannot revoke any of these privileges from a PSEUDOSUPERUSER. You can assign additional privileges to the PSEUDOSUPERUSER role, but you cannot assign any additional roles; for example, the following is not allowed:

=> CREATE ROLE appviewer;
CREATE ROLE
=> GRANT appviewer TO pseudosuperuser;
ROLLBACK 2347: Cannot alter predefined role "pseudosuperuser"

PUBLIC Role

By default, every database has the special PUBLIC role. Vertica grants this role to each user automatically, and it is automatically enabled. You grant privileges to this role that every user should have by default. You can also grant access to roles to PUBLIC, which allows any user to access the role using the SET ROLE statement.

Note: The PUBLIC role can never be dropped, nor can it be revoked from users or roles.

Privileges cannot be granted to the PUBLIC role using the WITH GRANT OPTION:

=> CREATE TABLE t1(a int);
CREATE TABLE
=> GRANT SELECT on t1 to PUBLIC with grant option;
ROLLBACK 3484: Grant option for a privilege cannot be granted to "public"

For more information, see How to Grant Privileges.

Example

In the following example, if the superuser hadn't granted INSERT privileges on the table publicdata to the PUBLIC group, the INSERT statement executed by user bob would fail:

=> CREATE TABLE publicdata (a INT, b VARCHAR);
CREATE TABLE
=> GRANT INSERT, SELECT ON publicdata TO PUBLIC;
GRANT PRIVILEGE
=> CREATE PROJECTION publicdataproj AS (SELECT * FROM publicdata);
CREATE PROJECTION
dbadmin=> \c - bob
You are now connected as user "bob".
=> INSERT INTO publicdata VALUES (10, 'Hello World');
 OUTPUT
--------
      1
(1 row)

See Also

PUBLIC User

SYSMONITOR Role

An organization's database administrator may have many responsibilities beyond maintaining Vertica as a DBADMIN user. In this case, as the DBADMIN you may want to delegate some Vertica administrative tasks to another Vertica user. The DBADMIN can assign a delegate the SYSMONITOR role to grant access to specific monitoring utilities without granting full DBADMIN access. Granting this role allows the DBADMIN user to delegate administrative tasks without compromising security or exposing sensitive information.

Grant a SYSMONITOR Role

To grant a user or role the SYSMONITOR role, you must be one of the following:

• a DBADMIN user
• a user assigned the SYSMONITOR role who has the ADMIN OPTION

Use the GRANT (Role) SQL statement to assign a user the SYSMONITOR role. This example shows how to grant the SYSMONITOR role to user1 and includes administration privileges by using the WITH ADMIN OPTION parameter. The ADMIN OPTION grants the SYSMONITOR role administrative privileges:

=> GRANT SYSMONITOR TO user1 WITH ADMIN OPTION;

This example shows how to revoke the ADMIN OPTION from the SYSMONITOR role for user1:

=> REVOKE ADMIN OPTION for SYSMONITOR FROM user1;
Use CASCADE to revoke ADMIN OPTION privileges for all users assigned the SYSMONITOR role:

=> REVOKE ADMIN OPTION for SYSMONITOR FROM PUBLIC CASCADE;

Example

This example shows how to:

• Create a user
• Create a role
• Grant SYSMONITOR privileges to the new role
• Grant the role to the user

=> CREATE USER user1;
=> CREATE ROLE monitor;
=> GRANT SYSMONITOR to monitor;
=> GRANT monitor to user1;

Assign SYSMONITOR Privileges

This example uses the user and role created in the Grant a SYSMONITOR Role example and shows how to:

• Create a table called personal_data
• Log in as user1
• Enable the monitor role for user1. (You already granted the monitor role SYSMONITOR privileges in the Grant a SYSMONITOR Role example.)
• Run a SELECT statement as user1

The results of the operations are based on the privileges already granted to user1.

=> CREATE TABLE personal_data (SSN varchar (256));
=> \c - user1
user1=> SET ROLE monitor;
user1=> SELECT COUNT(*) FROM TABLES;
 COUNT
-------
     1
(1 row)

Because you assigned the SYSMONITOR role, user1 can see the number of rows in the TABLES system table. In this simple example, there is only one table (personal_data) in the database, so the SELECT COUNT returns one row. Under real conditions, the SYSMONITOR role would see all the tables in the database.

Check Whether a Table Is Accessible by SYSMONITOR

Use the following command to check whether a system table can be accessed by a user assigned the SYSMONITOR role:

=> SELECT table_name, is_monitorable FROM system_tables WHERE table_name='<table_name>';

Example

This example checks whether the current_session system table is accessible by the SYSMONITOR role:

=> SELECT table_name, is_monitorable FROM system_tables WHERE table_name='current_session';
   table_name    | is_monitorable
-----------------+----------------
 current_session | t

The t in the is_monitorable column indicates that the current_session system table is accessible by the SYSMONITOR role.

Default Roles for Database Users

By default, no roles (other than the default PUBLIC role) are enabled at the start of a user session.

=> SHOW ENABLED_ROLES;
     name      | setting
---------------+---------
 enabled roles |
(1 row)

A superuser can set one or more default roles for a user, which are automatically enabled at the start of the user's session. Setting a default role is a good idea if users normally rely on the privileges granted by one or more roles to carry out the majority of their tasks. To set a default role, use the DEFAULT ROLE parameter of the ALTER USER statement as superuser:
=> \c vmart apps
You are now connected to database "apps" as user "dbadmin".
=> ALTER USER Bob DEFAULT ROLE logadmin;
ALTER USER
=> \c - Bob
You are now connected as user "Bob".
=> SHOW ENABLED_ROLES;
     name      | setting
---------------+----------
 enabled roles | logadmin
(1 row)

Notes

• Only roles that the user already has access to can be made default.
• Unlike granting a role, setting a default role or roles overwrites any previously-set defaults.
• To clear any default roles for a user, use the keyword NONE as the role name in the DEFAULT ROLE argument.
• Default roles only take effect at the start of a user session. They do not affect the roles enabled in the user's current session.
• Avoid giving users default roles that have administrative or destructive privileges (the PSEUDOSUPERUSER role or DROP privileges, for example). By forcing users to explicitly enable these privileges, you can help prevent accidental data loss.

Using Database Roles

There are several steps to using roles:

1. A superuser creates a role using the CREATE ROLE statement.
2. A superuser or object owner grants privileges to the role.
3. A superuser or users with administrator access to the role grant users and other roles access to the role.
4. Users granted access to the role run the SET ROLE command to make that role active and gain the role's privileges.

You can do steps 2 and 3 in any order. However, granting access to a role means little until the role has privileges granted to it. The sketch below condenses these steps.
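A minimal sketch of the four steps in one session; the role name reporter and the SELECT grant on the applog table are illustrative assumptions:

-- Step 1: a superuser creates the role
=> CREATE ROLE reporter;
-- Step 2: grant privileges to the role
=> GRANT SELECT ON applog TO reporter;
-- Step 3: grant a user access to the role
=> GRANT reporter TO Bob;
-- Step 4: Bob enables the role in his session
=> \c - Bob
You are now connected as user "Bob".
=> SET ROLE reporter;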
Tip: Query the system tables ROLES, GRANTS, and USERS to see any directly-assigned roles. Because these tables do not indicate whether a role is available to a user when roles could be available through other roles (indirectly), see the HAS_ROLE() function for additional information.

See Also

• About MC Privileges and Roles

Role Hierarchy

In addition to granting roles to users, you can also grant roles to other roles. This lets you build hierarchies of roles, with more-privileged roles (an administrator, for example) being assigned all of the privileges of lesser-privileged roles (a user of a particular application, for example), in addition to the privileges you assign to them directly. By organizing your roles this way, any privilege you add to the application role (reading or writing to a new table, for example) is automatically made available to the more-privileged administrator role.

Example

The following example creates two roles, assigns them privileges, then assigns them to a new administrative role.

1. Create new table applog:

   => CREATE TABLE applog (id int, sourceID VARCHAR(32), data TIMESTAMP, event VARCHAR(256));

2. Create a new role called logreader:

   => CREATE ROLE logreader;

3. Grant the logreader role read-only access on the applog table:

   => GRANT SELECT ON applog TO logreader;

4. Create a new role called logwriter:

   => CREATE ROLE logwriter;

5. Grant the logwriter role write access on the applog table:
   => GRANT INSERT ON applog to logwriter;

6. Create a new role called logadmin, which will govern the other two roles:

   => CREATE ROLE logadmin;

7. Grant the logadmin role privileges to delete data:

   => GRANT DELETE ON applog to logadmin;

8. Grant the logadmin role the same privileges as the logreader and logwriter roles:

   => GRANT logreader, logwriter TO logadmin;

9. Create new user Bob:

   => CREATE USER Bob;

10. Give Bob logadmin privileges:

    => GRANT logadmin TO Bob;

The user Bob can now enable the logadmin role, which also includes the logreader and logwriter roles. Note that Bob cannot enable either the logreader or logwriter role directly. A user can only enable explicitly-granted roles.

Hierarchical roles also work with administrative access to a role:

=> GRANT logreader, logwriter TO logadmin WITH ADMIN OPTION;
GRANT ROLE
=> GRANT logadmin TO Bob;
=> \c - Bob   -- connect as Bob
You are now connected as user "Bob".
=> SET ROLE logadmin;   -- enable logadmin role
SET
=> GRANT logreader TO Alice;
GRANT ROLE

Note that the user Bob only has administrative access to the logreader and logwriter roles through the logadmin role. He doesn't have administrative access to the logadmin role, since it wasn't granted to him with the optional WITH ADMIN OPTION argument:
=> GRANT logadmin TO Alice;
WARNING: Some roles were not granted
GRANT ROLE

For Bob to be able to grant the logadmin role, a superuser would have had to explicitly grant him administrative access.

See Also

• About MC Privileges and Roles

Creating Database Roles

A superuser creates a new role using the CREATE ROLE statement. Only a superuser can create or drop roles.

=> CREATE ROLE administrator;
CREATE ROLE

The newly-created role has no privileges assigned to it, and no users or other roles are initially granted access to it. A superuser must grant privileges and access to the role.

Deleting Database Roles

A superuser can delete a role with the DROP ROLE statement. Note that if any user or other role has been assigned the role you are trying to delete, the DROP ROLE statement fails with a dependency message:

=> DROP ROLE administrator;
NOTICE: User Bob depends on Role administrator
ROLLBACK: DROP ROLE failed due to dependencies
DETAIL: Cannot drop Role administrator because other objects depend on it
HINT: Use DROP ROLE ... CASCADE to remove granted roles from the dependent users/roles

Supply the optional CASCADE parameter to drop the role and its dependencies:

=> DROP ROLE administrator CASCADE;
DROP ROLE

Granting Privileges to Roles

A superuser or owner of a schema, table, or other database object can assign privileges to a role, just as they would assign privileges to an individual user, by using the GRANT statements described in the SQL Reference Manual. See About Database Privileges for information about which privileges can be granted.

Granting a privilege to a role immediately affects active user sessions. When you grant a new privilege, it becomes immediately available to every user with the role active.
Example

The following example creates two roles and assigns them different privileges on a single table called applog.

1. Create a table called applog:

   => CREATE TABLE applog (id int, sourceID VARCHAR(32), data TIMESTAMP, event VARCHAR(256));

2. Create a new role called logreader:

   => CREATE ROLE logreader;

3. Assign read-only privileges to the logreader role on table applog:

   => GRANT SELECT ON applog TO logreader;

4. Create a role called logwriter:

   => CREATE ROLE logwriter;

5. Assign write privileges to the logwriter role on table applog:

   => GRANT INSERT ON applog TO logwriter;

See the SQL Reference Manual for the different GRANT statements.

Revoking Privileges From Roles

Use one of the REVOKE statements to revoke a privilege from a role:

=> REVOKE INSERT ON applog FROM logwriter;
REVOKE PRIVILEGE

Revoking a privilege immediately affects any user sessions that have the role active. When you revoke a privilege, it is immediately removed from users that rely on the role for the privilege.

See the SQL Reference Manual for the different REVOKE statements.

Granting Access to Database Roles

A pseudosuperuser or dbadmin user can assign any role to a user or to another role using the GRANT command. The simplest form of this command is:

GRANT role [, ...] TO { user | role } [, ...]
Vertica returns a NOTICE if you grant a role to a user who has already been granted that role. For example:

=> GRANT commenter to Bob;
NOTICE 4622: Role "commenter" was already granted to user "Bob"

See GRANT (Role) in the SQL Reference Manual for details.

Example

The following example shows how to create a role called commenter and grant that role to user Bob:

1. Create a table called comments:

   => CREATE TABLE comments (id INT, comment VARCHAR);

2. Create a role called commenter:

   => CREATE ROLE commenter;

3. Grant privileges to the commenter role on the comments table:

   => GRANT INSERT, SELECT ON comments TO commenter;

4. Grant the commenter role to user Bob:

   => GRANT commenter TO Bob;

Before being able to access the role and its associated privileges, Bob must enable the newly-granted role for himself:

1. Connect to the database as user Bob:

   => \c - Bob

2. Enable the role:

   => SET ROLE commenter;

3. Insert some values into the comments table:
   => INSERT INTO comments VALUES (1, 'Hello World');

   Based on the privileges granted to Bob by the commenter role, Bob can insert and query the comments table.

4. Query the comments table:

   => SELECT * FROM comments;
    id |   comment
   ----+-------------
     1 | Hello World
   (1 row)

5. Commit the transaction:

   => COMMIT;

Note that Bob does not have the permissions required to drop the table:

=> DROP TABLE comments;
ROLLBACK 4000: Must be owner of relation comments

See Also

• Granting Database Access to MC Users

Revoking Access From Database Roles

A superuser can revoke any role from a user or from another role using the REVOKE command. The simplest form of this command is:

REVOKE role [, ...] FROM { user | role | PUBLIC } [, ...]

See REVOKE (Role) in the SQL Reference Manual for details.

Example

To revoke access from a role, use the REVOKE (Role) statement:

1. Connect to the database as a superuser:

   \c - dbadmin

2. Revoke the commenter role from user Bob:

   => REVOKE commenter FROM bob;
Granting Administrative Access to a Role

A superuser can assign a user or role administrative access to a role by supplying the optional WITH ADMIN OPTION argument to the GRANT statement. Administrative access allows the user to grant and revoke access to the role for other users (including granting them administrative access). Giving users the ability to grant roles lets a superuser delegate role administration to other users.

Example

The following example demonstrates granting the user Bob administrative access to the commenter role, then connecting as Bob and granting the role to another user.

1. Connect to the database as a superuser (or a user with administrative access):

   => \c - dbadmin

2. Grant administrative access on the commenter role to Bob:

   => GRANT commenter TO Bob WITH ADMIN OPTION;

3. Connect to the database as user Bob:

   => \c - Bob

4. As user Bob, grant the commenter role to Alice:

   => GRANT commenter TO Alice;

Users with administrative access to a role can also grant other users administrative access:

=> GRANT commenter TO alice WITH ADMIN OPTION;
GRANT ROLE

As with all user privilege models, database superusers should be cautious when granting any user a role with administrative privileges. For example, if the database superuser grants two users a role with administrative privileges, both users can revoke the role from the other user. This example shows granting the appadmin role (with administrative privileges) to users bob and alice. After each user has been granted the appadmin role, either user can revoke the role from the other:
=> GRANT appadmin TO bob, alice WITH ADMIN OPTION;
GRANT ROLE
=> \c - bob
You are now connected as user "bob".
=> REVOKE appadmin FROM alice;
REVOKE ROLE

Revoking Administrative Access From a Role

A superuser can revoke administrative access from a role using the ADMIN OPTION parameter with the REVOKE statement. Giving users the ability to revoke roles lets a superuser delegate role administration to other users.

Example

The following example demonstrates revoking administrative access from Alice for the commenter role.

1. Connect to the database as a superuser (or a user with administrative access):

   \c - dbadmin

2. Issue the REVOKE command with the ADMIN OPTION parameter:

   => REVOKE ADMIN OPTION FOR commenter FROM alice;

Enabling Roles

By default, roles aren't enabled automatically for a user account. (See Default Roles for Database Users for a way to make roles enabled automatically.) Users must explicitly enable a role using the SET ROLE statement. When users enable a role in their session, they gain all of the privileges assigned to that role. Enabling a role does not affect any other roles that the users have active in their sessions. They can have multiple roles enabled simultaneously, gaining the combined privileges of all the roles they have enabled, plus any privileges that have been granted to them directly.

=> SELECT * FROM applog;
ERROR: permission denied for relation applog
=> SET ROLE logreader;
SET
=> SELECT * FROM applog;
 id | sourceID |            data            |                     event
----+----------+----------------------------+-----------------------------------------------
  1 | Loader   | 2011-03-31 11:00:38.494226 | Error: Failed to open source file
  2 | Reporter | 2011-03-31 11:00:38.494226 | Warning: Low disk space on volume /scratch-a
(2 rows)

You can enable all of the roles available to your user account using the SET ROLE ALL statement.

=> SET ROLE ALL;
SET
=> SHOW ENABLED_ROLES;
     name      |       setting
---------------+------------------------------
 enabled roles | logreader, logwriter
(1 row)

See Also

l Viewing a User's Role

Disabling Roles

To disable all roles, use the SET ROLE NONE statement:

=> SET ROLE NONE;
SET
=> SHOW ENABLED_ROLES;
     name      | setting
---------------+---------
 enabled roles |
(1 row)

Viewing Enabled and Available Roles

You can list the roles you have enabled in your session using the SHOW ENABLED_ROLES statement:

=> SHOW ENABLED_ROLES;
     name      |  setting
---------------+-----------
 enabled roles | logreader
(1 row)

You can find the roles available to your account using the SHOW AVAILABLE_ROLES statement:

Bob=> SHOW AVAILABLE_ROLES;
      name       |       setting
-----------------+-----------------------------
 available roles | logreader, logwriter
(1 row)
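Because enabling one role does not disable the others, you can build up exactly the set of privileges you need with successive SET ROLE statements. A short sketch, assuming the logreader and logwriter roles from the earlier examples have been granted to your account:

=> SET ROLE logreader;
SET
=> SET ROLE logwriter;
SET
=> SHOW ENABLED_ROLES;
     name      |       setting
---------------+----------------------
 enabled roles | logreader, logwriter
(1 row)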
Viewing Named Roles

To view the names of all roles users can access, along with any roles that have been assigned to those roles, query the V_CATALOG.ROLES system table.

=> SELECT * FROM roles;
      role_id      |      name       |    assigned_roles
-------------------+-----------------+----------------------
 45035996273704964 | public          |
 45035996273704966 | dbduser         |
 45035996273704968 | dbadmin         | dbduser*
 45035996273704972 | pseudosuperuser | dbadmin*
 45035996273704974 | logreader       |
 45035996273704976 | logwriter       |
 45035996273704978 | logadmin        | logreader, logwriter
(7 rows)

Note: An asterisk (*) in the output means that role was granted WITH ADMIN OPTION.

Viewing a User's Role

The HAS_ROLE() function lets you see if a role has been granted to a user. Non-superusers can check their own role membership using HAS_ROLE('role_name'), but only a superuser can look up other users' memberships using the user_name parameter. Omitting the user_name parameter returns role results for the superuser who is calling the function.

How to View a User's Role

In this example, user Bob wants to see if he's been assigned the logwriter role. The output returns the Boolean value t for true, denoting that Bob is assigned the specified logwriter role:

Bob=> SELECT HAS_ROLE('logwriter');
 HAS_ROLE
----------
 t
(1 row)

In this example, a superuser wants to verify that the logadmin role has been granted to user Ted:

dbadmin=> SELECT HAS_ROLE('Ted', 'logadmin');

The output returns the Boolean value t for true, denoting that Ted is assigned the specified logadmin role:
 HAS_ROLE
----------
 t
(1 row)

Note that if a superuser omits the user_name argument, the function looks up that superuser's role. The following output indicates that this superuser is not assigned the logadmin role:

dbadmin=> SELECT HAS_ROLE('logadmin');
 HAS_ROLE
----------
 f
(1 row)

Output of the function call with user Alice indicates that she is not granted the logadmin role:

dbadmin=> SELECT HAS_ROLE('Alice', 'logadmin');
 HAS_ROLE
----------
 f
(1 row)

To view additional information about users, roles, and grants, you can also query the following system tables in the V_CATALOG schema to show directly assigned roles:

l ROLES
l GRANTS
l USERS

Note that the system tables do not indicate whether a role is available to a user when roles could be available through other roles (indirectly). You need to call the HAS_ROLE() function for that information.

Users

This command returns all columns from the USERS system table:

=> SELECT * FROM users;
-[ RECORD 1 ]-----+---------------------------
user_id           | 45035996273704962
user_name         | dbadmin
is_super_user     | t
profile_name      | default
is_locked         | f
lock_time         |
resource_pool     | general
memory_cap_kb     | unlimited
temp_space_cap_kb | unlimited
run_time_cap      | unlimited
all_roles         | dbadmin*, pseudosuperuser*
default_roles     | dbadmin*, pseudosuperuser*

Note: An asterisk (*) in table output for all_roles and default_roles columns indicates a role granted WITH ADMIN OPTION.

Roles

The following command returns all columns from the ROLES system table:

=> SELECT * FROM roles;
      role_id      |      name       | assigned_roles
-------------------+-----------------+-------------------
 45035996273704964 | public          |
 45035996273704966 | dbduser         |
 45035996273704968 | dbadmin         | dbduser*
 45035996273704972 | pseudosuperuser | dbadmin*

Grants

The following command returns all columns from the GRANTS system table:

=> SELECT * FROM grants;
 grantor | privileges_description | object_schema | object_name | grantee
---------+------------------------+---------------+-------------+---------
 dbadmin | USAGE                  |               | public      | public
 dbadmin | USAGE                  |               | v_internal  | public
 dbadmin | USAGE                  |               | v_catalog   | public
 dbadmin | USAGE                  |               | v_monitor   | public
(4 rows)

Viewing User Roles on Management Console

You can see an MC user's roles and database resources through the MC Settings > User management page on the Management Console interface. For more information, see About MC Privileges and Roles.
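Rather than selecting every column, you can filter these system tables for a single user. A sketch, assuming a user named Bob exists; the column names are the ones shown in the output above:

=> SELECT user_name, all_roles, default_roles FROM users WHERE user_name = 'Bob';
=> SELECT grantor, privileges_description, object_name FROM grants WHERE grantee = 'Bob';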
Using the Administration Tools

The Vertica Administration Tools allow you to easily perform administrative tasks. You can perform most Vertica database administration tasks with the Administration Tools.

Run the Administration Tools using the Database Administrator account on the Administration host, if possible. Make sure that no other Administration Tools processes are running.

If the Administration host is unresponsive, run the Administration Tools on a different node in the cluster. That node permanently takes over the role of Administration host.

Any user can view the man page available for admintools. Enter the following:

man admintools

Running Administration Tools

As the dbadmin user, you can run the Administration Tools. The syntax follows:

/opt/vertica/bin/admintools [ { -h | --help } | { -a | --help_all } | [ --debug ] | { -t | --tool } name_of_tool [ options ] ]

Options

-h
--help       Outputs abbreviated help.

-a
--help_all   Outputs verbose help, which lists all command-line sub-commands and options as shown in the Tools section below.

--debug      If you include the debug option, Vertica logs debug information.

Note: You can specify the debug option with or without naming a specific tool. If you specify debug with a specific tool, Vertica logs debug information during tool execution. If you do not specify a tool, Vertica logs debug information when you run tools through the admintools user interface.
{ -t | --tool } name_of_tool [options]   Specifies the tool to run, where name_of_tool is one of the tools described in the help output, and options are one or more comma-delimited tool arguments.

Note: Enter admintools -h to see the list of tools available. Enter admintools -t name_of_tool --help to review a specific tool's options.

An unqualified admintools command displays the Main Menu dialog box. If you are unfamiliar with this type of interface, read Using the Administration Tools Interface.

First Login as Database Administrator

The first time you log in as the Database Administrator and run the Administration Tools, the user interface displays.

1. In the end-user license agreement (EULA) window, type accept to proceed.

   A window displays, requesting the location of the license key file you downloaded from the HPE Web site. The default path is /tmp/vlicense.dat.

2. Type the absolute path to your license key (for example, /tmp/vlicense.dat) and click OK.

Between Dialogs

While the Administration Tools are working, you see the command line processing in a window similar to the one shown below. Do not interrupt the processing.
Using the Administration Tools Interface

The Vertica Administration Tools are implemented using Dialog, a graphical user interface that works in terminal (character-cell) windows. The interface responds to mouse clicks in some terminal windows, particularly local Linux windows, but you might find that it responds only to keystrokes. Thus, this section describes how to use the Administration Tools using only keystrokes.

Note: This section does not describe every possible combination of keystrokes you can use to accomplish a particular task. Feel free to experiment and to use whatever keystrokes you prefer.

Enter [Return]

In all dialogs, when you are ready to run a command, select a file, or cancel the dialog, press the Enter key. The command descriptions in this section do not explicitly instruct you to press Enter.

OK - Cancel - Help

The OK, Cancel, and Help buttons are present on virtually all dialogs. Use the tab, space bar, or right and left arrow keys to select an option and then press Enter. The same keystrokes apply to dialogs that present a choice of Yes or No.
  • 582. Menu Dialogs Some dialogs require that you choose one command from a menu. Type the alphanumeric character shown or use the up and down arrow keys to select a command and then press Enter. List Dialogs In a list dialog, use the up and down arrow keys to highlight items, then use the space bar to select the items (which marks them with an X). Some list dialogs allow you to select multiple items. When you have finished selecting items, press Enter. Form Dialogs In a form dialog (also referred to as a dialog box), use the tab key to cycle between OK, Cancel, Help, and the form field area. Once the cursor is in the form field area, use the up and down arrow keys to select an individual field (highlighted) and enter information. When you have finished entering information in all fields, press Enter. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 582 of 5309
  • 583. Help Buttons Online help is provided in the form of text dialogs. If you have trouble viewing the help, see Notes for Remote Terminal Users in this document. K-Safety Support in Administration Tools The Administration Tools allow certain operations on a K-Safe database, even if some nodes are unresponsive. The database must have been marked as K-Safe using the MARK_DESIGN_KSAFE function. The following management functions within the Administration Tools are operational when some nodes are unresponsive. Note: Vertica users can perform much of the below functionality using the Management Console interface. See Management Console and Administration Tools for details. l View database cluster state l Connect to database l Start database (including manual recovery) l Stop database l Replace node (assuming node that is down is the one being replaced) Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 583 of 5309
l View database parameters
l Upgrade license key

The following operations work with unresponsive nodes; however, you might have to repeat the operation on the failed nodes after they are back in operation:

l Distribute config files
l Install external procedure
l Set database parameters

The following management functions within the Administration Tools require that all nodes be UP in order to be operational:

l Create database
l Run the Database Designer
l Drop database
l Set restart policy
l Roll back database to Last Good Epoch

Notes for Remote Terminal Users

The appearance of the graphical interface depends on the color and font settings used by your terminal window. The screen captures in this document were made using the default color and font settings in a PuTTY terminal application running on a Windows platform.

Note: If you are using a remote terminal application, such as PuTTY or a Cygwin bash shell, make sure your window is at least 81 characters wide and 23 characters high.

If you are using PuTTY, you can make the Administration Tools look like the screen captures in this document:
  • 585. 1. In a PuTTY window, right click the title area and select Change Settings. 2. Create or load a saved session. 3. In the Category dialog, click Window > Appearance. 4. In the Font settings, click the Change... button. 5. Select Font: Courier New: Regular Size: 10 6. Click Apply. Repeat these steps for each existing session that you use to run the Administration Tools. You can also change the translation to support UTF-8: 1. In a PuTTY window, right click the title area and select Change Settings. 2. Create or load a saved session. 3. In the Category dialog, click Window > Translation. 4. In the "Received data assumed to be in which character set" drop-down menu, select UTF-8. 5. Click Apply. Using Administration Tools Help The Help on Using the Administration Tools command displays a help screen about using the Administration Tools. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 585 of 5309
  • 586. Most of the online help in the Administration Tools is context-sensitive. For example, if you use up/down arrows to select a command, press tab to move to the Help button, and press return, you get help on the selected command. In a Menu Dialog 1. Use the up and down arrow keys to choose the command for which you want help. 2. Use the Tab key to move the cursor to the Help button. 3. Press Enter (Return). In a Dialog Box 1. Use the up and down arrow keys to choose the field on which you want help. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 586 of 5309
  • 587. 2. Use the Tab key to move the cursor to the Help button. 3. Press Enter (Return). Scrolling Some help files are too long for a single screen. Use the up and down arrow keys to scroll through the text. Password Authentication When you create a new user with the CREATE USER command, you can configure the password or leave it empty. You cannot bypass the password if the user was created with a password configured. You can change a user's password using the ALTER USER command. See Implementing Security for more information about controlling database authorization through passwords. Tip: Unless the database is used solely for evaluation purposes, HPE recommends that all database users have encrypted passwords. Distributing Changes Made to the Administration Tools Metadata Administration Tools-specific metadata for a failed node will fall out of synchronization with other cluster nodes if you make the following changes: l Modify the restart policy l Add one or more nodes l Drop one or more nodes. When you restore the node to the database cluster, you can use the Administration Tools to update the node with the latest Administration Tools metadata: 1. Log on to a host that contains the metadata you want to transfer and start the Administration Tools. (See Using the Administration Tools.) 2. On the Main Menu in the Administration Tools, select Configuration Menu and click OK. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 587 of 5309
3. On the Configuration Menu, select Distribute Config Files and click OK.

4. Select AdminTools Meta-Data.

   The Administration Tools metadata is distributed to every host in the cluster.

5. Restart the database.

Administration Tools and Management Console

You can perform most database administration tasks using the Administration Tools, but you have the additional option of using the more visual and dynamic Management Console. The following table compares the functionality available in both interfaces. Continue to use Administration Tools and the command line to perform actions not yet supported by Management Console.

Vertica Functionality                                            Management Console   Administration Tools
Use a Web interface for the administration of Vertica           Yes                  No
Manage/monitor one or more databases and clusters through a UI  Yes                  No
Manage multiple databases on different clusters                 Yes                  Yes
View database cluster state                                     Yes                  Yes
View multiple cluster states                                    Yes                  No
Connect to the database                                         Yes                  Yes
Start/stop an existing database                                 Yes                  Yes
Stop/restart Vertica on host                                    Yes                  Yes
Kill a Vertica process on host                                  No                   Yes
Create one or more databases                                    Yes                  Yes
View databases                                                  Yes                  Yes
Remove a database from view                                     Yes                  No
Drop a database                                                 Yes                  Yes
Create a physical schema design (Database Designer)             Yes                  Yes
Modify a physical schema design (Database Designer)             Yes                  Yes
Set the restart policy                                          No                   Yes
Roll back database to the Last Good Epoch                       No                   Yes
Manage clusters (add, replace, remove hosts)                    Yes                  Yes
Rebalance data across nodes in the database                     Yes                  Yes
Configure database parameters dynamically                       Yes                  No
View database activity in relation to physical resource usage   Yes                  No
View alerts and messages dynamically                            Yes                  No
View current database size usage statistics                     Yes                  No
View database size usage statistics over time                   Yes                  No
Upload/upgrade a license file                                   Yes                  Yes
Warn users about license violation on login                     Yes                  Yes
Create, edit, manage, and delete users/user information         Yes                  No
Use LDAP to authenticate users with company credentials         Yes                  Yes
Manage user access to MC through roles                          Yes                  No
Map Management Console users to a Vertica database              Yes                  No
Enable and disable user access to MC and/or the database        Yes                  No
Audit user activity on database                                 Yes                  No
Hide features unavailable to a user through roles               Yes                  No
Generate new user (non-LDAP) passwords                          Yes                  No

Management Console provides some, but not all, of the functionality that the Administration Tools provide. MC also provides functionality not available in the Administration Tools.

See Also

l Monitoring Vertica Using Management Console
Administration Tools Reference

The Administration Tools allow you to:

l View the Database Cluster State
l Connect to the database
l Stop the database
l Use Configuration Menu items
l Use Advanced Menu options
l Write Administration Tools scripts

Viewing Database Cluster State

This tool shows the current state of the nodes in the database.

1. On the Main Menu, select View Database Cluster State, and click OK.

   The normal state of a running database is ALL UP. The normal state of a stopped database is ALL DOWN.

2. If some hosts are UP and some DOWN, restart the specific host that is down using Restart Vertica on Host from the Administration Tools, or you can start the database as described in Starting and Stopping the Database (unless you have a known node failure and want to continue in that state).
  • 592. Nodes shown as INITIALIZING or RECOVERING indicate that Failure Recovery is in progress. Nodes in other states (such as NEEDS_CATCHUP) are transitional and can be ignored unless they persist. See Also l Advanced Menu Options Connecting to the Database This tool connects to a running database with vsql. You can use the Administration Tools to connect to a database from any node within the database while logged in to any user account with access privileges. You cannot use the Administration Tools to connect from a host that is not a database node. To connect from other hosts, run vsql as described in Connecting from the Command Line. 1. On the Main Menu, click Connect to Database, and then click OK. 2. Supply the database password if asked: Password: When you create a new user with the CREATE USER command, you can configure the password or leave it empty. You cannot bypass the password if the user was created with a password configured. You can change a user's password using the ALTER USER command. The Administration Tools connect to the database and transfer control to vsql. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 592 of 5309
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

=>

See Using vsql for more information.

Note: After entering your password, you may be prompted to change your password if it has expired. See Implementing Client Authentication for details of password security.

See Also

l CREATE USER
l ALTER USER

Start the Database

Starting a K-safe database is supported when up to K nodes are down or unavailable. See Failure Recovery for a discussion of various scenarios encountered during database shutdown, startup, and recovery.

You can start a database using any of these methods:

l The Management Console
l The Administration Tools interface
l The command line

Start the Database Using MC

On MC's Databases and Clusters page, click a database to select it, and click Start within the dialog box that displays.

Start the Database Using the Administration Tools

1. Open the Administration Tools and select View Database Cluster State to make sure that all nodes are down and that no other database is running.

2. Open the Administration Tools. See Using the Administration Tools for information about accessing the Administration Tools.
3. On the Main Menu, select Start Database, and then select OK.

4. Select the database to start, and then click OK.

Caution: HPE strongly recommends that you start only one database at a time. If you start more than one database at any time, the results are unpredictable. Users could encounter resource conflicts or perform operations in the wrong database.

5. Enter the database password, and then click OK.

6. When prompted that the database started successfully, click OK.

7. Check the log files to make sure that no startup problems occurred.

Start the Database At the Command Line

If you use the admintools command line option, start_db, to start a database, the -p password argument is only required during database creation, when you install a new license. As long as the license is valid, the -p argument is not required to start the database and is silently ignored, even if you introduce a typo or prematurely press the enter key. This is by design, as the database can only be started by the user who (as part of the verticadba UNIX user group) initially created the database or who has root or su privileges. If the license were to become invalid, Vertica would use the -p password argument to attempt to upgrade the license with the license file stored in /opt/vertica/config/share/license.key.

Following is an example of using start_db on a standalone node:

$ /opt/vertica/bin/admintools -t start_db -d VMart
Info: no password specified, using none
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (UP)
Database VMart started successfully
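When the -p password argument is required, for example immediately after database creation, you can supply it on the same command line. A sketch; the password value is a placeholder:

$ /opt/vertica/bin/admintools -t start_db -d VMart -p 'mypassword'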
Stopping a Database

To stop a running database, take these steps:

1. Use View Database Cluster State to make sure that all nodes are up. If all nodes are not up, see Restarting Vertica on Host.

2. On the Main Menu, select Stop Database, and click OK.

3. Select the database you want to stop, and click OK.

4. Enter the password if asked, and click OK.

5. A message confirms that the database has been successfully stopped. Click OK.

Error

If users are connected during shutdown operations, you cannot stop a database. The Administration Tools displays a message similar to the following:

Unable to shutdown database VMart.
Error: NOTICE 2519: Cannot shut down while users are connected

This may be because other users still have active sessions or the Management Console is still active. You can force the sessions to terminate and shutdown the database, but any work done in the other sessions may be lost. Do you want to try a forced shutdown?

Description

The message indicates that there are active user connections (sessions). For example, Database Designer may be building or deploying a design. See Managing Sessions in the Administrator's Guide for more information.

Resolution

The following examples were taken from a different database.

1. To see which users are connected, connect to the database and query the SESSIONS system table described in the SQL Reference Manual. For example:

=> \pset expanded
Expanded display is on.
=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name               | site01
user_name               | dbadmin
client_hostname         | 127.0.0.1:57141
login_timestamp         | 2015-06-07 14:41:26
session_id              | rhel6-1-30361:0xd7e3e:994462853
transaction_start       | 2015-06-07 14:48:54
transaction_id          | 45035996273741092
transaction_description | user dbadmin (select * from sessions;)
statement_start         | 2015-06-07 14:53:31
statement_id            | 0
last_statement_duration | 1
current_statement       | select * from sessions;
ssl_state               | None
authentication_method   | Trust
-[ RECORD 2 ]
node_name               | site01
user_name               | dbadmin
client_hostname         | 127.0.0.1:57142
login_timestamp         | 2015-06-07 14:52:55
session_id              | rhel6-1-30361:0xd83ac:1017578618
transaction_start       | 2015-06-07 14:53:26
transaction_id          | 45035996273741096
transaction_description | user dbadmin (COPY ClickStream_Fact FROM '/data/clickstream/1g/ClickStream_Fact.tbl' DELIMITER '|' NULL '\n' DIRECT;)
statement_start         | 2015-06-07 14:53:26
statement_id            | 17179869528
last_statement_duration | 0
current_statement       | COPY ClickStream_Fact FROM '/data/clickstream/1g/ClickStream_Fact.tbl' DELIMITER '|' NULL '\n' DIRECT;
ssl_state               | None
authentication_method   | Trust

The current_statement column of Record 1 shows that that session is the one you are using to query the system table. Record 2 shows the session that must end before the database can be shut down.

2. If a statement is running in a session, that session must be closed. Use the function CLOSE_SESSION or CLOSE_ALL_SESSIONS described in the SQL Reference Manual.

Note: CLOSE_ALL_SESSIONS is the more common command because it forcefully disconnects all user sessions.

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name                  | site01
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57141
client_pid                 | 17838
login_timestamp            | 2015-06-07 14:41:26
session_id                 | rhel6-1-30361:0xd7e3e:994462853
client_label               |
transaction_start          | 2015-06-07 14:48:54
transaction_id             | 45035996273741092
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2015-06-07 14:53:31
statement_id               | 0
last_statement_duration_us | 1
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
-[ RECORD 2 ]
node_name                  | site01
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57142
client_pid                 | 17839
login_timestamp            | 2015-06-07 14:52:55
session_id                 | rhel6-1-30361:0xd83ac:1017578618
client_label               |
transaction_start          | 2015-06-07 14:53:26
transaction_id             | 45035996273741096
transaction_description    | user dbadmin (COPY ClickStream_Fact FROM '/data/clickstream/1g/ClickStream_Fact.tbl' DELIMITER '|' NULL '\n' DIRECT;)
statement_start            | 2015-06-07 14:53:26
statement_id               | 17179869528
last_statement_duration_us | 0
current_statement          | COPY ClickStream_Fact FROM '/data/clickstream/1g/ClickStream_Fact.tbl' DELIMITER '|' NULL '\n' DIRECT;
ssl_state                  | None
authentication_method      | Trust

=> SELECT CLOSE_SESSION('rhel6-1-30361:0xd83ac:1017578618');
-[ RECORD 1 ]
close_session | Session close command sent. Check sessions for progress.

=> SELECT * FROM SESSIONS;
-[ RECORD 1 ]
node_name                  | site01
user_name                  | dbadmin
client_hostname            | 127.0.0.1:57141
client_pid                 | 17838
login_timestamp            | 2015-06-07 14:41:26
session_id                 | rhel6-1-30361:0xd7e3e:994462853
client_label               |
transaction_start          | 2015-06-07 14:48:54
transaction_id             | 45035996273741092
transaction_description    | user dbadmin (select * from sessions;)
statement_start            | 2015-06-07 14:54:11
statement_id               | 0
last_statement_duration_us | 98
current_statement          | select * from sessions;
ssl_state                  | None
authentication_method      | Trust
3. Query the SESSIONS table again. For example, two columns have changed:

n statement_id is now 0, indicating that no statement is in progress.
n last_statement_duration_us now indicates how long the previous statement ran, in microseconds, before being interrupted.

The SELECT statements that call these functions return when the interrupt or close message has been delivered to all nodes, not after the interrupt or close has completed.

4. Query the SESSIONS table again. When the session no longer appears in the SESSIONS table, disconnect and run the Stop Database command.

Controlling Sessions

The database administrator must be able to disallow new incoming connections in order to shut down the database. On a busy system, database shutdown is prevented if new sessions connect after the CLOSE_SESSION or CLOSE_ALL_SESSIONS() command is invoked, but before the database actually shuts down.

One option is for the administrator to issue the SHUTDOWN('true') command, which forces the database to shut down and disallow new connections. See SHUTDOWN in the SQL Reference Manual.

Another option is to modify the MaxClientSessions parameter from its original value to 0, in order to prevent new non-dbadmin users from connecting to the database.

1. Determine the original value for the MaxClientSessions parameter by querying the V_MONITOR.CONFIGURATION_PARAMETERS system table:

=> SELECT CURRENT_VALUE FROM CONFIGURATION_PARAMETERS WHERE parameter_name='MaxClientSessions';
 CURRENT_VALUE
---------------
 50
(1 row)

2. Set the MaxClientSessions parameter to 0 to prevent new non-dbadmin connections:
=> ALTER DATABASE mydb SET MaxClientSessions = 0;

Note: The previous command allows up to five administrators to log in.

3. Issue the CLOSE_ALL_SESSIONS() command to remove existing sessions:

=> SELECT CLOSE_ALL_SESSIONS();

4. Query the SESSIONS table:

=> SELECT * FROM SESSIONS;

When the session no longer appears in the SESSIONS table, disconnect and run the Stop Database command.

5. Restart the database.

6. Restore the MaxClientSessions parameter to its original value:

=> ALTER DATABASE mydb SET MaxClientSessions = 50;

Notes

You cannot stop databases if your password has expired. The Administration Tools displays an error message if you attempt to do so. You must change your expired password using vsql before you can shut down a database.

Restarting Vertica on Host

This tool restarts the Vertica process on one or more nodes in a running database. Use this tool when a cluster host reboots while the database is running. The spread daemon starts automatically, but the Vertica process does not, so the node does not automatically rejoin the cluster.

1. On the Main Menu, select View Database Cluster State, and click OK.

2. If one or more nodes are down, select Restart Vertica on Host, and click OK.

3. Select the database that contains the host that you want to restart, and click OK.

4. Select the Host that you want to restart, and click OK.
5. Select View Database Cluster State again to make sure that all nodes are up.
  • 601. Configuration Menu Item The Configuration Menu allows you to: l Create, drop, and view databases. l Use the Database Designer to create or modify a physical schema design. l Set a restart policy. l Distribute config files. l Install external procedures. Creating a Database 1. On the Configuration Menu, click Create Database and then click OK. 2. Enter the name of the database and an optional comment. Click OK. 3. Enter a password. See Creating a Database Name and Password for rules. If you do not enter a password, you are prompted to indicate whether you want to enter a password. Click Yes to enter a password or No to create a database without a superuser password. Caution: If you do not enter a password at this point, superuser password is set to empty. Unless the database is for evaluation or academic purposes, HPE strongly recommends that you enter a superuser password. 4. If you entered a password, enter the password again. 5. Select the hosts to include in the database. The hosts in this list are the ones that were specified at installation time (install_vertica -s). Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 601 of 5309
  • 602. 6. Specify the directories in which to store the catalog and data files. Note: Catalog and data paths must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions could result in database creation failure. Note: Do not use a shared directory for more than one node. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory. 7. Check the current database definition for correctness, and click Yes to proceed. Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 602 of 5309
  • 603. 8. A message indicates that you have successfully created a database. Click OK. Dropping a Database This tool drops an existing database. Only the Database Administrator is allowed to drop a database. 1. Stop the database as described in Stopping a Database. 2. On the Configuration Menu, click Drop Database and then click OK. 3. Select the database to drop and click OK. 4. Click Yes to confirm that you want to drop the database. 5. Type yes and click OK to reconfirm that you really want to drop the database. 6. A message indicates that you have successfully dropped the database. Click OK. Notes In addition to dropping the database, Vertica automatically drops the node definitions that refer to the database unless: l Another database uses a node definition. If another database refers to any of these node definitions, none of the node definitions are dropped. l A node definition is the only node defined for the host. (Vertica uses node definitions to locate hosts that are available for database creation, so removing the only node defined for a host would make the host unavailable for new databases.) Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 603 of 5309
  • 604. Viewing a Database This tool displays the characteristics of an existing database. 1. On the Configuration Menu, select View Database and click OK. 2. Select the database to view. 3. Vertica displays the following information about the database: n The name of the database. n The name and location of the log file for the database. n The hosts within the database cluster. n The value of the restart policy setting. Note: This setting determines whether nodes within a K-Safe database are restarted when they are rebooted. See Setting the Restart Policy. n The database port. n The name and location of the catalog directory. Setting the Restart Policy The Restart Policy enables you to determine whether or not nodes in a K-Safe database are automatically restarted when they are rebooted. Since this feature does not automatically restart nodes if the entire database is DOWN, it is not useful for databases that are not K-Safe. To set the Restart Policy for a database: Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 604 of 5309
1. Open the Administration Tools.

2. On the Main Menu, select Configuration Menu, and click OK.

3. In the Configuration Menu, select Set Restart Policy, and click OK.

4. Select the database for which you want to set the Restart Policy, and click OK.

5. Select one of the following policies for the database:

n Never — Nodes are never restarted automatically.
n K-Safe — Nodes are automatically restarted if the database cluster is still UP. This is the default setting.
n Always — The node of a single-node database is restarted automatically.

Note: Always does not work if a single-node database was not shut down cleanly or crashed.

6. Click OK.

Best Practice for Restoring Failed Hardware

Following this procedure will prevent Vertica from misdiagnosing missing disks or bad mounts as data corruption, which would result in a time-consuming, full-node recovery.

If a server fails due to hardware issues, for example a bad disk or a failed controller, upon repairing the hardware:

1. Reboot the machine into runlevel 1, which is a root and console-only mode.

   Runlevel 1 prevents network connectivity and keeps Vertica from attempting to reconnect to the cluster.

2. In runlevel 1, validate that the hardware has been repaired, the controllers are online, and any RAID recovery is able to proceed.

Note: You do not need to initialize RAID recovery in runlevel 1; simply validate that it can recover.
3. Once the hardware is confirmed consistent, reboot to runlevel 3 or higher. At this point, the network activates, and Vertica rejoins the cluster and automatically recovers any missing data.

Note that, on a single-node database, if any files that were associated with a projection have been deleted or corrupted, Vertica will delete all files associated with that projection, which could result in data loss.

Installing External Procedure Executable Files

1. Run the Administration Tools.

$ /opt/vertica/bin/adminTools

2. On the AdminTools Main Menu, click Configuration Menu, and then click OK.

3. On the Configuration Menu, click Install External Procedure and then click OK.

4. Select the database on which you want to install the external procedure.

5. Either select the file to install or manually type the complete file path, and then click OK.

6. If you are not the superuser, you are prompted to enter your password and click OK.
The Administration Tools automatically create the <database_catalog_path>/procedures directory on each node in the database and install the external procedure in these directories for you.

7. Click OK in the dialog that indicates that the installation was successful.
Advanced Menu Options

The Advanced Menu allows you to:

l Roll back the database to the last good epoch.
l Stop Vertica on host.
l Kill the Vertica process on host.
l Set or reset database parameters.
l Upgrade a license key.
l Manage a cluster.

Rolling Back the Database to the Last Good Epoch

Vertica provides the ability to roll the entire database back to a specific epoch, primarily to assist in the correction of human errors during data loads or other accidental corruptions. For example, suppose that you have been performing a bulk load and the cluster went down during a particular COPY command. You might want to discard all epochs back to the point at which the previous COPY command committed and run the one that did not finish again. You can determine that point by examining the log files (see Monitoring the Log Files).

1. On the Advanced Menu, select Roll Back Database to Last Good Epoch.

2. Select the database to roll back. The database must be stopped.

3. Accept the suggested restart epoch or specify a different one.

4. Confirm that you want to discard the changes after the specified epoch. The database restarts successfully.

Important: The default value of HistoryRetentionTime is 0, which means that Vertica only keeps historical data when nodes are down. This setting prevents the use of the Administration Tools 'Roll Back Database to Last Good Epoch' option because the AHM remains close to the current epoch. Vertica cannot roll back to an epoch that precedes the AHM.

If you rely on the Roll Back option to remove recently loaded data, consider setting a day-wide window for removing loaded data. For example:

=> ALTER DATABASE mydb SET HistoryRetentionTime = 86400;
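Before attempting a rollback, you can compare the current epoch with the AHM and the Last Good Epoch to see how far back a rollback can go. A minimal sketch using the GET_AHM_EPOCH and GET_LAST_GOOD_EPOCH meta-functions and the SYSTEM table; the exact output depends on your database:

=> SELECT GET_AHM_EPOCH();
=> SELECT GET_LAST_GOOD_EPOCH();
=> SELECT current_epoch, ahm_epoch, last_good_epoch FROM SYSTEM;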
Stopping Vertica on Host

This command attempts to gracefully shut down the Vertica process on a single node.

Caution: Do not use this command if you are intending to shut down the entire cluster. Use Stop Database instead, which performs a clean shutdown to minimize data loss.

1. On the Advanced Menu, select Stop Vertica on Host and click OK.

2. Select the hosts to stop.

3. Confirm that you want to stop the hosts.

If the command succeeds, View Database Cluster State shows that the selected hosts are DOWN.
If the command fails to stop any selected nodes, proceed to Killing Vertica Process on Host.

Killing the Vertica Process on Host

This command sends a kill signal to the Vertica process on a node.

Caution: Do not use this command unless you have already tried Stop Database and Stop Vertica on Host and both were unsuccessful.

1. On the Advanced menu, select Kill Vertica Process on Host and click OK.

2. Select the hosts on which to kill the Vertica process.

3. Confirm that you want to stop the processes.
4. If the command succeeds, View Database Cluster State shows that the selected hosts are DOWN.

Upgrading a Vertica License Key

The following steps are for licensed Vertica users. Completing the steps copies a license key file into the database. See Managing Licenses for more information.

1. On the Advanced menu, select Upgrade License Key. Click OK.

2. Select the database for which to upgrade the license key.

3. Enter the absolute pathname of your downloaded license key file (for example, /tmp/vlicense.dat). Click OK.

4. Click OK when you see a message indicating that the upgrade succeeded.

Note: If you are using Vertica Community Edition, follow the instructions in Vertica License Renewals or Upgrades to upgrade to a Vertica Premium Edition license key.

Managing Clusters

Cluster Management lets you add, replace, or remove hosts from a database cluster. These processes are usually part of a larger process of adding, removing, or replacing a database node.
Note: View the database state to verify that it is running. See View Database Cluster State. If the database isn't running, restart it. See Start the Database.

Using Cluster Management

To use Cluster Management:

1. From the Main Menu, select Advanced Menu, and then click OK.

2. In the Advanced Menu, select Cluster Management, and then click OK.

3. Select one of the following, and then click OK.

n Add Hosts to Database: See Adding Hosts to a Database.
n Re-balance Data: See Rebalancing Data.
n Replace Host: See Replacing Hosts.
n Remove Host from Database: See Removing Hosts from a Database.

Using Administration Tools

The Help Using the Administration Tools command displays a help screen about using the Administration Tools.

Most of the online help in the Administration Tools is context-sensitive. For example, if you use the up/down arrows to select a command, press tab to move to the Help button, and press return, you get help on the selected command.

Administration Tools Metadata

The Administration Tools configuration data (metadata) contains information that databases need to start, such as the hostname/IP address of each participating host in the database cluster.

To facilitate hostname resolution within the Administration Tools, at the command line, and inside the installation utility, Vertica converts all hostnames you provide through the Administration Tools to IP addresses:

l During installation

Vertica immediately converts any hostname you provide through the command line options --hosts, --add-hosts, or --remove-hosts to its IP address equivalent.
n If you provide a hostname during installation that resolves to multiple IP addresses (such as in multi-homed systems), the installer prompts you to choose one IP address.

n Vertica retains the name you give for messages and prompts only; internally it stores these hostnames as IP addresses.

l Within the Administration Tools

All hosts are in IP form to allow for direct comparisons (for example db = database = database.example.com).

l At the command line

Vertica converts any hostname value to an IP address that it uses to look up the host in the configuration metadata. If a host has multiple IP addresses that are resolved, Vertica tests each IP address to see if it resides in the metadata, choosing the first match. No match indicates that the host is not part of the database cluster.

Metadata is more portable because Vertica does not require the names of the hosts in the cluster to be exactly the same when you install or upgrade your database.

Writing Administration Tools Scripts

You can invoke most Administration Tools from the command line or a shell script.

Syntax

/opt/vertica/bin/admintools { { -h | --help } | { -a | --help_all } | { [ --debug ] { -t | --tool } toolname [ tool-args ] } }

Note: For convenience, add /opt/vertica/bin to your search path.

Parameters

-h
--help       Outputs abbreviated help.

-a
--help_all   Outputs verbose help, which lists all command-line sub-commands and options.

[--debug]
{ -t | --tool } toolname [args]   Specifies the tool to run, where toolname is one of the tools listed in the help output described below, and args is one or more comma-delimited toolname arguments. If you include the debug option, Vertica logs debug information during tool execution.

Tools

To return a list of all available tools, enter admintools -h at a command prompt.

Note: To create a database or password, see Creating a Database Name and Password for naming rules.

To list all available tools and their commands and options in individual help text, enter admintools -a.

To display help for a specific tool and its options or commands, qualify the specified tool name with --help or -h, as shown in the example below:

$ admintools -t connect_db --help
Usage: connect_db [options]
Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of database to connect
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
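Given the help output above, a typical invocation looks like the following sketch; the database name and password are placeholders:

$ admintools -t connect_db -d VMart -p 'mypassword'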
Operating the Database

This topic explains how to start and stop your Vertica database, and how to use the database index tool:

l Starting the Database
l Stopping the Database
l CRC and Sort Order Check

Start the Database

Starting a K-safe database is supported when up to K nodes are down or unavailable. See Failure Recovery for a discussion of various scenarios encountered during database shutdown, startup, and recovery.

You can start a database using any of these methods:

l The Management Console
l The Administration Tools interface
l The command line

Start the Database Using MC

On MC's Databases and Clusters page, click a database to select it, and click Start within the dialog box that displays.

Start the Database Using the Administration Tools

1. Open the Administration Tools and select View Database Cluster State to make sure that all nodes are down and that no other database is running.

2. Open the Administration Tools. See Using the Administration Tools for information about accessing the Administration Tools.

3. On the Main Menu, select Start Database, and then select OK.

4. Select the database to start, and then click OK.
Caution: HPE strongly recommends that you start only one database at a time. If you start more than one database at any time, the results are unpredictable. Users could encounter resource conflicts or perform operations in the wrong database.

5. Enter the database password, and then click OK.

6. When prompted that the database started successfully, click OK.

7. Check the log files to make sure that no startup problems occurred.

Start the Database At the Command Line

If you use the admintools command line option, start_db, to start a database, the -p password argument is only required during database creation, when you install a new license. As long as the license is valid, the -p argument is not required to start the database and is silently ignored, even if you introduce a typo or prematurely press the enter key. This is by design, as the database can only be started by the user who (as part of the verticadba UNIX user group) initially created the database or who has root or su privileges. If the license were to become invalid, Vertica would use the -p password argument to attempt to upgrade the license with the license file stored in /opt/vertica/config/share/license.key.

Following is an example of using start_db on a standalone node:

$ /opt/vertica/bin/admintools -t start_db -d VMart
Info: no password specified, using none
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (DOWN)
Node Status: v_vmart_node0001: (UP)
Database VMart started successfully
  • 617. Stopping the Database Stopping a K-safe database is supported when up to K nodes are down or unavailable. See Failure Recovery for a discussion on various scenarios encountered during database shutdown, startup and recovery. You can stop a running database using either of these methods: l The Management Console l The Administration Tools interface Note: You cannot stop a running database if any users are connected or Database Designer is building or deploying a database design. Stopping a Running Database Using MC 1. Log in to MC as an MC administrator and navigate to the Manage page to make sure all nodes are up. If a node is down, click that node and select Start node in the Node list dialog box. 2. Inform all users that have open connections that the database is going to shut down and instruct them to close their sessions. Tip: To check for open sessions, query the V_MONITOR.SESSIONS table. The client_label column returns a value of MC for users who are connected to MC. 3. Still on the Manage page, click Stop in the toolbar. Stopping a Running Database Using the Administration Tools 1. Use View Database Cluster State to make sure that all nodes are up. If all nodes are not up, see Restarting a Node. 2. Inform all users that have open connections that the database is going to shut down and instruct them to close their sessions. Tip: A simple way to prevent new client sessions from being opened while you are shutting down the database is to set the MaxClientSessions configuration parameter Vertica Documentation HPE Vertica Analytics Platform (7.2.x) Page 617 of 5309
to 0. Be sure to restore the parameter to its original setting once you've restarted the database.

=> ALTER DATABASE mydb SET MaxClientSessions = 0;

3. Close any remaining user sessions. (Use the CLOSE_SESSION and CLOSE_ALL_SESSIONS functions.)

4. Open the Administration Tools. See Using the Administration Tools for information about accessing the Administration Tools.

5. On the Main Menu, select Stop Database, and then click OK.

6. Select the database you want to stop, and click OK.

7. Enter the password if asked, and click OK.

8. When prompted that the database has been successfully stopped, click OK.

Stopping a Running Database At the Command Line

You can use the admintools command line option, stop_db, to stop a database as follows:

$ /opt/vertica/bin/admintools -t stop_db -d VMart
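If your database has a superuser password, you can supply it on the same command line. The -p option sketched here is an assumption based on the connect_db and start_db patterns shown earlier; the password value is a placeholder:

$ /opt/vertica/bin/admintools -t stop_db -d VMart -p 'mypassword'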
CRC and Sort Order Check

As a superuser, you can run the Index tool on a Vertica database to perform two tasks:

l Run a per-block cyclic redundancy check (CRC) on data storage to verify data integrity.
l Check that the sort order in ROS containers is correct.

If the database is down, invoke the Index tool from the Linux command line. If the database is up, invoke it as an SQL statement from vsql:

Operation          Database Down                                  Database Up
Run CRC            /opt/vertica/bin/vertica -D catalog-path -v    select run_index_tool ('checkcrc');
                                                                  select run_index_tool ('checkcrc', 'true');
Check sort order   /opt/vertica/bin/vertica -D catalog-path -I    select run_index_tool ('checksort');
                                                                  select run_index_tool ('checksort', 'true');

If you run the Index tool in vsql as an SQL statement, you can specify that it analyze all cluster nodes by setting the optional Boolean parameter to true (1). If this parameter is omitted, the Index tool runs only on the current node. If invoked from the command line, the Index tool runs only on the current node. However, the Index tool can run on multiple nodes simultaneously. Invoke the Index tool binary from the /opt/vertica/bin directory.

Viewing Results

The Index tool writes summary information about its operation to standard output; detailed information on results is logged in one of two locations, depending on the environment where you invoke the tool:

l Invoked from the command line: Results are written to the indextool.log file in the database catalog directory.
l Invoked from vsql: Results are written to vertica.log on the current node.

Privileges

Restricted to superusers.
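For instance, with the database up, you can run both checks on the current node only from vsql by omitting the second parameter, as the table above indicates:

=> SELECT run_index_tool('checkcrc');
=> SELECT run_index_tool('checksort');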
Running a Cyclic Redundancy Check

The Index tool can run a cyclic redundancy check (CRC) on each block of existing data storage to check the data integrity of ROS data blocks.

Running the Tool

You can invoke the Index tool from the command line or from vsql, depending on whether the database is up or down:

l If the database is down: Invoke the Index tool from the Linux command line. For example:

[dbadmin@localhost bin]$ /opt/vertica/bin/vertica -D /home/dbadmin/VMart/v_vmart_node0001_catalog -v

The Index tool writes summary information about its operation to standard output, and logs detailed information in indextool.log in the database catalog directory.

l If the database is up: Invoke the Index tool as an SQL statement from vsql with the argument checkcrc. To run the Index tool on all nodes, also set the tool's optional Boolean parameter to true. If this parameter is omitted, the Index tool runs only on the current node. For example, the following SQL statement runs a CRC on all cluster nodes:

select run_index_tool ('checkcrc', 'true');

The Index tool writes summary information about its operation to standard output, and logs detailed information in vertica.log on the current node.

Handling CRC Errors

Vertica evaluates the CRC values in each ROS data block each time it fetches data from disk to process a query. If CRC errors occur while fetching data, the following information is written to the vertica.log file:

CRC Check Failure Details:
File Name:
File Offset:
Compressed size in file:
Memory Address of Read Buffer:
Pointer to Compressed Data:
Memory Contents:
The Event Manager is also notified of CRC errors, so you can use an SNMP trap to capture CRC errors:

"CRC mismatch detected on file <file_path>. File may be corrupted. Please check hardware and drivers."

If you run a query from vsql, ODBC, or JDBC, the query returns a FileColumnReader ERROR. This message indicates that a specific block's CRC does not match a given record as follows:

hint: Data file may be corrupt. Ensure that all hardware (disk and memory) is working properly. Possible solutions are to delete the file <pathname> while the node is down, and then allow the node to recover, or truncate the table data.
code: ERRCODE_DATA_CORRUPTED

Checking Sort Order

If ROS data is not sorted correctly in the projection's order, query results that rely on sorted data will be incorrect. You can use the Index tool to check the ROS sort order if you suspect or detect incorrect query results. The Index tool evaluates each ROS row to determine whether it is sorted correctly. If the check locates a row that is not in order, it writes an error message to the log file with the row number and contents of the unsorted row.

Running the Tool

You can invoke the Index tool from the command line or from vsql, depending on whether the database is up or down:

l If the database is down: Invoke the Index tool from the Linux command line. For example:

$ /opt/vertica/bin/vertica -D /home/dbadmin/VMart/v_vmart_node0001_catalog -I

The Index tool writes summary information about its operation to standard output, and logs detailed information in indextool.log in the database catalog directory.

l If the database is up: Invoke the Index tool from vsql as an SQL statement with the argument checksort. To run the Index tool on all nodes, also set the tool's optional Boolean parameter to true. If this parameter is omitted, the Index tool runs only on the current node. For example, the following SQL statement runs a sort order check on all cluster nodes:
  select run_index_tool ('checksort', 'true');

  The Index tool writes summary information about its operation to standard
  output, and logs detailed information in vertica.log on the current node.

Reviewing Errors

1. Open the indextool.log file. For example:

   $ cd VMart/v_check_node0001_catalog

2. Look for error messages that include an OID number and the string Sort Order
   Violation. For example:

   <INFO> ...on oid 45035996273723545: Sort Order Violation:

3. Find detailed information about the sort order violation string by running
   grep on indextool.log. For example, the following command returns the line
   before each string (-B1), and the four lines that follow (-A4):

   [15:07:55][vertica-s1]: grep -B1 -A4 'Sort Order Violation:' /my_host/databases/check/v_check_node0001_catalog/indextool.log
   2012-06-14 14:07:13.686 unknown:0x7fe1da7a1950 [EE] <INFO> An error occurred when running index tool thread on oid 45035996273723537: Sort Order Violation:
   Row Position: 624
   Column Index: 0
   Last Row: 2576000
   This Row: 2575000
   --
   2012-06-14 14:07:13.687 unknown:0x7fe1dafa2950 [EE] <INFO> An error occurred when running index tool thread on oid 45035996273723545: Sort Order Violation:
   Row Position: 3
   Column Index: 0
   Last Row: 4
   This Row: 2
   --

4. Find the projection where a sort order violation occurred by querying the
   storage_containers system table. Use a storage_oid equal to the OID value
   listed in indextool.log. For example:

   => select * from storage_containers where storage_oid = 45035996273723545;
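If you only need the projection and node involved, a narrower query is enough.
The following is a minimal sketch, assuming the OID reported in indextool.log
above and the node_name, projection_name, and total_row_count columns of the
storage_containers system table:

   => SELECT node_name, projection_name, total_row_count
      FROM storage_containers
      WHERE storage_oid = 45035996273723545;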
Managing Tables

You can create two types of tables in Vertica, columnar and flexible.
Additionally, you can create either type as persistent or temporary. You can also
create views to capture a specific set of table columns that you query
frequently.

Creating Base Tables

The CREATE TABLE statement creates a table in the Vertica logical schema. The
example database described in the Getting Started guide includes sample SQL
scripts that demonstrate this procedure. For example:

CREATE TABLE vendor_dimension (
   vendor_key        INTEGER NOT NULL PRIMARY KEY,
   vendor_name       VARCHAR(64),
   vendor_address    VARCHAR(64),
   vendor_city       VARCHAR(64),
   vendor_state      CHAR(2),
   vendor_region     VARCHAR(32),
   deal_size         INTEGER,
   last_deal_update  DATE
);

Note: Each table can have a maximum of 1600 columns.

Creating Tables Using the /*+direct*/ Hint

You can use the /*+direct*/ hint to create a table or temporary table. This hint
directs Vertica to bypass memory (WOS) and save the table data directly to disk
(ROS). For example, you can create a table from the table states:

=> select * from states;
 State |   Bird   | Tree  | Tax
-------+----------+-------+-----
 MA    | Robin    | Maple | 5.7
 NH    | Thrush   | Elm   | 0
 NY    | Cardinal | Oak   | 7.2
(3 rows)

The following CREATE statement includes the /*+direct*/ hint, which must
immediately follow the AS directive:

=> CREATE TABLE StateBird AS /*+direct*/ SELECT State, Bird FROM states;
CREATE TABLE
=> select * from StateBird;
 State |   Bird
-------+----------
 MA    | Robin
 NH    | Thrush
 NY    | Cardinal
(3 rows)

The following example creates a temporary table with the /*+direct*/ clause. The
ON COMMIT PRESERVE ROWS directive specifies to include all row data in the
temporary table:

=> CREATE TEMP TABLE StateTax ON COMMIT PRESERVE ROWS AS /*+direct*/ SELECT State, Tax FROM states;
CREATE TABLE
=> select * from StateTax;
 State | Tax
-------+-----
 MA    | 5.7
 NH    | 0
 NY    | 7.2
(3 rows)

Automatic Projection Creation

To get your database up and running quickly, Vertica automatically creates a
default projection for each table created through the CREATE TABLE and CREATE
TEMPORARY TABLE statements. Each projection created automatically (or manually)
includes a base projection name prefix. You must use the projection prefix when
altering or dropping a projection (ALTER PROJECTION RENAME, DROP PROJECTION).

How you use CREATE TABLE determines when the projection is created:

• If you create a table without providing projection-related clauses, Vertica
  automatically creates a superprojection for the table when you load data into
  the table for the first time with INSERT or COPY. The projection is created in
  the same schema as the table. After Vertica creates the projection, it loads
  the data.

• If you use CREATE TABLE to create a table from the results of a query (CREATE
  TABLE AS SELECT), the projection is created immediately after the table, and
  uses some of the properties of the underlying SELECT query.

• (Advanced users only) If CREATE TABLE includes any of the following parameters,
  the default projection is created immediately on table creation, using the
  specified properties (see the sketch following this list):

  - column-definition (ENCODING encoding-type and ACCESSRANK integer)
  - ORDER BY table-column
  - hash-segmentation-clause
  - UNSEGMENTED { NODE node | ALL NODES }
  - KSAFE

Note: Before you define a superprojection as described above, see Creating Custom
Designs in the Administrator's Guide.
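The following is a minimal sketch of such an advanced CREATE TABLE statement; the
table and column names are hypothetical, and the clauses shown correspond to the
parameters listed above (ENCODING, ORDER BY, hash segmentation, KSAFE):

=> CREATE TABLE orders (
      order_key  INTEGER NOT NULL ENCODING RLE,
      order_date DATE,
      amount     NUMERIC(10,2) )
   ORDER BY order_key
   SEGMENTED BY HASH(order_key) ALL NODES KSAFE 1;

Because the statement includes projection-related clauses, Vertica creates the
default projection immediately at table creation, rather than waiting for the
first data load.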
Characteristics of Default Automatic Projections

A default auto-projection has the following characteristics:

• It is a superprojection.
• It uses the default encoding-type AUTO.
• If created as a result of a CREATE TABLE AS SELECT statement, it uses the
  encoding specified in the query table.
• Auto-projections use hash segmentation.
• The number of table columns used in the segmentation expression can be
  configured, using the MaxAutoSegColumns configuration parameter. See General
  Parameters in the Administrator's Guide. Columns are segmented in this order:
  - Short (< 8 bytes) data type columns first
  - Larger (> 8 bytes) data type columns
  - Up to 32 columns (the default for the MaxAutoSegColumns configuration
    parameter)
  - If segmenting more than 32 columns, use a nested hash function

Auto-projections are defined by the table properties and creation methods, as
follows:

 Table Characteristic      | Sort Order                | Segmented On
---------------------------+---------------------------+----------------------------------------
 Created from input stream | Same as input stream, if  | PK column (if any), on all FK columns
 (COPY or INSERT INTO)     | sorted.                   | (if any), on the first 31 configurable
                           |                           | columns of the table
 Created from CREATE TABLE | Same as input stream, if  | Same segmentation columns if query
 AS SELECT query           | sorted.                   | output is segmented; same as load, if
                           |                           | query output is unsegmented or unknown
 FK and PK constraints     | FK first, then PK columns | PK columns
 FK constraints only       | FK first, then remaining  | Small data type (< 8 byte) columns
                           | columns                   | first, then large data type columns
 PK constraints only       | PK columns                | PK columns
 No FK or PK constraints   | On all columns            | Small data type (< 8 byte) columns
                           |                           | first, then large data type columns
Default automatic projections and segmentation get your database up and running
quickly. HPE recommends that you start with these projections and then use the
Database Designer to optimize your database further. The Database Designer
creates projections that optimize your database based on the characteristics of
the data and, optionally, the queries you use.

See Also

• Creating External Tables
• CREATE TABLE

Creating a Table Like Another

You can create a table from an existing one using the CREATE TABLE statement with
the LIKE clause. Creating a table with the LIKE option replicates the source
table definition and any storage policy associated with it. CREATE TABLE LIKE
copies all table constraints except foreign key constraints. It does not copy
table data or expressions on columns and constraints.

You can qualify the LIKE clause with the INCLUDING PROJECTIONS clause. This
creates a table and replicates all projections from the source table, excluding
pre-join projections. Vertica follows the same naming conventions as auto
projections, while avoiding name conflicts with existing objects.

Restrictions

The following restrictions apply to the source table:

• It cannot have out-of-date projections.
• It cannot be a temporary table.
Storage Location and Policies for New Tables

When you use the CREATE TABLE...LIKE statement, any storage policy objects
associated with the table are also copied. Data added to the new table will use
the same labeled storage location as the source table, unless you change the
storage policy. For more information, see Working With Storage Locations.

Example

Create the table states:

VMART=> create table states (
          state char(2) not null,
          bird varchar(20),
          tree char(20),
          tax float,
          stateDate date)
        partition by state;
CREATE TABLE

Populate the table with some data:

insert into states values ('MA', 'chickadee', 'american_elm', 5.675, '07-04-1620');
insert into states values ('VT', 'Hermit_Thrasher', 'Sugar_Maple', 6.0, '07-04-1618');
. . .

Select the table to see its contents:

VMART=> select * from states;
 State |        bird         |         tree         |  tax  | stateDate
-------+---------------------+----------------------+-------+------------
 MA    | chickadee           | american_elm         | 5.675 | 1620-07-04
 NH    | Purple_Finch        | White_Birch          |     0 | 1615-07-04
 VT    | Hermit_Thrasher     | Sugar_maple          |     6 | 1618-07-04
 ME    | Black_Cap_Chickadee | Pine_Tree            |     5 | 1615-07-04
 CT    | American_Robin      | White_Oak            |  6.35 | 1618-07-04
 RI    | Rhode_Island_Red    | Red_Maple            |     5 | 1619-07-04
(6 rows)

View the projections for this table:

VMART=> \dj
                             List of projections
 Schema |       Name        |  Owner  |       Node       | Comment
--------+-------------------+---------+------------------+---------
. . .
 public | states_b0         | dbadmin |                  |
 public | states_b1         | dbadmin |                  |
 public | states_p          | dbadmin | v_vmart_node0001 |
 public | states_p          | dbadmin | v_vmart_node0002 |
 public | states_p          | dbadmin | v_vmart_node0003 |

Now, create a table like the states table, including projections:
VMART=> create table newstates like states including projections;
CREATE TABLE
VMART=> select * from newstates;
 State | bird | tree | tax | stateDate
-------+------+------+-----+-----------
(0 rows)

See Also

• Creating Base Tables
• Creating Temporary Tables
• Creating External Tables
• Archiving Partitions
• CREATE TABLE

Creating Temporary Tables

You create temporary tables with CREATE TEMPORARY TABLE, specifying the table as
either local or global. You cannot create temporary external tables.

Temporary tables can be used to divide complex query processing into multiple
steps. Typically, a reporting tool holds intermediate results while reports are
generated; for example, the tool first gets a result set, then queries the result
set, and so on. You can also write Subqueries.

Note: By default, all temporary table data is discarded when a COMMIT statement
ends the current transaction. If CREATE TEMPORARY TABLE includes the parameter ON
COMMIT PRESERVE ROWS, table data is retained until the current session ends.

Global Temporary Tables

Vertica creates global temporary tables in the public schema, with the data
contents private to the transaction or session through which data is inserted.
Global temporary table definitions are accessible to all users and sessions, so
that two (or more) users can access the same global table concurrently. However,
whenever a user commits or rolls back a transaction, or ends the session, Vertica
removes the global temporary table data automatically, so users see only data
specific to their own transactions or session.
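For example, the following sketch (the table and column names are hypothetical)
creates a global temporary table whose definition is visible to all sessions,
while each session sees only the rows it inserts:

=> CREATE GLOBAL TEMPORARY TABLE session_metrics (
      run_id INT,
      score  FLOAT )
   ON COMMIT PRESERVE ROWS;
=> INSERT INTO session_metrics VALUES (1, 0.75);
=> SELECT * FROM session_metrics;  -- another session running this query sees no rows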
Global temporary table definitions persist in the database catalogs until they
are removed explicitly through a DROP TABLE statement.

Local Temporary Tables

Local temporary tables are created in the V_TEMP_SCHEMA namespace and inserted
into the user's search path transparently. Each local temporary table is visible
only to the user who creates it, and only for the duration of the session in
which the table is created. When the session ends, Vertica automatically drops
the table definition from the database catalogs.

You cannot preserve non-empty, session-scoped temporary tables using the ON
COMMIT PRESERVE ROWS clause. Creating local temporary tables is significantly
faster than creating regular tables, so you should make use of them whenever
possible.

Note: You cannot add projections to non-empty, session-scoped temporary tables if
you specify ON COMMIT PRESERVE ROWS. Be sure that projections exist before you
load data, as described in the section Automatic Projection Creation in CREATE
TABLE. Also, while you can add projections for tables created with the ON COMMIT
DELETE ROWS option, be aware that you could save the projection but still lose
all the data.

Creating Temporary Tables Using /*+direct*/ Hints

You can use the /*+direct*/ hint to create a table or temporary table. This hint
directs Vertica to bypass memory (WOS) and save the table data directly to disk
(ROS). For example, you can create a table from the table states:

=> select * from states;
 State |   Bird   | Tree  | Tax
-------+----------+-------+-----
 MA    | Robin    | Maple | 5.7
 NH    | Thrush   | Elm   | 0
 NY    | Cardinal | Oak   | 7.2
(3 rows)

The following CREATE statement includes the /*+direct*/ hint, which must
immediately follow the AS directive:

=> CREATE TABLE StateBird AS /*+direct*/ SELECT State, Bird FROM states;
CREATE TABLE
=> select * from StateBird;
 State |   Bird
-------+----------
 MA    | Robin
 NH    | Thrush
 NY    | Cardinal
(3 rows)

The following example creates a temporary table with the /*+direct*/ clause. The
ON COMMIT PRESERVE ROWS directive specifies to include all row data in the
temporary table:

=> CREATE TEMP TABLE StateTax ON COMMIT PRESERVE ROWS AS /*+direct*/ SELECT State, Tax FROM states;
CREATE TABLE
=> select * from StateTax;
 State | Tax
-------+-----
 MA    | 5.7
 NH    | 0
 NY    | 7.2
(3 rows)

Characteristics of Default Automatic Projections

Vertica creates auto-projections for temporary tables when you load or insert
data. The default auto-projection for a temporary table has the following
characteristics:

• It is a superprojection.
• It uses the default encoding-type AUTO.
• It is automatically segmented on the table's first several columns.
• Unless the table specifies otherwise, the projection's KSAFE value is set at
  the current system K-safety level.

Auto-projections are defined by the table properties and creation methods, as
follows:

 Table Characteristic      | Sort Order                | Segmented On
---------------------------+---------------------------+----------------------------------------
 Created from input stream | Same as input stream, if  | PK column (if any), on all FK columns
 (COPY or INSERT INTO)     | sorted.                   | (if any), on the first 31 configurable
                           |                           | columns of the table
 Created from CREATE TABLE | Same as input stream, if  | Same segmentation columns if query
 AS SELECT query           | sorted.                   | output is segmented; same as load, if
                           |                           | query output is unsegmented or unknown
 FK and PK constraints     | FK first, then PK columns | PK columns
 FK constraints only       | FK first, then remaining  | Small data type (< 8 byte) columns
                           | columns                   | first, then large data type columns
 PK constraints only       | PK columns                | PK columns
 No FK or PK constraints   | On all columns            | Small data type (< 8 byte) columns
                           |                           | first, then large data type columns
As an advanced user, you can modify the default projection created through the
CREATE TEMPORARY TABLE statement by setting one or more of the following
parameters:

• column-definition (temp table) (ENCODING encoding-type and ACCESSRANK integer)
• ORDER BY table-column
• hash-segmentation-clause
• UNSEGMENTED { NODE node | ALL NODES }
• NO PROJECTION

Note: Before you define the superprojection in this manner, read Creating Custom
Designs in the Administrator's Guide.

Preserving GLOBAL Temporary Table Data for a Transaction or Session

You can preserve session-scoped rows in a GLOBAL temporary table for the entire
session or for the current transaction only.

To preserve a temporary table's data for the current transaction only, use the ON
COMMIT DELETE ROWS clause:

=> CREATE GLOBAL TEMP TABLE temp_table1 (
      x NUMERIC,
      y NUMERIC )
   ON COMMIT DELETE ROWS;
To preserve temporary table data until the end of the session, use the ON COMMIT
PRESERVE ROWS clause:

=> CREATE GLOBAL TEMP TABLE temp_table2 (
      x NUMERIC,
      y NUMERIC )
   ON COMMIT PRESERVE ROWS;

Specifying Column Encoding

You can specify the encoding type to use per column. The following example
specifies that the superprojection created for the temp table use RLE encoding
for the y column:

=> CREATE LOCAL TEMP TABLE temp_table1 (
      x NUMERIC,
      y NUMERIC ENCODING RLE )
   ON COMMIT DELETE ROWS;

The following example specifies that the superprojection created for the temp
table use the sort order specified by the ORDER BY clause, rather than the order
of columns in the column list:

=> CREATE GLOBAL TEMP TABLE temp_table1 (
      x NUMERIC,
      y NUMERIC ENCODING RLE,
      b VARCHAR(8),
      z VARCHAR(8) )
   ORDER BY z, x;

See Also

• CREATE TEMPORARY TABLE
• CREATE TABLE
• TRUNCATE TABLE
• DELETE
• ANALYZE_STATISTICS

Creating External Tables

You create an external table using the CREATE EXTERNAL TABLE AS COPY statement.
You cannot create temporary external tables. For the syntax details to create an
external table, see the CREATE EXTERNAL TABLE statement in the SQL Reference
Manual.

Note: Each table can have a maximum of 1600 columns.
Required Permissions for External Tables

You must be a database superuser to create external tables. Permission
requirements to use (SELECT from) external tables differ from those of other
tables. By default, once external tables exist, you must also be a database
superuser to access them through a SELECT statement.

To allow users without superuser access to query external tables, an
administrator must create a 'user' storage location and grant those users read
access to the location. See CREATE LOCATION and GRANT (Storage Location). This
location must be a parent of the path used in the COPY statement when creating
the external table.

COPY Statement Definition

When you create an external table, table data is not added to the database, and
no projections are created. Instead, Vertica performs a syntactic check of the
CREATE EXTERNAL TABLE... statement, and stores the table name and COPY statement
definition in the catalog. When a SELECT query references an external table,
Vertica parses and executes the stored COPY statement to obtain the referenced
data.

Successfully returning data from an external table requires that the COPY
definition be correct, and that other dependencies, such as files, nodes, and
other resources, are accessible and available at query time. If the maximum
length of a column is smaller than the actual data, such as a VARCHAR that is too
short, Vertica truncates the data and logs the event.

When using the COPY parameter on any node, confirm that the source file
definition is identical on all nodes. Specifying different external files can
produce inconsistent results.

For more information about checking the validity of the external table COPY
definition, see Validating External Tables.

NOT NULL Constraints

Do not specify a NOT NULL column constraint unless you are certain that the
external data does not contain NULL values. Otherwise, you may see unexpected
query results. For example, a SELECT statement for an external table with a NOT
NULL constraint rejects a column value if it is NULL.
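As a sketch of this pitfall, assume a hypothetical data file /tmp/ext_nn.dat in
which some rows have an empty first field. Declaring that column NOT NULL can
cause those rows to be rejected, or to produce unexpected results, when the table
is queried:

=> CREATE EXTERNAL TABLE ext_nn (id INT NOT NULL, label VARCHAR)
   AS COPY FROM '/tmp/ext_nn.dat' DELIMITER ',';
=> SELECT * FROM ext_nn;  -- rows whose id field is NULL may be missing or rejected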
Canceling the Create Query

Canceling a CREATE EXTERNAL TABLE AS COPY statement can cause unpredictable
results. If you enter a query to create an external table, and it is incorrect
(for example, you inadvertently specify the wrong external location), wait for
the query to complete. After the external table exists, use DROP TABLE to remove
its definition.

Developing User-Defined Load (UDL) Functions for External Tables

You can create external tables with your own load functions. For more information
about developing user-defined load functions, see User Defined Load (UDL) and the
extended COPY syntax in the SQL Reference Manual.

Examples

Examples of external table definitions:

CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat' DELIMITER ',';
CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat.bz2' BZIP DELIMITER ',';
CREATE EXTERNAL TABLE ext1 (x integer, y integer) AS COPY (x as '5', y) FROM '/tmp/ext1.dat.bz2' BZIP DELIMITER ',';

To allow users without superuser access to use these tables, create a location
for 'user' usage and grant access to it. This example shows granting access to a
user named Bob to any external table whose data is located under /tmp (including
in subdirectories to any depth):

CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
GRANT ALL ON LOCATION '/tmp' to Bob;

See Also

• COPY
• CREATE EXTERNAL TABLE AS COPY

Validating External Tables

When you create an external table, Vertica validates the syntax of the CREATE
EXTERNAL TABLE AS COPY FROM statement. For example, if you omit a required
keyword in the statement (such as FROM), creating the external table fails:

VMart=> create external table ext (ts timestamp, d varchar)
        as copy '/home/dbadmin/designer.log';
ERROR 2778: COPY requires a data source; either a FROM clause or a WITH SOURCE
for a user-defined source
Checking other components of the COPY definition (such as path statements and
node availability) does not occur until a SELECT query references the external
table.

To validate an external table definition, run a SELECT query that references the
external table. Check that the returned query data is what you expect. If the
query does not return data correctly, check the COPY exception and rejected data
log files.

Since the COPY definition determines what occurs when you query an external
table, COPY statement errors can reveal underlying problems. For more information
about COPY exceptions and rejections, see Capturing Load Rejections and
Exceptions.

Setting Maximum Exceptions

Querying external table data with an incorrect COPY FROM statement definition can
potentially result in many rejected rows. To limit the number of rejections,
Vertica sets the maximum number of retained rejections with the
ExternalTablesExceptionsLimit configuration parameter. The default value is 100.
Setting the ExternalTablesExceptionsLimit to -1 removes the limit, but is not
recommended.

If COPY errors reach the maximum number of rejections, the external table query
continues, but COPY generates a warning in the vertica.log, and does not report
subsequent rejected rows.

Note: Using the ExternalTablesExceptionsLimit configuration parameter differs
from using the COPY statement REJECTMAX parameter. The REJECTMAX value controls
how many rejected rows to permit before causing the load to fail. If COPY
encounters a number of rejected rows equal to or greater than REJECTMAX, COPY
aborts execution. A vertica.log warning is not generated if COPY exceeds
REJECTMAX.

Working with External Tables

After creating external tables, you access them as any other table. However, you
cannot perform UPDATE, INSERT, or DELETE operations on external tables.

Managing Resources for External Tables

External tables require minimal additional resources. When you use a SELECT query
for an external table, Vertica uses a small amount of memory when reading
external table data, since the table contents are not part of your database and
are parsed each time the external table is used.
Vertica Does Not Back Up External Tables

Since the data in external tables is managed outside of Vertica, only the
external table definitions, not the data files, are included in database backups.
Arrange for a separate backup process for your external table data.

Using Sequences and Identity Columns in External Tables

The COPY statement definition for external tables can include identity columns
and sequences. Whenever a SELECT statement queries the external table, sequences
and identity columns are re-evaluated. This results in changing the external
table column values, even if the underlying external table data remains the same.

Viewing External Table Definitions

When you create an external table, Vertica stores the COPY definition statement
in the table_definition column of the v_catalog.tables system table.

1. To list all tables, use a select * query, as shown:

   select * from v_catalog.tables where table_definition <> '';

2. Use a query such as the following to list the external table definitions
   (table_definition):

   select table_name, table_definition from v_catalog.tables;
    table_name | table_definition
   ------------+----------------------------------------------------------------------
    t1         | COPY FROM 'TMPDIR/external_table.dat' DELIMITER ','
    t1_copy    | COPY FROM 'TMPDIR/external_table.dat' DELIMITER ','
    t2         | COPY FROM 'TMPDIR/external_table2.dat' DELIMITER ','
   (3 rows)

External Table DML Support

Following are examples of supported queries, and others that are not:

 Supported                                   | Unsupported
---------------------------------------------+-----------------------------------------------
 SELECT * FROM external_table;               | DELETE FROM external_table WHERE x = 5;
 SELECT * FROM external_table where col1=4;  | INSERT INTO external_table SELECT * FROM ext;
 DELETE FROM internal_table WHERE id IN      | SELECT * FROM external_table FOR UPDATE;
 (SELECT x FROM external_table);             |
 INSERT INTO internal_table                  |
 SELECT * FROM external_table;               |
Using External Table Values

Following is a basic example of how you could use the values of an external
table.

1. Create and display the contents of a file with some integer values:

   [dbadmin@localhost ~]$ more ext.dat
   1
   2
   3
   4
   5
   6
   7
   8
   10
   11
   12

2. Create an external table pointing at ext.dat:

   VMart=> create external table ext (x integer) as copy from '/home/dbadmin/ext.dat';
   CREATE TABLE

3. Select the table contents:

   VMart=> select * from ext;
    x
   ----
     1
     2
     3
     4
     5
     6
     7
     8
    10
    11
    12
   (11 rows)

4. Perform evaluation on some external table contents:

   VMart=> select ext.x, ext.x + ext.x as double_x from ext where x > 5;
    x  | double_x
   ----+----------
     6 |       12
     7 |       14
     8 |       16
    10 |       20
    11 |       22
    12 |       24
   (6 rows)

5. Create a second table (second), also with integer values:

   VMart=> create table second (y integer);
   CREATE TABLE

6. Populate the table with some values:

   VMart=> copy second from stdin;
   Enter data to be copied followed by a newline.
   End with a backslash and a period on a line by itself.
   >> 1
   >> 1
   >> 3
   >> 4
   >> 5
   >> \.

7. Join the external table (ext) with the table created in Vertica, called
   second:

   VMart=> select * from ext join second on x=y;
    x | y
   ---+---
    1 | 1
    1 | 1
    3 | 3
    4 | 4
    5 | 5
   (5 rows)

Using External Tables

External tables let you query data stored in files accessible to the Vertica
database, but not managed by it. Creating external tables supplies read-only
access through SELECT queries. You cannot modify external tables through DML
commands, such as INSERT, UPDATE, DELETE, and MERGE.

Using the CREATE EXTERNAL TABLE AS COPY Statement

You create external tables with the CREATE EXTERNAL TABLE AS COPY... statement,
shown in this basic example:

CREATE EXTERNAL TABLE tbl(i INT) AS COPY (i) FROM 'path1' ON node1, 'path2' ON node2;
For more details on the supported options to create an external table, see the
CREATE EXTERNAL TABLE statement in the SQL Reference Manual.

The data you specify in the FROM clause of a CREATE EXTERNAL TABLE AS COPY
statement can reside in one or more files or directories, and on one or more
nodes. After successfully creating an external table, Vertica stores the table
name and its COPY definition. Each time a SELECT query references the external
table, Vertica parses the COPY statement definition again to access the data.
Here is a sample SELECT statement:

SELECT * FROM tbl WHERE i > 10;

Storing Vertica Data in External Tables

Among the many reasons to use external tables, one is to store infrequently
accessed Vertica data on low-cost external media. If external storage is a goal
at your site, the process to accomplish that requires exporting the older data to
a text file, creating a bzip or gzip file of the export data, and saving the
compressed file on an NFS disk. You can then create an external table to access
the data any time it is required.

Calculating Exact Row Count for External Tables

To calculate the exact number of rows in an external table, use
ANALYZE_EXTERNAL_ROW_COUNT. The Optimizer uses this count to optimize for queries
that access external tables. In particular, if an external table participates in
a join, the Optimizer can now identify the smaller table to be used as the inner
input to the join, resulting in better query performance.

Using External Tables with User-Defined Load (UDL) Functions

You can also use external tables in conjunction with the UDL functions that you
create. For more information about using UDLs, see User Defined Load (UDL) in
Extending Vertica.

Organizing External Table Data

If the data you store in external tables changes regularly (for instance, each
month in the case of storing recent historical data), your COPY definition
statement can use wildcards to make parsing the stored COPY statement definition
more dynamic. For instance, if you store monthly data on an NFS mount, you could
organize monthly files within a top-level directory for a calendar year, such as:
/2012/monthly_archived_data/

In this case, the external table COPY statement will include a wildcard
definition such as the following:

CREATE TABLE archive_data (...) AS COPY FROM '/nfs_name/2012/monthly_archived_data/*'

Whenever a Vertica query references the external table archive_data, and Vertica
parses the COPY statement, all stored data tables in the top-level
monthly_archived_data directory are made accessible to the query.

Managing Table Columns

After you define a table, you can use ALTER TABLE to modify existing table
columns. You can perform the following operations on a column:

• Rename it.
• Change its data type.
• Set its default value.
• Add and remove constraints.

Renaming Columns

You rename a column with ALTER TABLE as follows:

ALTER TABLE [[db-name.]schema.]table-name RENAME [ COLUMN ] column-name TO new-column-name

The following example renames a column in the Retail.Product_Dimension table from
Product_description to Item_description:

=> ALTER TABLE Retail.Product_Dimension
   RENAME COLUMN Product_description TO Item_description;

If you rename a column that is referenced by a view, the column does not appear
in the result set of the view even if the view uses the wild card (*) to
represent all columns in the table. Recreate the view to incorporate the column's
new name.

Changing Column Data Type

You can change a table column's data type for any type whose conversion does not
require storage reorganization. For external tables this is any column, because data
from external tables is not stored in Vertica. For other tables there are
additional restrictions.

Changing Column Types in Tables

You can change a column's data type if doing so does not require storage
reorganization. After you modify a column's data type, data that you load
conforms to the new definition. Vertica supports the following conversions:

• Binary types: expansion and contraction, but you cannot convert between BINARY
  and VARBINARY types.
• Character types: all conversions allowed, even between CHAR and VARCHAR.
• Exact numeric types: INTEGER, INT, BIGINT, TINYINT, INT8, SMALLINT, and all
  NUMERIC values of precision <= 18 and scale 0 are interchangeable. For NUMERIC
  data types, you cannot alter scale, but you can change the precision in the
  ranges (0-18), (19-37), and so on.

Vertica does not allow data type conversion on types that require storage
reorganization:

• Boolean type conversion to other types
• DATE/TIME type conversion
• Approximate numeric type conversions
• Between BINARY and VARBINARY types

You can expand (and shrink) columns within the same class of data type, which is
useful if you want to store longer strings in a column. Vertica validates the
data before it performs the conversion. For example, if you try to convert a
column from varchar(25) to varchar(10) and that column holds a string with 20
characters, the conversion fails. Vertica allows the conversion as long as the
column does not hold any string longer than 10 characters.

Changing Column Types in External Tables

Because data from external tables is not stored in Vertica, you can change the
type of a column to any other type. No attempt is made to validate the change at
the time it is
made. Definitions of external tables are consulted only when the data is read. If
Vertica cannot read the data as the declared type, it reports the error as usual.
If you convert a column to a size that is too small for the data, Vertica
truncates the data during the read. For example, if you convert a column from
varchar(25) to varchar(10) and that column holds a string with 20 characters,
Vertica reads the first ten characters and logs a truncation event.

Examples

The following example expands an existing CHAR column from 5 to 10:

=> CREATE TABLE t (x CHAR, y CHAR(5));
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE CHAR(10);
=> DROP TABLE t;

This example illustrates the behavior of a changed column's type. First you set
column y's type to VARCHAR(5) and then insert strings whose lengths equal 5 and
exceed 5:

=> CREATE TABLE t (x VARCHAR, y VARCHAR);
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE VARCHAR(5);
=> INSERT INTO t VALUES ('1232344455','hello');
 OUTPUT
--------
      1
(1 row)
=> INSERT INTO t VALUES ('1232344455','hello1');
ERROR 4797:  String of 6 octets is too long for type Varchar(5)
=> DROP TABLE t;

You can also contract the data type's size, as long as the altered column
contains no strings longer than 5 characters:

=> CREATE TABLE t (x CHAR, y CHAR(10));
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE CHAR(5);
=> DROP TABLE t;

You cannot convert types between binary and varbinary. For example, the table
definition below contains two binary columns, so when you try to convert column y
to a varbinary type, Vertica returns a ROLLBACK message:

=> CREATE TABLE t (x BINARY, y BINARY);
=> ALTER TABLE t ALTER COLUMN y SET DATA TYPE VARBINARY;
ROLLBACK 2377:  Cannot convert column "y" from "binary(1)" to type "varbinary(80)"
=> DROP TABLE t;

For external tables, you can change a column to any type, including converting
from binary to varbinary:
=> CREATE EXTERNAL TABLE t (a char(10), b binary(20)) AS COPY FROM '...';
=> ALTER TABLE t ALTER COLUMN a SET DATA TYPE long varchar(1000000);
=> ALTER TABLE t ALTER COLUMN b SET DATA TYPE long varbinary(1000000);

Changing the Data Type of a Column in a SEGMENTED BY Clause

If you create a table and do not create a superprojection for it, Vertica
automatically creates a superprojection when you first load data into the table.
By default, superprojections are segmented by all columns to make all data
available for queries. You cannot alter a column used in the superprojection's
segmentation clause.

To resize a segmented column, you must either create new superprojections that
omit the column from the segmentation clause, or create a new table (with the new
column size) and new projections.

Illegitimate Column Conversions

The SQL standard disallows an illegitimate column conversion, but you can work
around this restriction if you need to convert data from a non-SQL database. The
following example takes you through the process step by step, where you'll manage
your own epochs.

Given a sales table with columns id (INT) and price (VARCHAR), assume you want to
convert the VARCHAR column to a NUMERIC field. You'll do this by adding a
temporary column whose default value is derived from the existing price column,
renaming the column, and then dropping the original column.

1. Create the sample table with INTEGER and VARCHAR columns and insert two rows:

   => CREATE TABLE sales(id INT, price VARCHAR) UNSEGMENTED ALL NODES;
   => INSERT INTO sales VALUES (1, '$50.00');
   => INSERT INTO sales VALUES (2, '$100.00');

2. Commit the transaction:

   => COMMIT;

3. Query the sales table:

   => SELECT * FROM SALES;
    id |  price
   ----+---------
     1 | $50.00
     2 | $100.00
   (2 rows)

4. Add column temp_price. This is your temporary column:

   => ALTER TABLE sales ADD COLUMN temp_price NUMERIC DEFAULT SUBSTR(sales.price, 2)::NUMERIC;
   ALTER TABLE

5. Query the sales table, and you'll see the new temp_price column with its
   derived NUMERIC values:

   => SELECT * FROM SALES;
    id |  price  |     temp_price
   ----+---------+---------------------
     1 | $50.00  |  50.000000000000000
     2 | $100.00 | 100.000000000000000
   (2 rows)

6. Drop the default expression from that column:

   => ALTER TABLE sales ALTER COLUMN temp_price DROP DEFAULT;
   ALTER TABLE

7. Advance the AHM:

   => SELECT advance_epoch(1);
    advance_epoch
   ---------------
    New Epoch: 83

8. Manage epochs:

   => SELECT manage_epoch();
             manage_epoch
   --------------------------------
    Current Epoch=83, AHM Epoch=82

9. Drop the original price column:

   => ALTER TABLE sales DROP COLUMN price CASCADE;
   ALTER TABLE

10. Rename the new (temporary) temp_price column back to its original name,
    price:
   => ALTER TABLE sales RENAME COLUMN temp_price to price;

11. Query the sales table one last time:

   => SELECT * FROM SALES;
    id |        price
   ----+---------------------
     1 |  50.000000000000000
     2 | 100.000000000000000
   (2 rows)

12. Clean up (drop table sales):

   => DROP TABLE sales;

See ALTER TABLE in the SQL Reference Manual.

Setting Column Defaults

A default value can be set for a column of any data type. Vertica computes the
default value for each row. This flexibility is useful for adding a column to a
large fact table that shows another view on the data, without having to
INSERT .. SELECT a large data set. You can also alter unstructured tables to use
a derived expression, as described in Altering Unstructured Tables.

A column's default value can be any expression that resolves to the column's data
type. The expression can reference another column in the same table, or be
calculated with a user-defined function (see Types of UDxs in Extending Vertica).

Caution: Vertica attempts to check the validity of default expressions, but some
errors might not be caught until run time.

Specifying Default Expressions

Default value expressions can specify:

• Other columns of the same table
• Constants
• SQL functions
• Null-handling functions
• User-defined scalar functions
• System information functions
• String functions
• Numeric functions
• Formatting functions
• Nested functions
• All Vertica-supported operators

The default expression of an ADD COLUMN statement disallows nested queries and
aggregate functions. To use such an expression, instead modify an existing column
with the ALTER COLUMN clause.

Expression Restrictions

The following restrictions apply to column default expressions:

• Only constant arguments are allowed.
• Subqueries and cross-references to other columns in the table are not
  supported.
• A default expression cannot return NULL.
• The return value data type matches or can be cast to the column data type.
• Default expressions, when evaluated, conform to the bounds for the column.
• A default expression cannot derive data from another derived column. If one
  column has a default derived value expression, another column cannot specify a
  default that references the first column.

Volatile Functions as Column Defaults

You can specify a volatile function as a column default expression using the
ALTER TABLE clause ALTER COLUMN. For example:

ALTER TABLE t ALTER COLUMN a2 SET DEFAULT my_sequence.nextval;

You cannot use a volatile function in the following two scenarios. Attempting to
do so causes a rollback.
• As the default expression for an ALTER TABLE ADD COLUMN statement. For example:

  ALTER TABLE t ADD COLUMN a2 INT DEFAULT my_sequence.nextval;
  ROLLBACK: VOLATILE functions are not supported in a default expression

  ALTER TABLE t ADD COLUMN n2 INT DEFAULT my_sequence.currval;
  ROLLBACK: VOLATILE functions are not supported in a default expression

  ALTER TABLE t ADD COLUMN c2 INT DEFAULT RANDOM() + 1;
  ROLLBACK: VOLATILE functions are not supported in a default expression

• As the default expression for an ALTER TABLE ADD COLUMN or ALTER COLUMN
  statement on an external table. For example:

  ALTER TABLE t ADD COLUMN a2 FLOAT DEFAULT RANDOM();
  ROLLBACK 5241:  Unsupported access to external table

  ALTER TABLE t ALTER COLUMN x SET DEFAULT RANDOM();
  ROLLBACK 5241:  Unsupported access to external table

Examples

Default Column Value Derived From Another Column

1. Create a sample table called t with timestamp, integer and varchar(10)
   columns:

   => CREATE TABLE t (a TIMESTAMP, b INT, c VARCHAR(10));
   CREATE TABLE
   => INSERT INTO t VALUES ('2012-05-14 10:39:25', 2, 'MA');
    OUTPUT
   --------
         1
   (1 row)

2. Use ALTER TABLE to add a fourth column that extracts the month from the
   timestamp value in column a:

   => ALTER TABLE t ADD COLUMN d INTEGER DEFAULT EXTRACT(MONTH FROM a);
   ALTER TABLE

3. Query table t:

   => select * from t;
              a          | b | c  | d
   ----------------------+---+----+---
    2012-05-14 10:39:25  | 2 | MA | 5
   (1 row)

   Column d returns integer 5 (the month of May).

4. View the table to see the new column (d) and its default derived value:

   => \d t
                                  List of Fields by Tables
    Schema | Table | Column |    Type     | Size |         Default         | Not Null | Primary Key | Foreign Key
   --------+-------+--------+-------------+------+-------------------------+----------+-------------+-------------
    public | t     | a      | timestamp   |    8 |                         | f        | f           |
    public | t     | b      | int         |    8 |                         | f        | f           |
    public | t     | c      | varchar(10) |   10 |                         | f        | f           |
    public | t     | d      | int         |    8 | date_part('month', t.a) | f        | f           |
   (4 rows)

Default Column Value Derived From a UDSF

This example shows a user-defined scalar function that adds two integer values.
The function is called add2ints and takes two arguments.

1. Develop and deploy the function, as described in Developing User-Defined
   Extensions (UDxs).

2. Create a sample table, t1, with two integer columns:

   => CREATE TABLE t1 ( x int, y int );
   CREATE TABLE

3. Insert some values into t1:

   => insert into t1 values (1,2);
    OUTPUT
   --------
         1
   (1 row)
   => insert into t1 values (3,4);
    OUTPUT
   --------
         1
   (1 row)

4. Use ALTER TABLE to add a column to t1 with the default column value derived
   from the UDSF, add2ints:

   alter table t1 add column z int default add2ints(x,y);
   ALTER TABLE

5. List the new column:

   select z from t1;
    z
   ----
     3
     7
   (2 rows)

Altering Table Definitions

Using ALTER TABLE syntax, you can respond to your evolving database schema
requirements. The ability to change the definition of existing database objects
facilitates ongoing maintenance. Furthermore, most of these options are both fast
and efficient for large tables, because they consume fewer resources and less
storage than having to stage data in a temporary table.

ALTER TABLE lets you perform the following table-level changes (see the sketch
after this list):

• Add and drop table columns.
• Rename a table.
• Add and drop constraints.
• Alter key constraint enforcement.
• Move a table to a new schema.
• Change a table owner.
• Change and reorganize table partitions.
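As a quick sketch of several of these table-level changes (the table, schema, and
user names are hypothetical, and each statement is issued separately because many
ALTER TABLE clauses are exclusive, as described below):

=> ALTER TABLE t1 ADD COLUMN note VARCHAR(64);
=> ALTER TABLE t1 RENAME TO t1_archive;
=> ALTER TABLE t1_archive OWNER TO analyst;
=> ALTER TABLE t1_archive SET SCHEMA s2;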
ALTER TABLE has an ALTER COLUMN clause that lets you modify column definitions,
for example, to change their name or data type. For column-level changes, see
Managing Table Columns.

Exclusive ALTER TABLE Clauses

The following ALTER TABLE clauses are exclusive: you cannot combine them with
another ALTER TABLE clause:

• ADD COLUMN
• RENAME COLUMN
• SET SCHEMA
• PARTITION BY
• REORGANIZE
• REMOVE PARTITIONING
• RENAME [TO]
• OWNER TO

Note: You can use the ADD constraints and DROP constraints clauses together.

Using Consecutive ALTER TABLE Commands

With the exception of performing a table rename, complete ALTER TABLE statements
consecutively. For example, to add multiple columns to a table, issue consecutive
ALTER TABLE ADD COLUMN statements.

External Table Restrictions

Not all ALTER TABLE options pertain to external tables. For instance, you cannot
add a column to an external table, but you can rename the table:

=> ALTER TABLE mytable RENAME TO mytable2;
ALTER TABLE

Restoring to an Earlier Epoch

If you restore the database to an epoch that precedes changes to the table
definition, the restore operation reverts the table to the earlier definition.
For example, if you change a
column's data type from CHAR(8) to CHAR(16) in epoch 10, and then restore the
database from epoch 5, the column reverts to the CHAR(8) data type.

Adding Table Columns

You add a column to a table with the ALTER TABLE clause ADD COLUMN. If qualified
with the keyword CASCADE, Vertica also adds the new table column to all pre-join
projections that are created using that table. If you specify a non-constant
default column value and specify CASCADE, Vertica does not add the column to the
pre-join projections.

Restrictions

You cannot add columns to:

• Temporary tables
• Tables that have out-of-date superprojections with up-to-date buddies

Table Locking

When you use ADD COLUMN to alter a table, Vertica takes an O lock on the table
until the operation completes. The lock prevents DELETE, UPDATE, INSERT, and COPY
statements from accessing the table. The lock also blocks SELECT statements
issued at SERIALIZABLE isolation level, until the operation completes.

Each table can have a maximum of 1600 columns.

If you use CASCADE, Vertica also takes O locks on all anchor tables of any
pre-join projections associated with the target table. Consequently, SELECT
statements issued on those tables at SERIALIZABLE isolation level are blocked
until the operation completes.

Adding a column to a table does not affect K-safety of the physical schema
design. You can add columns when nodes are down.

Operations That Occur When Adding Columns

The following operations occur as part of adding columns (see the sketch after
this list):

• Inserts the default value for existing rows. For example, if the default
  expression is CURRENT_TIMESTAMP, all rows have the current timestamp.
• Automatically adds the new column with a unique projection column name to the
  superprojection of the table.
• Populates the column according to the ALTER TABLE ADD COLUMN syntax (DEFAULT,
  for example).
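For example, here is a minimal ADD COLUMN sketch (the table and column names are
hypothetical); per the list above, the DEFAULT expression populates the new
column for all existing rows:

=> ALTER TABLE customer_dim ADD COLUMN loyalty_tier VARCHAR(10) DEFAULT 'none';
ALTER TABLE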
See the ALTER TABLE syntax for all ALTER TABLE options.

Adding New Columns to Tables with CASCADE

When you qualify ALTER TABLE...ADD COLUMN with the keyword CASCADE, Vertica adds
the new column to the superprojection and to all pre-join projections that
include that table. For example, create two tables:

=> CREATE TABLE t1 (x INT PRIMARY KEY NOT NULL, y INT);
=> CREATE TABLE t2 (x INT PRIMARY KEY NOT NULL, t1_x INT REFERENCES t1(x) NOT NULL, z VARCHAR(8));

After you load data into them, Vertica creates a superprojection for each table.
The superprojections contain all the columns in their respective tables. For this
example, name them super_t1 and super_t2.

Create two pre-join projections that join tables t1 and t2, where t2 is an
anchor, or fact, table, and t1 is a dimension table:

=> CREATE PROJECTION t_pj1 AS SELECT t1.x, t1.y, t2.x, t2.t1_x, t2.z
   FROM t1 JOIN t2 ON t1.x = t2.t1_x UNSEGMENTED ALL NODES;
=> CREATE PROJECTION t_pj2 AS SELECT t1.x, t2.x
   FROM t1 JOIN t2 ON t1.x = t2.t1_x UNSEGMENTED ALL NODES;

Add a new column w1 to table t1 using the CASCADE keyword. Vertica adds the
column to:

• Superprojection super_t1
• Pre-join projection t_pj1
• Pre-join projection t_pj2

=> ALTER TABLE t1 ADD COLUMN w1 INT DEFAULT 5 NOT NULL CASCADE;

Add a new column w2 to table t1, and specify a non-constant default value.
Vertica adds the new column to the superprojection super_t1. Because the default
value is not a constant, Vertica does not add the new column to the pre-join
projections, but displays a warning instead:

=> ALTER TABLE t1 ADD COLUMN w2 INT DEFAULT (t1.y+1) NOT NULL CASCADE;
WARNING:  Column "w2" in table "t1" with non-constant default will not be added
to prejoin(s) t_pj1, t_pj2.
Updating Associated Table Views

Adding new columns to a table that has an associated view does not update the
view's result set, even if the view uses a wildcard (*) to represent all table
columns. To incorporate new columns, you must recreate the view. See CREATE VIEW
in the SQL Reference Manual.

Dropping Table Columns

When an ALTER TABLE statement includes a DROP COLUMN clause to drop a column,
Vertica drops the specified column from the table and the ROS containers that
correspond to the dropped column. The syntax looks like this:

ALTER TABLE [[db-name.]schema.]table-name DROP [ COLUMN ] column-name [ CASCADE | RESTRICT ]

After a DROP COLUMN operation completes, data backed up from the current epoch
onward recovers without the column. Data recovered from a backup that precedes
the current epoch re-adds the table column. Because drop operations physically
purge object storage and catalog definitions (table history) from the table, AT
EPOCH (historical) queries return nothing for the dropped column.

The altered table retains its object ID.

Note: Drop column operations can be fast because these catalog-level changes do
not require data reorganization, so Vertica can quickly reclaim disk storage.

Restrictions

• You cannot drop or alter a primary key column or a column that participates in
  the table partitioning clause.
• You cannot drop the first column of any projection sort order, or columns that
  participate in a projection segmentation expression.
• All nodes must be up.
• You cannot drop a column associated with an access policy. Attempts to do so
  produce the following error:

  ERROR 6482:  Failed to parse Access Policies for table "t1"

Using CASCADE to Force a Drop

If the table column to drop has dependencies, you must qualify the DROP COLUMN
clause with the CASCADE option. For example, the target column might be specified
in a
projection sort order, or in a pre-join projection. In these and other cases,
DROP COLUMN...CASCADE handles the dependency by reorganizing catalog definitions
or dropping a projection. In all cases, CASCADE performs the minimal
reorganization required to drop the column. Use CASCADE to drop a column with the
following dependencies:

 Dropped column dependency     | CASCADE behavior
-------------------------------+--------------------------------------------------------
 Any constraint                | Vertica drops the column when a FOREIGN KEY constraint
                               | depends on a UNIQUE or PRIMARY KEY constraint on the
                               | referenced columns.
 Specified in projection sort  | Vertica truncates the projection sort order up to and
 order                         | including the column that is dropped, without impact
                               | on physical storage for other columns, and then drops
                               | the specified column. For example, if a projection's
                               | columns are in sort order (a,b,c), dropping column b
                               | causes the projection's sort order to be just (a),
                               | omitting column (c).
 Specified in one of the       | In both cases, the column to drop is integral to the
 following:                    | projection definition. If possible, Vertica drops the
 • Pre-join projection         | projections, as long as doing so does not compromise
 • Projection segmentation     | K-safety; otherwise, the transaction rolls back. For
   expression                  | example, a table has multiple projections, where one
                               | projection's segmentation clause specifies the target
                               | column. Vertica tries to drop this projection, unless
                               | doing so violates K-safety. In this case, the
                               | transaction rolls back.
 Referenced as default value   | See Dropping a Column Referenced as Default, below.
 of another column             |

Dropping a Column Referenced as Default

You might want to drop a table column that is referenced by another column as its
default value. For example, the following table is defined with two columns, a
and b, where b gets its default value from column a:
=> CREATE TABLE x (a int) UNSEGMENTED ALL NODES;
CREATE TABLE
=> ALTER TABLE x ADD COLUMN b int DEFAULT a;
ALTER TABLE

In this case, dropping column a requires the following procedure:

1. Remove the default dependency through ALTER COLUMN...DROP DEFAULT:

   => ALTER TABLE x ALTER COLUMN b DROP DEFAULT;

2. Create a replacement superprojection for the target table if one or both of
   the following conditions is true:

   - The target column is the table's first sort order column. If the table has
     no explicit sort order, the default table sort order specifies the first
     table column as the first sort order column. In this case, the new
     superprojection must specify a sort order that excludes the target column.

   - If the table is segmented, the target column is specified in the
     segmentation expression. In this case, the new superprojection must specify
     a segmentation expression that excludes the target column.

   Given the previous example, table x has a default sort order of (a,b). Because
   column a is the table's first sort order column, you must create a replacement
   superprojection that is sorted on column b:

   => CREATE PROJECTION x_p1 as select * FROM x ORDER BY b UNSEGMENTED ALL NODES;

3. Run START_REFRESH:

   => SELECT START_REFRESH();
              START_REFRESH
   ----------------------------------------
    Starting refresh background process.
   (1 row)

4. Run MAKE_AHM_NOW:

   => SELECT MAKE_AHM_NOW();
          MAKE_AHM_NOW
   -------------------------------
    AHM set (New AHM Epoch: 1231)
   (1 row)

5. Drop the column:

   => ALTER TABLE x DROP COLUMN a CASCADE;

   Vertica implements the CASCADE directive as follows:

   • Drops the original superprojection for table x (x_super).
   • Updates the replacement superprojection x_p1 by dropping column a.

Examples

The following series of commands successfully drops a BYTEA data type column:

=> CREATE TABLE t (x BYTEA(65000), y BYTEA, z BYTEA(1));
CREATE TABLE
=> ALTER TABLE t DROP COLUMN y;
ALTER TABLE
=> SELECT y FROM t;
ERROR 2624:  Column "y" does not exist
=> ALTER TABLE t DROP COLUMN x RESTRICT;
ALTER TABLE
=> SELECT x FROM t;
ERROR 2624:  Column "x" does not exist
=> SELECT * FROM t;
 z
---
(0 rows)
=> DROP TABLE t CASCADE;
DROP TABLE

The following series of commands drops one FLOAT(8) column, and then fails when
it tries to drop the second, because no more columns can be dropped from the
table:

=> CREATE TABLE t (x FLOAT(8), y FLOAT(08));
CREATE TABLE
=> ALTER TABLE t DROP COLUMN y RESTRICT;
ALTER TABLE
=> SELECT y FROM t;
ERROR 2624:  Column "y" does not exist
=> ALTER TABLE t DROP x CASCADE;
ROLLBACK 2409:  Cannot drop any more columns in t
=> DROP TABLE t CASCADE;

Altering Key Constraint Enforcement

To alter how Vertica enforces constraints on a table key, use the ALTER TABLE
clause ALTER CONSTRAINT. You can optionally qualify this clause with the keywords
ENABLED
Renaming Tables

The ALTER TABLE RENAME TO statement lets you rename one or more tables. Renaming tables does not affect existing pre-join projections, because pre-join projections refer to tables by their unique numeric object IDs (OIDs). Renaming tables also does not change the table OID.

To rename two or more tables:

1. List the tables to rename in a comma-delimited list, specifying a schema name as part of the table specification only before the RENAME TO clause:

   => ALTER TABLE S1.T1, S1.T2 RENAME TO U1, U2;

   The statement renames the listed tables to their new table names from left to right, matching them sequentially, in a one-to-one correspondence.
   The RENAME TO parameter is applied atomically: either all tables are renamed, or none of them is. For example, if the number of tables to rename does not match the number of new names, none of the tables is renamed.

2. Do not specify a schema name as part of the table specification after the RENAME TO clause, because the statement applies to only one schema. The following example generates a syntax error:

   => ALTER TABLE S1.T1, S1.T2 RENAME TO S1.U1, S1.U2;

Note: Renaming a table referenced by a view causes the view to fail, unless you create another table with the previous name to replace the renamed table.

Using Rename to Swap Tables Within a Schema

You can use the ALTER TABLE RENAME TO statement to swap tables within a schema without actually moving data. You cannot swap tables across schemas.

To swap tables within a schema (the example statement is split to explain the steps; the combined statement appears after this list):

1. Enter the names of the tables to swap, followed by a new temporary table placeholder (temps):

   => ALTER TABLE T1, T2, temps

2. Use the RENAME TO clause to swap the tables: T1 to temps, T2 to T1, and temps to T2:

   RENAME TO temps, T1, T2;
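Combined, the full swap statement reads:

=> ALTER TABLE T1, T2, temps RENAME TO temps, T1, T2;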
Moving Tables to Another Schema

The ALTER TABLE clause SET SCHEMA moves a table from one schema to another. Moving a table requires that you have USAGE privileges on the current schema and CREATE privileges on the destination schema. You can move only one table between schemas at a time. You cannot move temporary tables between schemas.

SET SCHEMA can be qualified by one of the following options:

• CASCADE, the default, automatically moves all projections that are anchored on the source table to the destination schema, regardless of the schema in which the projections reside.
• RESTRICT moves only projections that are anchored on the source table and also reside in the same schema.

Name Conflicts

If a table with the same name, or any of the projections that you want to move, already exists in the new schema, the statement rolls back and moves neither the table nor any projections. To work around name conflicts:

1. Rename any conflicting table or projections that you want to move.
2. Run the ALTER TABLE SET SCHEMA statement again.

Note: Vertica lets you move system tables to system schemas. Moving system tables could be necessary to support designs created through the Database Designer.

Example

The following example moves table T1 from schema S1 to schema S2. SET SCHEMA defaults to CASCADE. Thus, all the projections that are anchored on table T1 are automatically moved to schema S2, regardless of the schema in which they reside:

=> ALTER TABLE S1.T1 SET SCHEMA S2;

Changing Table Ownership

The ability to change table ownership is useful when moving a table from one schema to another. Ownership reassignment is also useful when a table owner leaves the company or changes job responsibilities. Because you can change the table owner, the table does not have to be completely rewritten, so you avoid a loss in productivity. The syntax is:

ALTER TABLE [[db-name.]schema.]table-name OWNER TO new-owner-name

To alter table ownership, you must be either the table owner or a superuser. A change in table ownership transfers just the owner and not privileges; grants made by the original owner are dropped, and all existing privileges on the table are revoked from the previous owner. However, altering the table owner transfers ownership of dependent
sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer ownership of other referenced sequences. See ALTER SEQUENCE for details on transferring sequence ownership.

Notes

• Table privileges are separate from schema privileges; therefore, a table privilege change or table owner change does not result in any schema privilege change.
• Because projections define the physical representation of the table, Vertica does not require separate projection owners. The ability to create or drop projections is based on the privileges of the table on which the projection is anchored.
• During the alter operation, Vertica updates projections anchored on the table owned by the old owner to reflect the new owner. For pre-join projection operations, Vertica checks for privileges on the referenced table.

Example

In this example, user Bob connects to the database, looks up the tables, and transfers ownership of table t33 from himself to user Alice.

=> \c - Bob
You are now connected as user "Bob".
=> \d
 Schema |  Name  | Kind  |  Owner  | Comment
--------+--------+-------+---------+---------
 public | applog | table | dbadmin |
 public | t33    | table | Bob     |
(2 rows)
=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE

Notice that when Bob looks up database tables again, he no longer sees table t33.

=> \d
        List of tables
 Schema |  Name  | Kind  |  Owner  | Comment
--------+--------+-------+---------+---------
 public | applog | table | dbadmin |
(1 row)

When user Alice connects to the database and looks up tables, she sees that she is the owner of table t33.

=> \c - Alice
You are now connected as user "Alice".
=> \d
      List of tables
 Schema | Name | Kind  | Owner | Comment
--------+------+-------+-------+---------
 public | t33  | table | Alice |
(1 row)

Either Alice or a superuser can transfer table ownership back to Bob. In the following case, a superuser performs the transfer.

=> \c - dbadmin
You are now connected as user "dbadmin".
=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> \d
           List of tables
 Schema |   Name   | Kind  |  Owner  | Comment
--------+----------+-------+---------+---------
 public | applog   | table | dbadmin |
 public | comments | table | dbadmin |
 public | t33      | table | Bob     |
 s1     | t1       | table | User1   |
(4 rows)

You can also query the V_CATALOG.TABLES system table to view table and owner information. Note that a change in ownership does not change the table ID. In the following series of commands, the superuser changes table ownership back to Alice and queries the TABLES system table.

=> ALTER TABLE t33 OWNER TO Alice;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273724576 | Alice
(5 rows)

Now the superuser changes table ownership back to Bob and queries the TABLES table again. Nothing changes but the owner_name value for t33, from Alice to Bob.

=> ALTER TABLE t33 OWNER TO Bob;
ALTER TABLE
=> SELECT table_schema_id, table_schema, table_id, table_name, owner_id, owner_name FROM tables;
  table_schema_id  | table_schema |     table_id      | table_name |     owner_id      | owner_name
-------------------+--------------+-------------------+------------+-------------------+------------
 45035996273704968 | public       | 45035996273713634 | applog     | 45035996273704962 | dbadmin
 45035996273704968 | public       | 45035996273724496 | comments   | 45035996273704962 | dbadmin
 45035996273730528 | s1           | 45035996273730548 | t1         | 45035996273730516 | User1
 45035996273704968 | public       | 45035996273793876 | foo        | 45035996273724576 | Alice
 45035996273704968 | public       | 45035996273795846 | t33        | 45035996273714428 | Bob
(5 rows)
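Because a change in ownership drops the previous owner's grants, the new owner or a superuser typically re-grants access to other users afterward. A minimal sketch, assuming a hypothetical user u3:

=> GRANT SELECT ON TABLE t33 TO u3;
GRANT PRIVILEGE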
Table Reassignment with Sequences

Altering the table owner transfers ownership of only associated IDENTITY/AUTO-INCREMENT sequences, not other referenced sequences. For example, in the following series of commands, ownership of sequence s1 does not change:

=> CREATE USER u1;
CREATE USER
=> CREATE USER u2;
CREATE USER
=> CREATE SEQUENCE s1 MINVALUE 10 INCREMENT BY 2;
CREATE SEQUENCE
=> CREATE TABLE t1 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> CREATE TABLE t2 (a INT, id INT DEFAULT NEXTVAL('s1'));
CREATE TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)
=> ALTER TABLE t1 OWNER TO u1;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)
=> ALTER TABLE t2 OWNER TO u2;
ALTER TABLE
=> SELECT sequence_name, owner_name FROM sequences;
 sequence_name | owner_name
---------------+------------
 s1            | dbadmin
(1 row)

See Also

• Changing Sequence Ownership
Using Named Sequences

Named sequences are database objects that generate unique numbers in ascending or descending sequential order. They are most often used when an application requires a unique identifier in a table or an expression. Once a named sequence returns a value, it never returns that same value again in the same session. Named sequences are independent objects; while you can use their values in tables, they are not subordinate to the tables in which you use them.

Types of Incrementing Values

In addition to named sequences, Vertica supports two other kinds of incrementing values:

• Auto-increment column value: The most basic incrementing numeric column type. The database increments this value each time you add a row to the table. You cannot change the value of an AUTO_INCREMENT column, or its amount of cache, which is 1K.
• Identity column: A numeric column type that the database increments automatically.

Auto-increment and identity sequences are defined through column constraints in the CREATE TABLE statement and are incremented each time a row is added to the table. Both of these object types are table-dependent and do not persist independently. The identity value is never rolled back, even if a transaction that tries to insert a value is not committed. The LAST_INSERT_ID function returns the last value generated for an auto-increment or identity column.

Each type of sequence value has a set of properties. A named sequence has the most properties, and an auto-increment sequence the least. The following table lists the differences between the three sequence values:
Behavior                          Named Sequence   Identity   Auto-increment
Default cache value 250K                X             X
Default cache value 1K                                              X
Set initial cache                       X             X
Define start value                      X             X
Specify increment unit                  X             X
Exists as an independent object         X
Exists only as part of table                          X             X
Create as column constraint                           X             X
Always starts at 1                                                  X
Requires name                           X
Use in expressions                      X
Unique across tables                    X
Change parameters                       X
Move to different schema                X
Set to increment or decrement           X
Grant privileges to object              X
Specify minimum value                   X
Specify maximum value                   X

While sequence object values are guaranteed to be unique, they are not guaranteed to be contiguous, so returned values can appear to skip numbers. For example, two nodes can increment a sequence at different rates: the node with a heavier processing load increments the sequence more often, and its values are not contiguous with those being incremented on a node with less processing.
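To illustrate the differences, the following sketch defines all three kinds of incrementing values; the table and sequence names are hypothetical:

=> CREATE SEQUENCE dept_seq START 1000;          -- named sequence: independent object
=> CREATE TABLE dept (
     dept_id IDENTITY(1,1),                      -- identity column: defined as a column constraint
     dept_code INT DEFAULT NEXTVAL('dept_seq'),  -- named sequence used in an expression
     dept_name VARCHAR(40));
=> CREATE TABLE visits (
     visit_id AUTO_INCREMENT,                    -- auto-increment column: always starts at 1
     visited TIMESTAMP);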
Using a Named Sequence with an Auto_Increment or Identity Column

Each table can contain only one auto-increment or identity column. A table with either an auto-increment or identity column can also contain a named sequence. The next example illustrates this: table test2 contains a named sequence (my_seq) for the middle column and an auto-increment value for the last column:

VMart=> CREATE TABLE test2 (id INTEGER NOT NULL UNIQUE,
          middle INTEGER DEFAULT NEXTVAL('my_seq'),
          next INT,
          last AUTO_INCREMENT);
CREATE TABLE

Named Sequence Functions

When you create a named sequence object, you can also specify its increment or decrement value. The default is 1. Use these functions with named sequences:

• NEXTVAL — Advances the sequence and returns the next value. The value is incremented for ascending sequences and decremented for descending sequences. The first time you call NEXTVAL after creating a named sequence, the function sets up the default or specified amount of cache on each cluster node. From its cache store, each node returns either the default sequence value or the start number you specified with CREATE SEQUENCE.
• CURRVAL — Returns the last value that the previous invocation of NEXTVAL returned in the current session. If there were no calls to NEXTVAL after creating a sequence, CURRVAL returns an error:

dbt=> CREATE SEQUENCE seq2;
CREATE SEQUENCE
dbt=> SELECT CURRVAL('seq2');
ERROR 4700: Sequence seq2 has not been accessed in the session

You can use the NEXTVAL and CURRVAL functions in INSERT and COPY expressions.

Using DDL Commands and Functions With Named Sequences

For details, see the following related statements and functions in the SQL Reference Manual:
Use this statement...   To...
CREATE SEQUENCE         Create a named sequence object.
ALTER SEQUENCE          Alter named sequence parameters, rename a sequence within a schema, or move a named sequence between schemas.
DROP SEQUENCE           Remove a named sequence object.
GRANT SEQUENCE          Grant user privileges to a named sequence object. See also Sequence Privileges.

Creating Sequences

Create a sequence using the CREATE SEQUENCE statement. All of the parameters (besides the sequence name) are optional.

The following example creates an ascending named sequence, my_seq, starting at the value 100:

dbt=> CREATE SEQUENCE my_seq START 100;
CREATE SEQUENCE

After creating a sequence, you must call the NEXTVAL function at least once in a session to create a cache for the sequence and its initial value. Subsequently, use NEXTVAL to increment the sequence value. Use the CURRVAL function to get the current value.

The following NEXTVAL call instantiates the newly created my_seq sequence and sets its first number:

=> SELECT NEXTVAL('my_seq');
 nextval
---------
     100
(1 row)

If you call CURRVAL before NEXTVAL, the system returns an error:
dbt=> SELECT CURRVAL('my_seq');
ERROR 4700: Sequence my_seq has not been accessed in the session

The following command returns the current value of this sequence. Because no other operations have been performed on the newly created sequence, the function returns the expected value of 100:

=> SELECT CURRVAL('my_seq');
 currval
---------
     100
(1 row)

The following command increments the sequence value:

=> SELECT NEXTVAL('my_seq');
 nextval
---------
     101
(1 row)

Calling CURRVAL again returns only the current value:

=> SELECT CURRVAL('my_seq');
 currval
---------
     101
(1 row)

The following example shows how to use the my_seq sequence in an INSERT statement.

=> CREATE TABLE customer (
     lname VARCHAR(25),
     fname VARCHAR(25),
     membership_card INTEGER,
     id INTEGER
);
=> INSERT INTO customer VALUES ('Hawkins', 'John', 072753, NEXTVAL('my_seq'));

Now query the table you just created to confirm that the id column has been incremented to 102:

=> SELECT * FROM customer;
  lname  | fname | membership_card | id
---------+-------+-----------------+-----
 Hawkins | John  |           72753 | 102
(1 row)
The following example shows how to use a sequence as the default value for an INSERT command:

=> CREATE TABLE customer2 (
     id INTEGER DEFAULT NEXTVAL('my_seq'),
     lname VARCHAR(25),
     fname VARCHAR(25),
     membership_card INTEGER
);
=> INSERT INTO customer2 VALUES (DEFAULT, 'Carr', 'Mary', 87432);

Now query the table you just created. The id column has been incremented again, to 103:

=> SELECT * FROM customer2;
 id  | lname | fname | membership_card
-----+-------+-------+-----------------
 103 | Carr  | Mary  |           87432
(1 row)

Distributing Named Sequences

When you create a named sequence, the CACHE parameter determines the number of sequence values each node maintains during a session. The default cache value is 250K, so each node reserves 250,000 values per session for each sequence. The default cache size provides an efficient means for large insert or copy operations. Specifying a smaller number of cache values can impact the performance of large loads, because Vertica must create a new set of cache values whenever more are required. Obtaining a new set of sequence values requires Vertica to take a catalog lock. Such locks can adversely affect database performance, because some activities, such as data inserts, cannot occur until Vertica releases the lock.
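For example, the following minimal sketch creates a sequence with a smaller per-node cache; the sequence name and cache size are hypothetical:

=> CREATE SEQUENCE batch_seq CACHE 5000;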
Effects of Distributed Sessions

Vertica distributes a session across all nodes. After you create a named sequence, the first time any cluster node executes a NEXTVAL() statement within a query, the node reserves its own cache of sequence values. The node then maintains that set of values for the current session. Other nodes executing a NEXTVAL() statement create and maintain their own caches of sequence values.

During a session, nodes can increment sequence values from NEXTVAL() statements at different rates. As a result, the sequence values from a NEXTVAL statement on one node are not sequential with sequence values from another node. Each sequence value is guaranteed to be unique, but can be out of order with a NEXTVAL statement executed on another node.

Regardless of the number of calls to NEXTVAL and CURRVAL, Vertica increments a sequence only once per row. If multiple calls to NEXTVAL occur in the same row, the statement returns the same value. If sequences are used in join statements, Vertica increments a sequence once for each composite row output by the join.

Calculating Named Sequences

Vertica calculates the current value of a sequence as follows:

• At the end of every statement, the state of all sequences used in the session is returned to the initiator node.
• The initiator node calculates the maximum CURRVAL of each sequence across all states on all nodes.
• This maximum value is used as CURRVAL in subsequent statements until another NEXTVAL is invoked.

Losing Sequence Values

Sequence values in cache can be lost in the following situations:

• If a statement fails after NEXTVAL is called (thereby consuming a sequence value from the cache), the value is lost.
• If a disconnect occurs (for example, a dropped session), any remaining values in cache that have not been returned through NEXTVAL are lost.

To recover lost sequence values, you can run an ALTER SEQUENCE command to define a new sequence number generator, which resets the counter to the correct value in the next session.
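For example, a minimal sketch that resets a hypothetical sequence counter; the restart value shown is arbitrary:

=> ALTER SEQUENCE my_seq RESTART WITH 200;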
Note: When Elastic Cluster is enabled, creating a projection segmented with modular hash uses hash segmentation instead.

How Sequences Behave Across Nodes

This section presents sequence behavior across Vertica nodes. The following example illustrates sequence distribution on a 3-node cluster, with node01 as the initiator node.

1. Create a table called dist:

   CREATE TABLE dist (i INT, j VARCHAR);

2. Create a projection called oneNode and segment it by column i on node01:

   CREATE PROJECTION oneNode AS SELECT * FROM dist SEGMENTED BY i NODES node01;

3. Create a second projection called twoNodes and segment column i by hash on node02 and node03:

   CREATE PROJECTION twoNodes AS SELECT * FROM dist SEGMENTED BY HASH(i) NODES node02, node03;

4. Create a third projection called threeNodes and segment column i by hash on all nodes (1-3):

   CREATE PROJECTION threeNodes AS SELECT * FROM dist SEGMENTED BY HASH(i) ALL NODES;

5. Insert some values:

   COPY dist FROM STDIN;
   1|ONE
   2|TWO
   3|THREE
   4|FOUR
   5|FIVE
   6|SIX
   \.

6. Query the STORAGE_CONTAINERS table to check the projections on each node:

   SELECT node_name, projection_name, total_row_count FROM storage_containers;
    node_name | projection_name | total_row_count
   -----------+-----------------+-----------------
    node0001  | oneNode         | 6   --Contains rows with i=(1,2,3,4,5,6)
    node0001  | threeNodes      | 2   --Contains rows with i=(3,6)
    node0002  | twoNodes        | 3   --Contains rows with i=(2,4,6)
    node0002  | threeNodes      | 2   --Contains rows with i=(1,4)
    node0003  | twoNodes        | 3   --Contains rows with i=(1,3,5)
    node0003  | threeNodes      | 2   --Contains rows with i=(2,5)
   (6 rows)

7. Query the segmentation of rows for projection oneNode:

   1 ONE Node01
   2 TWO Node01
   3 THREE Node01
   4 FOUR Node01
   5 FIVE Node01
   6 SIX Node01

8. Query the segmentation of rows for projection twoNodes:
   1 ONE Node03
   2 TWO Node02
   3 THREE Node03
   4 FOUR Node02
   5 FIVE Node03
   6 SIX Node02

9. Query the segmentation of rows for projection threeNodes:

   1 ONE Node02
   2 TWO Node03
   3 THREE Node01
   4 FOUR Node02
   5 FIVE Node03
   6 SIX Node01

Create a sequence and specify a cache of 10. The sequence caches 10 values in memory. The minimum cache size for the CREATE SEQUENCE statement is 1, indicating that only one value is generated at a time and no cache is in use.

Example 2: Create a sequence named s1 and specify a cache of 10:

CREATE SEQUENCE s1 CACHE 10;
SELECT s1.nextval, s1.currval, s1.nextval, s1.currval, j FROM oneNode;
 nextval | currval | nextval | currval |   j
---------+---------+---------+---------+-------
       1 |       1 |       1 |       1 | ONE
       2 |       2 |       2 |       2 | TWO
       3 |       3 |       3 |       3 | THREE
       4 |       4 |       4 |       4 | FOUR
       5 |       5 |       5 |       5 | FIVE
       6 |       6 |       6 |       6 | SIX
(6 rows)

The following table illustrates the current state of the sequence for that session. It holds the current value, the values remaining (the difference between the current value (6) and the cache (10)), and the cache activity. There is no cache activity on node02 or node03.

Sequence Cache State   Node01   Node02     Node03
Current value          6        NO CACHE   NO CACHE
Remainder              4        NO CACHE   NO CACHE

Example 3: Return the current values from twoNodes:

SELECT s1.currval, j FROM twoNodes;
 currval |   j
---------+-------
       6 | ONE
       6 | THREE
       6 | FIVE
       6 | TWO
       6 | FOUR
       6 | SIX
(6 rows)

Example 4: Now call NEXTVAL from threeNodes. The assumption is that node02 holds the cache before node03:

SELECT s1.nextval, j FROM threeNodes;
 nextval |   j
---------+-------
     101 | ONE
     201 | TWO
       7 | THREE
     102 | FOUR
     202 | FIVE
       8 | SIX
(6 rows)

The following table illustrates the sequence cache state, with values on node01, node02, and node03:

Sequence Cache State   Node01   Node02   Node03
Current value          8        102      202
Remaining              2        8        8

Example 5: Return the current values from twoNodes:

SELECT s1.currval, j FROM twoNodes;
 currval |   j
---------+-------
     202 | ONE
     202 | TWO
     202 | THREE
     202 | FOUR
     202 | FIVE
     202 | SIX
(6 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          6        102      202
Remaining              4        8        8
Example 6: The following command runs on node02 only:

SELECT s1.nextval, j FROM twoNodes WHERE i = 2;
 nextval |  j
---------+-----
     103 | TWO
(1 row)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          6        103      202
Remaining              4        7        8

Example 7: The following command gets the current value from twoNodes:

SELECT s1.currval, j FROM twoNodes;
 currval |   j
---------+-------
     103 | ONE
     103 | TWO
     103 | THREE
     103 | FOUR
     103 | FIVE
     103 | SIX
(6 rows)

Example 8: This example assumes that node02 holds the cache before node03:

SELECT s1.nextval, j FROM twoNodes;
 nextval |   j
---------+-------
     203 | ONE
     104 | TWO
     204 | THREE
     105 | FOUR
     205 | FIVE
     106 | SIX
(6 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          6        106      205
Remaining              4        6        5

Example 9: The following command gets the current value from twoNodes:
SELECT s1.currval, j FROM twoNodes;
 currval |   j
---------+-------
     205 | ONE
     205 | TWO
     205 | THREE
     205 | FOUR
     205 | FIVE
     205 | SIX
(6 rows)

Example 10: This example calls the NEXTVAL function on oneNode:

SELECT s1.nextval, j FROM oneNode;
 nextval |   j
---------+-------
       7 | ONE
       8 | TWO
       9 | THREE
      10 | FOUR
     301 | FIVE
     302 | SIX
(6 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          302      106      205
Remaining              8        4        5

Example 11: In this example, twoNodes is the outer table and threeNodes is the inner table of a merge join. threeNodes is resegmented according to twoNodes.

SELECT s1.nextval, j FROM twoNodes JOIN threeNodes ON twoNodes.i = threeNodes.i;
 nextval |   j
---------+-------
     206 | ONE
     107 | TWO
     207 | THREE
     108 | FOUR
     208 | FIVE
     109 | SIX
(6 rows)

The following table illustrates the sequence cache state:
Sequence Cache State   Node01   Node02   Node03
Current value          302      109      208
Remaining              8        1        2

Example 12: This next example shows how sequences work with buddy projections.

--Same session
DROP TABLE t CASCADE;
CREATE TABLE t (i INT, j VARCHAR(20));
CREATE PROJECTION threeNodes AS SELECT * FROM t SEGMENTED BY HASH(i) ALL NODES KSAFE 1;
COPY t FROM STDIN;
1|ONE
2|TWO
3|THREE
4|FOUR
5|FIVE
6|SIX
\.
SELECT node_name, projection_name, total_row_count FROM storage_containers;
 node_name | projection_name | total_row_count
-----------+-----------------+-----------------
 node01    | threeNodes_b0   | 2
 node03    | threeNodes_b0   | 2
 node02    | threeNodes_b0   | 2
 node02    | threeNodes_b1   | 2
 node01    | threeNodes_b1   | 2
 node03    | threeNodes_b1   | 2
(6 rows)

The following function call assumes that node02 is down. It is the same session. Node03 takes up the work of node02:

SELECT s1.nextval, j FROM t;
 nextval |   j
---------+-------
     401 | ONE
     402 | TWO
     305 | THREE
     403 | FOUR
     404 | FIVE
     306 | SIX
(6 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          306      110      404
Remaining              4        0        6
Example 13: This example starts a new session.

DROP TABLE t CASCADE;
CREATE TABLE t (i INT, j VARCHAR);
CREATE PROJECTION oneNode AS SELECT * FROM t SEGMENTED BY i NODES node01;
CREATE PROJECTION twoNodes AS SELECT * FROM t SEGMENTED BY HASH(i) NODES node02, node03;
CREATE PROJECTION threeNodes AS SELECT * FROM t SEGMENTED BY HASH(i) ALL NODES;
INSERT INTO t VALUES (NEXTVAL('s1'), 'ONE');
SELECT * FROM t;
  i  |  j
-----+-------
 501 | ONE
(1 row)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02     Node03
Current value          501      NO CACHE   NO CACHE
Remaining              9        0          0

Example 14:

INSERT INTO t SELECT s1.nextval, 'TWO' FROM twoNodes;
SELECT * FROM t;
  i  |  j
-----+-------
 501 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 601 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
(2 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          501      601      NO CACHE
Remaining              9        9        0

Example 15:

INSERT INTO t SELECT s1.nextval, 'TRE' FROM threeNodes;
SELECT * FROM t;
  i  |  j
-----+-------
 501 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 601 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
 502 | TRE   --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
 602 | TRE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
(4 rows)

The following table illustrates the sequence cache state:
Sequence Cache State   Node01   Node02   Node03
Current value          502      602      NO CACHE
Remaining              9        9        0

Example 16:

INSERT INTO t SELECT s1.currval, j FROM threeNodes WHERE i != 502;
SELECT * FROM t;
  i  |  j
-----+-------
 501 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 601 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
 502 | TRE   --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
 602 | TRE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 602 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 502 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
 602 | TRE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
(7 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          502      602      NO CACHE
Remaining              9        9        0

Example 17:

INSERT INTO t VALUES (s1.currval + 1, 'QUA');
SELECT * FROM t;
  i  |  j
-----+-------
 501 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 601 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node01 for threeNodes
 502 | TRE   --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
 602 | TRE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 602 | ONE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 502 | TWO   --stored in node01 for oneNode, node03 for twoNodes, node03 for threeNodes
 602 | TRE   --stored in node01 for oneNode, node02 for twoNodes, node02 for threeNodes
 603 | QUA
(8 rows)

The following table illustrates the sequence cache state:

Sequence Cache State   Node01   Node02   Node03
Current value          502      602      NO CACHE
Remaining              9        9        0
See Also

• Sequence Privileges
• ALTER SEQUENCE
• CREATE TABLE
• Column-Constraint
• CURRVAL
• DROP SEQUENCE
• GRANT (Sequence)
• NEXTVAL

Loading Sequences

You can use a sequence as part of creating a table. The sequence must already exist and must have been instantiated using the NEXTVAL function.

Creating and Instantiating a Sequence

The following example creates an ascending named sequence, my_seq, starting at the value 100:

dbt=> CREATE SEQUENCE my_seq START 100;
CREATE SEQUENCE

After creating a sequence, you must call the NEXTVAL function at least once in a session to create a cache for the sequence and its initial value. Subsequently, use NEXTVAL to increment the sequence value. Use the CURRVAL function to get the current value.

The following NEXTVAL call instantiates the newly created my_seq sequence and sets its first number:

=> SELECT NEXTVAL('my_seq');
 nextval
---------
     100
(1 row)

If you call CURRVAL before NEXTVAL, the system returns an error:
dbt=> SELECT CURRVAL('my_seq');
ERROR 4700: Sequence my_seq has not been accessed in the session

Using a Sequence in an INSERT Command

Update sequence number values by calling the NEXTVAL function, which increments or decrements the current sequence and returns the next value. Use CURRVAL to return the current value. These functions can also be used in INSERT and COPY expressions.

The following example shows how to use a sequence as the default value for an INSERT command:

CREATE TABLE customer2 (
  id INTEGER DEFAULT NEXTVAL('my_seq'),
  lname VARCHAR(25),
  fname VARCHAR(25),
  membership_card INTEGER
);
INSERT INTO customer2 VALUES (DEFAULT, 'Carr', 'Mary', 87432);

Now query the table you just created. The id column has been incremented by 1 again, to 104:

SELECT * FROM customer2;
 id  | lname | fname | membership_card
-----+-------+-------+-----------------
 104 | Carr  | Mary  |           87432
(1 row)

Altering Sequences

The ALTER SEQUENCE statement lets you change the attributes of a previously defined named sequence. Changes take effect in the next database session. Any parameters not specifically set in the ALTER SEQUENCE command retain their previous settings.

The ALTER SEQUENCE statement also lets you rename an existing sequence or change the schema of a sequence, but you cannot combine either of these changes with any other optional parameters.

Note: Using ALTER SEQUENCE to set a START value below the CURRVAL can result in duplicate keys.

Examples

The following example modifies an ascending sequence called my_seq to start at 105:

=> ALTER SEQUENCE my_seq RESTART WITH 105;
The following example moves a sequence from one schema to another:

=> ALTER SEQUENCE [public.]my_seq SET SCHEMA vmart;

The following example renames a sequence in the vmart schema:

=> ALTER SEQUENCE [vmart.]my_seq RENAME TO serial;

Remember that changes occur only after you start a new database session. For example, if you create a named sequence my_sequence that starts at value 10, each call to NEXTVAL() increments the value by 1, as in the following series of commands:

=> CREATE SEQUENCE my_sequence START 10;
=> SELECT NEXTVAL('my_sequence');
 nextval
---------
      10
(1 row)
=> SELECT NEXTVAL('my_sequence');
 nextval
---------
      11
(1 row)

Next, issue the ALTER SEQUENCE statement to assign a new start value of 50:

=> ALTER SEQUENCE my_sequence RESTART WITH 50;

When you call the NEXTVAL function, the sequence increments by 1 again:

=> SELECT NEXTVAL('my_sequence');
 NEXTVAL
---------
      12
(1 row)

The sequence starts at 50 only after you restart the database session:

=> SELECT NEXTVAL('my_sequence');
 NEXTVAL
---------
      50
(1 row)

Changing Sequence Ownership

The ALTER SEQUENCE command lets you change the attributes of an existing sequence. All changes take effect immediately, within the same session. Any parameters not set during an ALTER SEQUENCE statement retain their prior settings.
If you need to change sequence ownership, such as when an employee who owns a sequence leaves the company, you can do so with the following ALTER SEQUENCE syntax:

=> ALTER SEQUENCE sequence-name OWNER TO new-owner-name;

This operation immediately reassigns the sequence from the current owner to the specified new owner. Only the sequence owner or a superuser can change ownership, and reassignment does not transfer grants from the original owner to the new owner; grants made by the original owner are dropped.

Note: Changing a table owner transfers ownership of dependent sequence objects (associated IDENTITY/AUTO-INCREMENT sequences) but does not transfer ownership of other referenced sequences. See Changing Table Ownership.

Example

The following example reassigns sequence ownership from the current owner to user Bob:

=> ALTER SEQUENCE sequential OWNER TO Bob;

See ALTER SEQUENCE in the SQL Reference Manual for details.

Dropping Sequences

Use the DROP SEQUENCE statement to remove a sequence. You cannot drop a sequence:

• If other objects depend on the sequence. The CASCADE keyword is not supported.
• That is used in the default expression of a column, until all references to the sequence are removed from the default expression.

Example

The following command drops the sequence named my_sequence:

=> DROP SEQUENCE my_sequence;

Synchronizing Table Data with MERGE

The MERGE statement combines INSERT and UPDATE operations as a single operation. During a merge, Vertica updates and inserts rows into one table from rows in another.
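The general shape of the statement is sketched below; the table, alias, and column names are hypothetical placeholders:

=> MERGE INTO target_table t USING source_table s ON t.key_col = s.key_col
   WHEN MATCHED THEN UPDATE SET val_col = s.val_col
   WHEN NOT MATCHED THEN INSERT (key_col, val_col) VALUES (s.key_col, s.val_col);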
Required Arguments

MERGE statements require the following arguments:

• The target table to update.
• The source table that contains the new and/or changed rows to merge into the target table.
• A join (ON) clause that matches source and target table rows for update and insert operations.

Filter Clauses

MERGE statements can optionally specify one or both of the following filters:

• WHEN MATCHED THEN UPDATE SET: Vertica updates and/or deletes existing rows in the target table with data from the source table. Only one WHEN MATCHED THEN UPDATE SET clause is permitted per MERGE statement.
• WHEN NOT MATCHED THEN INSERT: Vertica inserts into the target table all rows from the source table that do not match any rows in the target table. Only one WHEN NOT MATCHED THEN INSERT clause is permitted per MERGE statement.

A MERGE statement must include at least one of these clauses to be eligible for an optimized query plan. For details, see Optimized Versus Non-Optimized MERGE. For an example that shows how to use these filters, see MERGE Example.

WHEN MATCHED THEN UPDATE SET

Note: Vertica assumes the values in the merge join column are unique. If more than one matching value is present in either the target or source table's join column, the MERGE statement can return a run-time error. See Optimized Versus Non-Optimized MERGE for more information.

For all joined rows, Vertica updates rows in the target table with data from the source table. When preparing an optimized query plan for a MERGE statement, Vertica enforces strict requirements for unique and primary key constraints in the join key (ON clause). If you do not enforce such constraints, MERGE fails when it finds:
• More than one matching value in the target join column for a corresponding value in the source table, when the target join column has a unique or primary key constraint. If the target join column has no such constraint, the statement runs without error, but it also runs without optimization.
• More than one matching value in the source join column for a corresponding value in the target table. The source join column does not require a unique or primary key constraint.

WHEN NOT MATCHED THEN INSERT

The WHEN NOT MATCHED THEN INSERT clause specifies to insert into the target table all rows from the source table that do not match target table rows. The columns you specify in the WHEN NOT MATCHED THEN INSERT clause must be columns from the target table. The VALUES clause specifies a list of values to store in the corresponding columns. If you do not supply a column value, do not list that column in the WHEN NOT MATCHED clause.

For the following examples, the source and target tables are defined as follows:

CREATE TABLE test1 (c1 int, c2 int, c3 int);
CREATE TABLE test2 (val_c1 int, val_c2 int, val_c3 int);

The following WHEN NOT MATCHED clause excludes column c3 from the WHEN NOT MATCHED and VALUES clauses:

MERGE INTO test1 USING test2 ON test1.c1=test2.val_c1
WHEN NOT MATCHED THEN INSERT (c1, c2) VALUES (test2.val_c1, test2.val_c2);

Vertica inserts a null value into test1.c3.

You cannot qualify the column names with a table name or alias; for example, the following is not allowed:

WHEN NOT MATCHED THEN INSERT source.x

If column names are not listed, MERGE behaves like INSERT..SELECT by assuming that the columns are in the exact same table definition order.

Optimized Versus Non-Optimized MERGE

By default, Vertica prepares an optimized query plan to improve merge performance when the MERGE statement and its tables meet certain criteria. If the criteria are not met,
MERGE could run without optimization or return a run-time error. This section describes scenarios for both optimized and non-optimized MERGE.

Conditions for an Optimized MERGE

Vertica prepares an optimized query plan when all of the following conditions are true:

• The target table's join column has a unique or primary key constraint.
• The UPDATE and INSERT clauses include every column in the target table.
• The UPDATE and INSERT clause column attributes are identical.

Note: The source table's join column does not require a unique or primary key constraint to be eligible for an optimized query plan. Also, the source table can contain more columns than the target table, as long as the UPDATE and INSERT clauses use the same columns and the column attributes are the same.

How to determine if a MERGE statement is eligible for optimization

To determine whether a MERGE statement is eligible for optimization, prefix MERGE with the EXPLAIN keyword and examine the plan's textual output. (See MERGE Path for examples.) A Semi path indicates that the statement is eligible for optimization, whereas a Right Outer path indicates that the statement is ineligible and will run with the same performance as MERGE in previous releases, unless a duplicate merge join key is encountered at query run time.
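For example, with hypothetical tables target and source joined on column a, you can prefix the statement as follows and inspect the resulting plan for a Semi or Right Outer path:

=> EXPLAIN MERGE INTO target t USING source s ON t.a = s.a
   WHEN MATCHED THEN UPDATE SET b = s.b, c = s.c
   WHEN NOT MATCHED THEN INSERT (a, b, c) VALUES (s.a, s.b, s.c);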
About duplicate matching values in the join column

Even if the MERGE statement and its tables meet the required criteria for optimization, MERGE could fail with a run-time error if there are duplicate values in the join column. When Vertica prepares an optimized query plan for a merge operation, it enforces strict requirements for unique and primary key constraints in the MERGE statement's join columns. If you haven't enforced constraints, MERGE fails under the following scenarios:

• Duplicates in the source table. If Vertica finds more than one matching value in the source join column for a corresponding value in the target table, MERGE fails with a run-time error.
• Duplicates in the target table. If Vertica finds more than one matching value in the target join column for a corresponding value in the source table, and the target join column has a unique or primary key constraint, MERGE fails with a run-time error. If the target join column has no such constraint, the statement runs without error and without optimization.

Be aware that if you run MERGE multiple times using the same target and source table, each statement run has the potential to introduce duplicate values into the join columns, such as when you use constants in the UPDATE/INSERT clauses. These duplicates could cause a run-time error the next time you run MERGE. To avoid duplicate key errors, be sure to enforce the constraints you declare to assure unique values in the merge join column.

Examples

The examples that follow use a simple schema to illustrate some of the conditions under which Vertica prepares or does not prepare an optimized query plan for MERGE:

CREATE TABLE target(a INT PRIMARY KEY, b INT, c INT) ORDER BY b,a;
CREATE TABLE source(a INT, b INT, c INT) ORDER BY b,a;
INSERT INTO target VALUES(1,2,3);
INSERT INTO target VALUES(2,4,7);
INSERT INTO source VALUES(3,4,5);
INSERT INTO source VALUES(4,6,9);
COMMIT;

Example of an optimized MERGE statement

Vertica can prepare an optimized query plan for the following MERGE statement because:

• The target table's join column (ON t.a=s.a) has a primary key constraint.
• All columns in the target table (a,b,c) are included in the UPDATE and INSERT clauses.
• The column attributes specified in the UPDATE and INSERT clauses are identical.

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a, b=s.b, c=s.c
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);
 OUTPUT
--------
      2
(1 row)

The output value of 2 indicates success and denotes the number of rows updated/inserted from the source into the target.

Example of a non-optimized MERGE statement
In the next example, the MERGE statement runs without optimization because the column attributes in the UPDATE/INSERT clauses are not identical. Specifically, the UPDATE clause includes constants for columns s.a and s.c and the INSERT clause does not:

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a, s.b, s.c);

To make the previous MERGE statement eligible for optimization, rewrite the statement as follows, so that the attributes in the UPDATE and INSERT clauses are identical:

MERGE INTO target t USING source s ON t.a = s.a
WHEN MATCHED THEN UPDATE SET a=s.a + 1, b=s.b, c=s.c - 1
WHEN NOT MATCHED THEN INSERT(a,b,c) VALUES(s.a + 1, s.b, s.c - 1);

MERGE Restrictions

This section documents several restrictions that pertain to updating and inserting table data with MERGE.

Duplicate Values in the Merge Join Key

Vertica assumes that the data to merge conforms with the constraints you declare. To avoid duplicate key errors, be sure to enforce declared constraints to assure unique values in the merge join column. If the MERGE statement fails with a duplicate key error, you must correct your data.

Also, be aware that if you run MERGE multiple times with the same target and source tables, you might introduce duplicate values into the join columns, such as when you use constants in the UPDATE/INSERT clauses. These duplicates can cause a run-time error.

Tables with Sequences

If the tables to merge include sequences, these must be omitted from the MERGE statement. This restriction also applies to implied references to a sequence. For example, if a column uses a sequence as its default value, that column cannot be included in the MERGE statement.

The following example tries to merge table customer1 into customer2, where column id in customer2 gets its default value from sequence my_seq:

=> CREATE SEQUENCE my_seq START 100;
CREATE SEQUENCE
=> CREATE TABLE customer1 (
     lname VARCHAR(25),
     fname VARCHAR(25),
     membership_card INTEGER,
     id INTEGER);
WARNING 6978: Table "customer1" will include privileges from schema "public"
CREATE TABLE
=> INSERT INTO customer1 VALUES ('Hawkins', 'John', 072753, NEXTVAL('my_seq'));
 OUTPUT
--------
      1
(1 row)
=> CREATE TABLE customer2 (
     id INTEGER DEFAULT NEXTVAL('my_seq'),
     lname VARCHAR(25),
     fname VARCHAR(25),
     membership_card INTEGER);
WARNING 6978: Table "customer2" will include privileges from schema "public"
CREATE TABLE
=> INSERT INTO customer2 VALUES (DEFAULT, 'Carr', 'Mary', 87432);
 OUTPUT
--------
      1
(1 row)

When you try to merge data from customer1 into customer2, Vertica returns an error:

=> MERGE INTO customer2 c2 USING customer1 c1
   ON c2.fname=c1.fname AND c2.lname=c1.lname
   WHEN NOT MATCHED THEN INSERT (lname, fname, membership_card)
   VALUES (c1.lname, c1.fname, c1.membership_card);
ERROR 4711: Sequence or IDENTITY/AUTO_INCREMENT column in merge query is not supported

Other Restrictions

• You cannot use MERGE with unstructured tables.
• You cannot use the LIMIT clause on unsorted data when you update or merge tables that participate in pre-join projections.
• If any PRIMARY KEY or UNIQUE constraints are enabled for automatic enforcement, Vertica enforces those constraints when you insert values into a table. If a violation occurs, Vertica rolls back the SQL statement and returns an error identifying the constraint that was violated.
• You cannot run MERGE on identity/auto-increment columns or on columns that have primary key or foreign key referential integrity constraints, as defined in the CREATE TABLE Column-Constraint syntax.
MERGE Example

In this example, the merge operation involves two tables:

• weekly_traffic logs restaurant traffic in real time and is updated with each customer visit. Data in this table is refreshed once a week.
• traffic_history stores the history of customer visits to various restaurants, accumulated over an indefinite time span.

Once a week, you merge the weekly visit count from weekly_traffic into traffic_history. The merge comprises two operations:

• Update existing customer records.
• Insert new records for first-time customers.

One MERGE statement executes both operations as a single (upsert) transaction.

Source and Target Tables

The source and target tables weekly_traffic and traffic_history define the same columns:

=> CREATE TABLE traffic_history (
     customer_id INTEGER,
     location_x FLOAT,
     location_y FLOAT,
     location_count INTEGER,
     location_name VARCHAR(20));
=> CREATE TABLE weekly_traffic (
     customer_id INTEGER,
     location_x FLOAT,
     location_y FLOAT,
     location_count INTEGER,
     location_name VARCHAR(20));

The table traffic_history already contains three records of two customers who visited the restaurants Etoile and LaRosa:

=> SELECT * FROM traffic_history;
 customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
        1001 |       10.1 |        2.7 |              1 | Etoile
        1001 |        4.1 |        7.7 |              1 | LaRosa
        1002 |        4.1 |        7.7 |              1 | LaRosa
(3 rows)
Source Table Updates

The following procedure inserts three new records into the source table weekly_traffic:

1. Customer 1001 visited Etoile a second time:

   => INSERT INTO weekly_traffic VALUES (1001, 10.1, 2.7, 1, 'Etoile');

2. Customer 1002 visited a new location, Lux Cafe:

   => INSERT INTO weekly_traffic VALUES (1002, 5.1, 7.9, 1, 'Lux Cafe');

3. A new customer (1003) visited LaRosa:

   => INSERT INTO weekly_traffic VALUES (1003, 4.1, 7.7, 1, 'LaRosa');

After committing the transaction, you can view the updated contents of weekly_traffic:

=> COMMIT;
=> SELECT * FROM weekly_traffic;
 customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
        1001 |       10.1 |        2.7 |              1 | Etoile
        1002 |        5.1 |        7.9 |              1 | Lux Cafe
        1003 |        4.1 |        7.7 |              1 | LaRosa
(3 rows)

Table Data Merge

The following MERGE statement merges weekly_traffic data into traffic_history:

• For matching customers, MERGE updates the occurrence count.
• For non-matching customers, MERGE inserts new records.

=> MERGE INTO traffic_history l USING weekly_traffic n
   ON (l.customer_id=n.customer_id AND l.location_x=n.location_x AND l.location_y=n.location_y)
   WHEN MATCHED THEN UPDATE SET location_count = l.location_count + n.location_count
   WHEN NOT MATCHED THEN INSERT (customer_id, location_x, location_y, location_count, location_name)
   VALUES (n.customer_id, n.location_x, n.location_y, n.location_count, n.location_name);
 OUTPUT
--------
      3
(1 row)
=> COMMIT;

MERGE returns the number of rows updated and inserted. In this case, the returned value specifies three updates and inserts:

• Customer 1001's second visit to Etoile
• Customer 1002's first visit to the new restaurant Lux Cafe
• New customer 1003's visit to LaRosa

If you query the target table traffic_history, you can see the merged (updated and inserted) results:

=> SELECT * FROM traffic_history ORDER BY customer_id;
 customer_id | location_x | location_y | location_count | location_name
-------------+------------+------------+----------------+---------------
        1001 |        4.1 |        7.7 |              1 | LaRosa
        1001 |       10.1 |        2.7 |              2 | Etoile
        1002 |        4.1 |        7.7 |              1 | LaRosa
        1002 |        5.1 |        7.9 |              1 | Lux Cafe
        1003 |        4.1 |        7.7 |              1 | LaRosa
(5 rows)

Dropping Tables

Dropping a table removes its definition from the Vertica database. For the syntax details of this statement, see DROP TABLE in the SQL Reference Manual.

To drop a table, use the following statement:

=> DROP TABLE IF EXISTS mytable;
DROP TABLE
=> DROP TABLE IF EXISTS mytable; -- Table doesn't exist
NOTICE: Nothing was dropped
DROP TABLE

If the table you specify has dependent objects, such as projections, you cannot drop the table without the CASCADE keyword. If you use the CASCADE keyword, Vertica drops the table and all of its dependent objects. These objects can include superprojections, live aggregate projections, and projections with expressions.

You cannot use the CASCADE option to drop an external table. Because an external table is read-only, you cannot remove any of its associated files.
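For example, assuming mytable exists and has dependent projections, the following minimal sketch drops the table together with them:

=> DROP TABLE mytable CASCADE;
DROP TABLE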
Truncating Tables

TRUNCATE TABLE removes all storage associated with the target table and its projections. Vertica preserves the table and the projection definitions. If the truncated table has out-of-date projections, those projections are cleared and marked up-to-date when TRUNCATE TABLE returns.

TRUNCATE TABLE takes an O (owner) lock on the table until the truncation process completes. The savepoint is then released. TRUNCATE TABLE commits the entire transaction after statement execution, even if truncating the table fails. You cannot roll back a TRUNCATE TABLE statement.

Use TRUNCATE TABLE for testing purposes. With this statement, you can remove all table data without having to recreate projections when you reload table data. In some cases, you might truncate a large single (fact) table that contains pre-join projections. When you do so, the projections show zero (0) rows after the transaction completes, and the table is ready for data reload.

Restrictions

• You cannot truncate an external table.
• If the table to truncate is a dimension table with pre-join projections, you cannot truncate it. Drop the pre-join projections before executing TRUNCATE TABLE.
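For example, the following minimal sketch removes all data from a hypothetical table while preserving its definition and projections:

=> TRUNCATE TABLE mytable;
TRUNCATE TABLE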
Working with Projections

Projections are physical storage for table data. A projection can contain some or all of the columns from one or more tables. This section covers the following topics:

• Projection Types
• K-Safe Database Projections
• Updating Projections Using Refresh
• Monitoring Projection Refresh on Buddy Projections
• Creating Unsegmented Projections
• Dropping Projections

Projection Types

You can create several types of projections. These summaries describe each option.

Superprojections

A superprojection contains all the columns of a table. For each table in the database, Vertica requires a minimum of one projection, which is the superprojection. To get your database up and running quickly, when you load or insert data into an existing table for the first time, Vertica automatically creates a superprojection.

Query-Specific Projections

A query-specific projection is a projection that contains only the subset of table columns needed to process a given query. Query-specific projections significantly improve the performance of those queries for which they are optimized.
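For example, a minimal sketch of a query-specific projection; the projection name, and the assumption that queries frequently select only customer_key and customer_name from public.customer_dimension, are hypothetical:

=> CREATE PROJECTION customer_name_proj AS
   SELECT customer_key, customer_name
   FROM public.customer_dimension
   ORDER BY customer_name;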
Pre-Join Projections

A pre-join projection contains inner joins between tables that are connected by primary key or foreign key constraints. Pre-join projections provide a significant performance advantage over joining tables at query run time. In addition, a pre-join projection lets you define sort orders for queries that you execute frequently.

Aggregate Projections

Queries that include expressions or aggregate functions such as SUM and COUNT can perform more efficiently when they use projections that already contain the aggregated data. This is especially true for queries on large quantities of data. Vertica provides three types of projections for storing data that is returned from aggregate functions or expressions:

• Projection that contains expressions: a projection with columns whose values are calculated from anchor table columns.
• Live aggregate projection: a projection that contains columns with values that are aggregated from columns in its anchor table. You can also define live aggregate projections that include user-defined transform functions.
• Top-K projection: a type of live aggregate projection that returns the top k rows from a partition of selected rows. Create a Top-K projection that satisfies the criteria for a Top-K query.

For more information, see Pre-Aggregating Data in Projections.

Creating Unsegmented Projections

If a CREATE PROJECTION statement omits both a hash segmentation clause and UNSEGMENTED, Vertica creates an unsegmented projection as follows:

• If projection K-safety is set to 0, Vertica creates the projection on the initiator node.
• If projection K-safety is greater than 0, Vertica creates the projection on all nodes.

You can explicitly specify that a projection be unsegmented with the UNSEGMENTED option. UNSEGMENTED must be qualified with one of the following keywords:

• NODE node: Creates an unsegmented projection only on the specified node. To get cluster node names, query the NODES system table:

  => SELECT node_name FROM nodes;

• ALL NODES: Creates identical instances of the unsegmented projection on all cluster nodes. This option enables distributed query execution for tables too small to benefit from segmentation. You must set this option if projection or system K-safety is greater than 0; otherwise, Vertica returns an error.

The following example shows how to create an unsegmented projection for the table store.store_dimension:

VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
        AS SELECT store_key, store_name, store_city, store_state
        FROM store.store_dimension
        UNSEGMENTED ALL NODES;
CREATE PROJECTION
Vertica uses the same name to identify all instances of the unsegmented projection (in this example, store.store_dimension_proj). For more information about projection name conventions, see Projection Naming.

Projection Naming

Vertica identifies projections according to the following conventions, where proj-basename is the name assigned to the projection by CREATE PROJECTION.

Unsegmented Projections

Unsegmented projections conform to the following naming conventions:

table-basename_super
    Identifies the auto projection that Vertica automatically creates when you initially load data into an unsegmented table, where table-basename is the table name specified in CREATE TABLE. The auto projection is always a superprojection.
    For example: store.customer_dimension_super

proj-basename[_unseg]
    Identifies an unsegmented projection. If proj-basename is identical to the anchor table name, Vertica appends the string _unseg to the projection name. If the projection is copied to all nodes, this projection name maps to all instances.
    For example: store.customer_dimension_unseg

Segmented Projections

Segmented projections conform to the following naming convention:

proj-basename_boffset
    Identifies the buddy projections for a segmented projection, where offset identifies the projection's node location relative to all other buddy projections. All buddy projections share the same projection base name.
    For example:
    store.store_orders_fact_b0
    store.store_orders_fact_b1

One exception applies: Vertica uses the following convention to name live aggregate projections:

• proj-basename
• proj-basename_b1
• ...

K-Safe Database Projections

You can set K-safety on individual projections through the CREATE PROJECTION option KSAFE. Projection K-safety must be equal to or greater than database K-safety. If you omit KSAFE, the projection obtains its K-safety from the database.

K-safety is implemented differently for segmented and unsegmented projections, as described below. The examples assume database K-safety is set to 1 in a 3-node database and use projections for two tables:

• store.store_orders_fact is a large fact table. The projection for this table should be segmented. Vertica distributes projection segments uniformly across the cluster.
• store.store_dimension is a smaller dimension table. The projection for this table should be unsegmented. Vertica copies a complete instance of this projection to each cluster node.

Segmented Projections

If database K-safety is set to 1, the database requires two instances, or buddies, of each projection segment. The following CREATE PROJECTION statement defines a segmented projection for the fact table store.store_orders_fact:

VMart=> CREATE PROJECTION store.store_orders_fact (prodkey, ordernum, storekey, total)
        AS SELECT product_key, order_number, store_key, quantity_ordered*unit_price
        FROM store.store_orders_fact
        SEGMENTED BY HASH(product_key, order_number) ALL NODES KSAFE 1;
CREATE PROJECTION

Three keywords in the CREATE PROJECTION statement pertain to setting projection K-safety:
KSAFE 1
    Sets K-safety to 1. Vertica automatically creates two instances of the projection using the naming convention projection-name_bn, where n is a value between 0 and k. Because K-safety is set to 1, Vertica creates two projections:
    • store.store_orders_fact_b0
    • store.store_orders_fact_b1

ALL NODES
    Specifies to segment projections across all cluster nodes.

HASH
    Helps ensure uniform distribution of segmented projection data across the cluster.

Unsegmented Projections

In a K-safe database, you create an unsegmented projection with a CREATE PROJECTION statement that includes the keywords UNSEGMENTED ALL NODES. These keywords specify to create identical instances (buddies) of the entire projection on all cluster nodes. The following example shows how to create an unsegmented projection for the table store.store_dimension:

VMart=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
        AS SELECT store_key, store_name, store_city, store_state
        FROM store.store_dimension
        UNSEGMENTED ALL NODES;
CREATE PROJECTION

Vertica uses the same name to identify all instances of the unsegmented projection (in this example, store.store_dimension_proj). For more information about projection name conventions, see Projection Naming.

Updating Projections Using Refresh

CREATE PROJECTION does not load data into physical storage. If the anchor tables already contain data, run START_REFRESH to update the projection, as shown in the example that follows. Depending on how much data is in the tables, updating a projection can be time consuming. Once a projection is up-to-date, however, it is updated automatically as part of COPY, DELETE, INSERT, MERGE, or UPDATE statements.
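For example, after creating a projection on a populated table, you can refresh it as follows. START_REFRESH returns immediately and runs the refresh as a background process:

=> SELECT START_REFRESH();
             start_Refresh
----------------------------------------
 Starting refresh background process.
(1 row)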
Monitoring Projection Refresh on Buddy Projections

You cannot refresh a projection until after you create a buddy projection. After you run CREATE PROJECTION, if you run SELECT START_REFRESH(), you see the following message:

Starting refresh background process

However, the refresh does not begin until after you create the buddy projection. To monitor the refresh operation, review the vertica.log file. You can also run GET_PROJECTIONS to view the final status of the projection refresh:

=> SELECT GET_PROJECTIONS('customer_dimension');
                        GET_PROJECTIONS
----------------------------------------------------------------
Current system K is 1.
# of Nodes: 3.
Table public.customer_dimension has 2 projections.

Projection Name: [Segmented] [Seg Cols] [# of Buddies] [Buddy Projections] [Safe] [UptoDate] [Stats]
----------------------------------------------------------------------------------------------------
public.customer_dimension_unseg [Segmented: No] [Seg Cols: ] [K: 2] [public.customer_dimension_unseg] [Safe: Yes] [UptoDate: Yes] [Stats: Yes]
public.customer_dimension_DBD_1_rep_VMartDesign_node0001 [Segmented: No] [Seg Cols: ] [K: 2] [public.customer_dimension_DBD_1_rep_VMartDesign_node0001] [Safe: Yes] [UptoDate: Yes] [Stats: Yes]
(1 row)

Dropping Projections

Projections can be dropped explicitly through the DROP PROJECTION statement, as the following sketch shows. They are also implicitly dropped when you drop their anchor table.
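A minimal sketch, dropping the unsegmented projection created earlier. Note that dropping a table's only superprojection fails unless you drop the table itself with CASCADE:

=> DROP PROJECTION store.store_dimension_proj;
DROP PROJECTION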
Using Table Partitions

Vertica supports data partitioning at the table level, which divides one large table into smaller pieces. Partitions are a table property that applies to all projections for a given table.

A common use is to partition data by time slices. For instance, if a table contains decades of data, you can partition it by year. If the table contains only one year of data, it makes sense to partition it by month.

Partitions can improve parallelism during query execution and enable some other optimizations. Partitions also segregate data on each node to facilitate dropping partitions: you can drop older data partitions to make room for newer data.

Tip: When a storage container has data for a single partition, you can discard that storage location (DROP_LOCATION) after dropping the partition with the function DROP_PARTITION.

Defining Partitions

The first step in defining data partitions is to establish the relationship between the data and the partitions. To illustrate, consider the following table called trade, which contains unpartitioned data for the trade date (tdate), ticker symbol (tsymbol), and time (ttime).

Table 1: Unpartitioned data

   tdate    | tsymbol |  ttime
------------+---------+----------
 2008-01-02 | AAA     | 13:00:00
 2009-02-04 | BBB     | 14:30:00
 2010-09-18 | AAA     | 09:55:00
 2009-05-06 | AAA     | 11:14:30
 2008-12-22 | BBB     | 15:30:00
(5 rows)

If you want to discard data once a year, a logical choice is to partition the table by year. The partition expression PARTITION BY EXTRACT(year FROM tdate) creates the partitions shown in Table 2:

Table 2: Data partitioned by year

 2008                        | 2009                        | 2010
 tdate     tsymbol  ttime    | tdate     tsymbol  ttime    | tdate     tsymbol  ttime
-----------------------------+-----------------------------+----------------------------
 01/02/08  AAA      13:00:00 | 02/04/09  BBB      14:30:00 | 09/18/10  AAA      09:55:00
 12/22/08  BBB      15:30:00 | 05/06/09  AAA      11:14:30 |
Unlike some databases, which require you to explicitly define partition boundaries in the CREATE TABLE statement, Vertica selects a partition for each row based on the result of a partitioning expression provided in the CREATE TABLE statement. Partitions do not have explicit names associated with them. Internally, Vertica creates a partition for each distinct value in the PARTITION BY expression.

After you specify a partition expression, Vertica processes the data by applying the partition expression to each row and then assigning partitions. The following statements create the trade table, partitioned by year, and load the data shown above. For additional information, see CREATE TABLE in the SQL Reference Manual.

CREATE TABLE trade (
   tdate DATE NOT NULL,
   tsymbol VARCHAR(8) NOT NULL,
   ttime TIME)
PARTITION BY EXTRACT(year FROM tdate);

CREATE PROJECTION trade_p (tdate, tsymbol, ttime) AS
   SELECT * FROM trade
   ORDER BY tdate, tsymbol, ttime
   UNSEGMENTED ALL NODES;

INSERT INTO trade VALUES ('01/02/08', 'AAA', '13:00:00');
INSERT INTO trade VALUES ('02/04/09', 'BBB', '14:30:00');
INSERT INTO trade VALUES ('09/18/10', 'AAA', '09:55:00');
INSERT INTO trade VALUES ('05/06/09', 'AAA', '11:14:30');
INSERT INTO trade VALUES ('12/22/08', 'BBB', '15:30:00');

Partitioning By Year and Month

To partition by both year and month, you need a partition expression that pads the month out to two digits, so that the partition keys appear as:

201101
201102
201103
...
201111
201112

You can use the following partition expression to partition the table using the year and month:

PARTITION BY EXTRACT(year FROM tdate)*100 + EXTRACT(month FROM tdate)
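For illustration, here is the complete statement with this expression applied to a hypothetical copy of the trade table (the 201101-style keys above assume data from 2011):

CREATE TABLE trade_ym (
   tdate DATE NOT NULL,
   tsymbol VARCHAR(8) NOT NULL,
   ttime TIME)
PARTITION BY EXTRACT(year FROM tdate)*100 + EXTRACT(month FROM tdate);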
Restrictions on Partitioning Expressions

• The partitioning expression can reference one or more columns from the table.
• The partitioning expression cannot evaluate to NULL for any row, so do not include columns that allow a NULL value in the CREATE TABLE...PARTITION BY expression.
• Any SQL functions in the partitioning expression must be immutable, meaning that they return the exact same value regardless of when they are invoked and independently of session or environment settings, such as LOCALE. For example, you cannot use the TO_CHAR function in a partition expression, because it depends on locale settings, or the RANDOM function, because it produces different values at each invocation.
• Vertica meta-functions cannot be used in partitioning expressions.
• All projections anchored on a table must include all columns referenced in the PARTITION BY expression; this allows the partition to be calculated.

Best Practices for Partitioning

• While Vertica supports a maximum of 1024 partitions, few, if any, organizations need to approach that maximum. Fewer partitions are likely to meet your business needs while also ensuring maximum performance. Many customers, for example, partition their data by month, bringing their partition count to 12. Vertica recommends that you keep the number of partitions between 10 and 20 to achieve excellent performance.
• Do not apply partitioning to tables used as dimension tables in pre-join projections. You can apply partitioning to tables used as large single (fact) tables in pre-join projections.
• For maximum performance, do not partition projections on LONG VARBINARY and LONG VARCHAR columns.

Partitioning versus Segmentation

In Vertica, partitioning and segmentation are separate concepts that achieve different goals for localizing data:

• Segmentation refers to organizing and distributing data across cluster nodes for fast data purges and query performance. Segmentation aims to distribute data evenly across multiple database nodes so that all nodes participate in query execution. You specify segmentation with the CREATE PROJECTION statement's hash segmentation clause.
• Partitioning specifies how to organize data within individual nodes for distributed computing. Node partitions let you easily identify data you wish to drop and help reclaim disk space. You specify partitioning with the CREATE TABLE statement's PARTITION BY clause.

For example, partitioning data by year makes sense for retaining and dropping annual data. However, segmenting the same data by year would be inefficient, because the node holding data for the current year would likely answer far more queries than the other nodes.

The following diagram illustrates the flow of segmentation and partitioning on a four-node database cluster:

1. Example table data
2. Data segmented by HASH(order_id)
3. Data segmented by hash across four nodes
4. Data partitioned by year on a single node

While partitioning occurs on all four nodes, the illustration shows partitioned data on one node for simplicity.
See Also

• Reclaiming Disk Space From Deleted Records
• Identical Segmentation
• Projection Segmentation
• CREATE PROJECTION
• CREATE TABLE

Partitioning and Data Storage

Partitions and ROS Containers

• Data is automatically split into partitions during load, refresh, and recovery operations.
• The Tuple Mover maintains the physical separation of partitions.
• Each ROS container contains data for a single partition; multiple ROS containers can be used for a single partition. The query after this list shows one way to observe this.
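A sketch of one way to observe the mapping of partitions to ROS containers, using the PARTITIONS system table described later in this section:

=> SELECT node_name, projection_name, partition_key, COUNT(*) AS ros_containers
   FROM partitions
   GROUP BY node_name, projection_name, partition_key
   ORDER BY node_name, projection_name, partition_key;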
Partition Pruning

When a query predicate includes one or more columns in the partitioning clause, queries look only at relevant ROS containers. See Partition Elimination for details.

Managing Partitions

Vertica provides various options to let you manage and monitor the partitions you create.

PARTITION_TABLE Function

The function PARTITION_TABLE physically separates partitions into separate containers. Only ROS containers with more than one distinct value participate in the split.

The following example creates a simple table called states and partitions the data by state:

=> CREATE TABLE states (year INTEGER NOT NULL, state VARCHAR NOT NULL)
   PARTITION BY state;
=> CREATE PROJECTION states_p (state, year) AS
   SELECT * FROM states
   ORDER BY state, year
   UNSEGMENTED ALL NODES;

Run PARTITION_TABLE to partition the table states:

=> SELECT PARTITION_TABLE('states');
                                  PARTITION_TABLE
---------------------------------------------------------------------------------
 Task: partition operation (Table: public.states) (Projection: public.states_p)
(1 row)

PARTITIONS System Table

You can display partition metadata, one row per partition key per ROS container, by querying the PARTITIONS system table. Given the unsegmented projection states_p replicated across three nodes, the following query on the PARTITIONS table returns twelve rows, representing twelve ROS containers:

=> SELECT partition_key, ros_id, ros_size_bytes, ros_row_count, node_name
   FROM partitions
   WHERE projection_name = 'states_p' ORDER BY ros_id;
 partition_key |      ros_id       | ros_size_bytes | ros_row_count |    node_name
---------------+-------------------+----------------+---------------+------------------
 VT            | 45035996281231297 |             95 |            14 | v_vmart_node0001
 PA            | 45035996281231309 |             92 |            11 | v_vmart_node0001
 NY            | 45035996281231321 |             90 |             9 | v_vmart_node0001
 MA            | 45035996281231333 |             96 |            15 | v_vmart_node0001
 VT            | 49539595902704977 |             95 |            14 | v_vmart_node0002
 PA            | 49539595902704989 |             92 |            11 | v_vmart_node0002
 NY            | 49539595902705001 |             90 |             9 | v_vmart_node0002
 MA            | 49539595902705013 |             96 |            15 | v_vmart_node0002
 VT            | 54043195530075651 |             95 |            14 | v_vmart_node0003
 PA            | 54043195530075663 |             92 |            11 | v_vmart_node0003
 NY            | 54043195530075675 |             90 |             9 | v_vmart_node0003
 MA            | 54043195530075687 |             96 |            15 | v_vmart_node0003
(12 rows)

General Guidelines

Avoid creating too many ROS containers: delete operations must open all of a table's containers. For optimal performance, create fewer than 20 partitions, and avoid creating more than 50.

Restrictions

• You cannot use non-deterministic functions in a PARTITION BY expression. For example, the value of a TIMESTAMPTZ expression depends on user session settings.
• A dimension table in a pre-join projection cannot be partitioned.

Partitioning, Repartitioning, and Reorganizing Tables

Using the ALTER TABLE statement with its PARTITION BY syntax and the optional REORGANIZE keyword partitions or repartitions a table according to the partition clause that you define in the statement. Vertica immediately drops any existing partition keys when you execute the statement.

You can use the PARTITION BY and REORGANIZE keywords separately or together. However, you cannot use these keywords with any other ALTER TABLE clauses.

The following example partitions the Sales table by month and reorganizes the data into the new partitions:

=> ALTER TABLE Sales PARTITION BY Month REORGANIZE;

Privileges

Partitioning or repartitioning tables requires USAGE privilege on the schema that contains the table.

PARTITION BY Expressions

PARTITION BY expressions can specify leaf expressions, functions, and operators. The following requirements and restrictions apply to PARTITION BY expressions:
• They must calculate a single non-null value for each row. The expression can reference multiple columns, but it must return a single value for each row.
• All leaf expressions must be constants or table columns.
• SQL functions must be immutable.
• Aggregate functions and queries are not supported.

Reorganizing Data After Partitioning

Partitioning is not complete until you reorganize the data. The optional REORGANIZE keyword completes table partitioning by assigning partition keys. You can use REORGANIZE with PARTITION BY, or as the only keyword in the ALTER TABLE statement for tables that were previously altered with the PARTITION BY modifier but were not reorganized with the REORGANIZE keyword. If you specify the REORGANIZE keyword, data is partitioned immediately to the new schema as a background task.

Tip: As a best practice, HPE recommends that you reorganize the data while partitioning the table, using PARTITION BY with the REORGANIZE keyword.

If you do not specify REORGANIZE, performance for queries, DROP_PARTITION() operations, and node recovery could be degraded until the data is reorganized. Also, without reorganizing existing data, new data is stored according to the new partition expression, while the existing data storage remains unchanged.

Monitoring Reorganization

When you use ALTER TABLE...REORGANIZE, the operation reorganizes the data in the background. You can monitor details of the reorganization process by polling the following system tables, as in the sketch after this list:

• V_MONITOR.PARTITION_STATUS displays the fraction of each table that is partitioned correctly.
• V_MONITOR.PARTITION_REORGANIZE_ERRORS logs any errors issued by the background REORGANIZE process.
• V_MONITOR.PARTITIONS displays NULL values in the partition_key column for any ROS containers that have not been reorganized.

Note: The corresponding foreground operation to ALTER TABLE...REORGANIZE is the PARTITION_TABLE() function.
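A minimal monitoring sketch. The column lists of these tables are documented with the tables themselves, so SELECT * is used here:

=> SELECT * FROM V_MONITOR.PARTITION_STATUS;
=> SELECT * FROM V_MONITOR.PARTITION_REORGANIZE_ERRORS;
=> SELECT * FROM V_MONITOR.PARTITIONS WHERE partition_key IS NULL;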
Auto Partitioning

Vertica attempts to keep data from each partition stored separately. Auto partitioning occurs when data is written to disk, such as during COPY DIRECT or moveout operations.

Separate storage provides two benefits: partitions can be dropped quickly, and partition elimination can omit storage that does not need to participate in a query plan.

Note: If you use INSERT...SELECT in a partitioned table, Vertica sorts the data before writing it to disk, even if the source of the SELECT has the same sort order as the destination.

Examples

The examples that follow use this simple schema. First, create a table named t1 and partition the data on the c2 column:

CREATE TABLE t1 (
   c1 INT NOT NULL,
   c2 INT NOT NULL)
SEGMENTED BY c1 ALL NODES
PARTITION BY c2;

Create two identically segmented buddy projections:

CREATE PROJECTION t1_p AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 0;
CREATE PROJECTION t1_p1 AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 1;

Insert some data:

INSERT INTO t1 VALUES(10,15);
INSERT INTO t1 VALUES(20,25);
INSERT INTO t1 VALUES(30,35);
INSERT INTO t1 VALUES(40,45);

Query the table to verify the inputs:

SELECT * FROM t1;
 c1 | c2
----+----
 10 | 15
 20 | 25
 30 | 35
 40 | 45
(4 rows)

Now perform a moveout operation on the projections in the table:

SELECT DO_TM_TASK('moveout','t1');
           do_tm_task
--------------------------------
 moveout for projection 't1_p1'
 moveout for projection 't1_p'
(1 row)

Query the PARTITIONS system table. Notice that the four partition keys reside on two nodes, each in its own ROS container (see the ros_id column). The PARTITION BY clause was used on column c2, so Vertica auto partitioned the input values during the COPY operation:

SELECT partition_key, projection_name, ros_id, ros_size_bytes, ros_row_count, node_name
FROM PARTITIONS WHERE projection_name LIKE 't1_p1';
 partition_key | projection_name |      ros_id       | ros_size_bytes | ros_row_count | node_name
---------------+-----------------+-------------------+----------------+---------------+-----------
 15            | t1_p1           | 49539595901154617 |             78 |             1 | node0002
 25            | t1_p1           | 54043195528525081 |             78 |             1 | node0003
 35            | t1_p1           | 54043195528525069 |             78 |             1 | node0003
 45            | t1_p1           | 49539595901154605 |             79 |             1 | node0002
(4 rows)

Vertica does not auto partition when you refresh with the same sort order. If you create a new projection, Vertica returns a message telling you to refresh the projections; for example:

CREATE PROJECTION t1_p2 AS SELECT * FROM t1 SEGMENTED BY HASH(c1) ALL NODES OFFSET 2;
WARNING: Projection <public.t1_p2> is not available for query processing. Execute the select start_refresh() function to copy data into this projection. The projection must have a sufficient number of buddy projections and all nodes must be up before starting a refresh.

Run the START_REFRESH function:

SELECT START_REFRESH();
             start_Refresh
----------------------------------------
 Starting refresh background process.
(1 row)

Query the PARTITIONS system table again. The partition keys now reside in two ROS containers instead of four, as the values in the ros_id column show. The ros_row_count column holds the number of rows in each ROS container:

SELECT partition_key, projection_name, ros_id, ros_size_bytes, ros_row_count, node_name
FROM PARTITIONS WHERE projection_name LIKE 't1_p2';
 partition_key | projection_name |      ros_id       | ros_size_bytes | ros_row_count | node_name
---------------+-----------------+-------------------+----------------+---------------+-----------
 15            | t1_p2           | 54043195528525121 |             80 |             2 | node0003
 25            | t1_p2           | 58546795155895541 |             77 |             2 | node0004
 35            | t1_p2           | 58546795155895541 |             77 |             2 | node0004
 45            | t1_p2           | 54043195528525121 |             80 |             2 | node0003
(4 rows)

The following command queries ROS information for the partitioned tables more specifically. In this example, the query counts two ROS containers, each on a different node, for projection t1_p2:

SELECT ros_id, node_name, COUNT(*) FROM PARTITIONS
WHERE projection_name LIKE 't1_p2' GROUP BY ros_id, node_name;
      ros_id       | node_name | COUNT
-------------------+-----------+-------
 54043195528525121 | node0003  |     2
 58546795155895541 | node0004  |     2
(2 rows)

This command returns a result of four ROS containers on two different nodes for projection t1_p1:

SELECT ros_id, node_name, COUNT(*) FROM PARTITIONS
WHERE projection_name LIKE 't1_p1' GROUP BY ros_id, node_name;
      ros_id       | node_name | COUNT
-------------------+-----------+-------
 49539595901154605 | node0002  |     1
 49539595901154617 | node0002  |     1
 54043195528525069 | node0003  |     1
 54043195528525081 | node0003  |     1
(4 rows)
See Also

• DO_TM_TASK
• PARTITIONS
• START_REFRESH

Eliminating Partitions

If the ROS containers of partitioned tables are not needed to answer a query, Vertica can eliminate those containers from being processed during query execution. To eliminate ROS containers, Vertica compares query predicates to partition-related metadata. Each ROS container maintains the minimum and maximum values of the data stored in it for each partition expression column, and Vertica uses those min/max values to potentially eliminate ROS containers from query planning. Partitions that cannot contain matching values are not scanned. For example, if a ROS container does not contain data that satisfies a given query predicate, the optimizer eliminates (prunes) that container from the query plan. After non-participating ROS containers have been eliminated, queries that use partitioned tables run more quickly.

Note: Partition pruning occurs at query run time and requires a query predicate on the partitioning column.

Assume a table that is partitioned by year (2007, 2008, 2009) into three ROS containers, one for each year. Given the following series of commands, the two ROS containers that contain data for 2007 and 2008 fall outside the boundaries of the requested year (2009) and are eliminated:

=> CREATE TABLE ... PARTITION BY EXTRACT(year FROM date);
=> SELECT ... WHERE date = '12-2-2009';
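To check whether pruning applies to a query, you can inspect its plan with EXPLAIN. A sketch using the trade table defined earlier; the exact plan output depends on your data and catalog:

=> EXPLAIN SELECT COUNT(*) FROM trade WHERE tdate = '12-2-2009';
-- In the plan's storage-access step, only the ROS containers whose
-- min/max values for the partition column can match the predicate
-- participate in the scan.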
Making Past Partitions Eligible for Elimination

The following procedures make past partitions eligible for elimination. The easiest way to guarantee that all ROS containers are eligible is to:

1. Create a new fact table with the same projections as the existing table.
2. Use INSERT...SELECT to populate the new table.
3. Drop the original table and rename the new table.

If there is not enough disk space for a second copy of the fact table, an alternative is to:

1. Verify that the Tuple Mover has finished all post-upgrade work, for example, when the following command shows no mergeout activity:

   => SELECT * FROM TUPLE_MOVER_OPERATIONS;

2. Identify which partitions need to be merged to get the ROS minimum/maximum values by running the following command:

   => SELECT DISTINCT table_schema, projection_name, partition_key
      FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v
      ON p.ros_id = v.rosid
      WHERE v.min_value IS NULL;

3. Insert a record into each partition that has ineligible ROS containers, and commit.
4. Delete each inserted record, and commit again.

At this point, the Tuple Mover automatically merges ROS containers from past partitions.

Verifying the ROS Merge

1. Query the TUPLE_MOVER_OPERATIONS table again:

   => SELECT * FROM TUPLE_MOVER_OPERATIONS;

2. Check again for any partitions that need to be merged:

   => SELECT DISTINCT table_schema, projection_name, partition_key
      FROM partitions p LEFT OUTER JOIN vs_ros_min_max_values v
      ON p.ros_id = v.rosid
      WHERE v.min_value IS NULL;

Examples

Assume a table that is partitioned by time and queried with predicates that restrict data on time:

CREATE TABLE time (
   tdate DATE NOT NULL,
   tnum INTEGER)
PARTITION BY EXTRACT(year FROM tdate);

CREATE PROJECTION time_p (tdate, tnum) AS
   SELECT * FROM time
   ORDER BY tdate, tnum
   UNSEGMENTED ALL NODES;

Note: Projection sort order has no effect on partition elimination.

INSERT INTO time VALUES ('03/15/04', 1);
INSERT INTO time VALUES ('03/15/05', 2);
INSERT INTO time VALUES ('03/15/06', 3);
INSERT INTO time VALUES ('03/15/06', 4);

The data inserted in the previous series of commands is loaded into three ROS containers, one per year, because that is how the data is partitioned:

SELECT * FROM time ORDER BY tnum;
   tdate    | tnum
------------+------
 2004-03-15 |    1    --ROS1 (min 03/15/04, max 03/15/04)
 2005-03-15 |    2    --ROS2 (min 03/15/05, max 03/15/05)
 2006-03-15 |    3    --ROS3 (min 03/15/06, max 03/15/06)
 2006-03-15 |    4    --ROS3 (min 03/15/06, max 03/15/06)
(4 rows)

Here's what happens when you query the time table:

• In this query, Vertica can eliminate ROS2 and ROS3, because it is only looking for year 2004:

  => SELECT COUNT(*) FROM time WHERE tdate = '05/07/2004';

• In the next query, Vertica can eliminate both ROS1 and ROS3:

  => SELECT COUNT(*) FROM time WHERE tdate = '10/07/2005';

• The following query has an additional predicate on the tnum column, for which no minimum/maximum values are maintained. In addition, the use of the logical operator OR
is not supported, so no ROS elimination occurs:

  => SELECT COUNT(*) FROM time WHERE tdate = '05/07/2004' OR tnum = 7;

Dropping Partitions

Use the DROP_PARTITION function to drop a partition. Normally, this is a fast operation that discards all ROS containers that contain data for the partition.

Occasionally, a ROS container contains rows that belong to more than one partition. In general, Vertica segregates data from different partitions into different ROS containers, but exceptions can occur. For example, in some cases refresh and recovery operations can generate ROS containers with mixed partitions. See Auto Partitioning. The number of partitions that contain data is restricted by the number of ROS containers that can comfortably exist in the system.

In general, if a ROS container has data that belongs to n+1 partitions and you want to drop a specific partition, the DROP_PARTITION operation:

1. Forces the partition of data into two containers, where:
   • one container holds the data that belongs to the partition that is to be dropped
   • another container holds the remaining n partitions
2. Drops the specified partition.

DROP_PARTITION forces a moveout if there is data in the WOS (the WOS is not partition aware). DROP_PARTITION acquires an exclusive lock on the table to prevent DELETE, UPDATE, INSERT, and COPY statements from affecting the table, as well as any SELECT statements issued at the SERIALIZABLE isolation level.

Restrictions

• Users must have USAGE privilege on the schema that contains the table.
• DROP_PARTITION operations cannot be performed on tables with projections that are not up to date (have not been refreshed).
• DROP_PARTITION fails if you do not set the optional third parameter to true and it encounters ROS containers that do not have partition keys.

Examples

Using the example schema in Defining Partitions, the following command explicitly drops the 2009 partition key from table trade:

SELECT DROP_PARTITION('trade', 2009);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

Here, the partition key is specified as an expression:

SELECT DROP_PARTITION('trade', EXTRACT('year' FROM '2009-01-01'::date));
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

The following example creates a table called dates and partitions the table by year and month:

CREATE TABLE dates (year INTEGER NOT NULL,
                    month INTEGER NOT NULL)
PARTITION BY year * 12 + month;

The following statement drops the partition using a constant for October 2010 (2010*12 + 10 = 24130):

SELECT DROP_PARTITION('dates', '24130');
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

Alternatively, the expression can be placed inline:

SELECT DROP_PARTITION('dates', 2010*12 + 10);

The following command first reorganizes the data if it is unpartitioned, and then explicitly drops the 2009 partition key from table trade:

SELECT DROP_PARTITION('trade', 2009, false, true);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)
See Also

• DROP_PARTITION

Archiving Partitions

You can move partitions from one table to another with the Vertica function MOVE_PARTITIONS_TO_TABLE. This function is useful for archiving old partitions, as part of the following procedure:

1. Identify the partitions to archive, and move them to a temporary staging table with MOVE_PARTITIONS_TO_TABLE.
2. Back up the staging table.
3. Drop the staging table.

You can retrieve and restore archived partitions at any time, as described in Restoring Archived Partitions. For general information about moving partitions, see MOVE_PARTITIONS_TO_TABLE.

Move Partitions to Staging Tables

You archive historical data by identifying the partitions you wish to remove from a table. You then move each partition (or group of partitions) to a temporary staging table.

Before calling MOVE_PARTITIONS_TO_TABLE, you must:

• Drop any pre-join projections associated with the source table.
• Refresh all out-of-date projections.

The following recommendations apply to staging tables:

• To facilitate the backup process, create a unique schema for the staging table of each archiving operation.
• Specify new names for staging tables. This ensures that they do not contain partitions from previous move operations. If the table does not exist, MOVE_PARTITIONS_TO_TABLE creates one from the source table's definition by calling CREATE TABLE with the LIKE and INCLUDING
PROJECTIONS clause. The new table inherits ownership from the source table. For detailed information about the attributes that are copied from source tables to new staging tables, see Creating a Table Like Another.
• Use staging names that enable other users to easily identify partition contents. For example, if a table is partitioned by dates, use a name that specifies a date or date range.

In the following example, MOVE_PARTITIONS_TO_TABLE specifies to move a single partition to the staging table partn_backup.trades_200801:

=> SELECT MOVE_PARTITIONS_TO_TABLE (
          'prod_trades',
          '200801',
          '200801',
          'partn_backup.trades_200801');
             MOVE_PARTITIONS_TO_TABLE
-------------------------------------------------
 1 distinct partition values moved at epoch 15.
(1 row)

Back Up the Staging Table

After you create a staging table, you archive it through an object-level backup using a vbr configuration file. For detailed information, see Backing Up and Restoring the Database.

Important: Vertica recommends performing a full database backup before the object-level backup, as a precaution against data loss. You can only restore object-level backups to the original database.

Drop the Staging Tables

After the backup is complete, you can drop the staging table as described in Dropping Tables.

See Also

• MOVE_PARTITIONS_TO_TABLE

Swapping Partitions

SWAP_PARTITIONS_BETWEEN_TABLES combines the operations of DROP_PARTITION and MOVE_PARTITIONS_TO_TABLE into a single transaction. SWAP_PARTITIONS_BETWEEN_TABLES is useful if you regularly load partitioned data from one table into another and need to refresh partitions in the second table.
For example, you might have a table of revenue that is partitioned by date, and you routinely move data into it from a staging table. Occasionally, the staging table contains data for dates that are already in the target table. In this case, you must first remove partitions from the target table for those dates, and then replace them with the corresponding partitions from the staging table. You can accomplish both tasks with a single call to SWAP_PARTITIONS_BETWEEN_TABLES.

By wrapping the drop and move operations within a single transaction, SWAP_PARTITIONS_BETWEEN_TABLES maintains the integrity of the swapped data. If any task in the swap operation fails, the entire operation fails and is rolled back.

Requirements and Restrictions

See SWAP_PARTITIONS_BETWEEN_TABLES.

Examples

In the following example, SWAP_PARTITIONS_BETWEEN_TABLES drops from table member_info all partitions in the range specified by partition keys 2008 and 2009. It replaces the dropped partitions with the corresponding partitions from source table customer_info:

=> SELECT SWAP_PARTITIONS_BETWEEN_TABLES('customer_info',2008,2009,'member_info');
                          SWAP_PARTITIONS_BETWEEN_TABLES
-----------------------------------------------------------------------------------
 1 partition values from table customer_info and 2 partition values from table member_info are swapped at epoch 1250.

See Also

• Tutorial for Swapping Partitions

Tutorial for Swapping Partitions

The following example shows how to create two partitioned tables and then swap certain partitions between them. Both tables have the same definition and have partitions for various year values. You swap the partitions where year = 2008 and year = 2009. Both tables have at least two rows to be swapped.
1. Create the customer_info table:

   => CREATE TABLE customer_info (
         customer_id INT PRIMARY KEY NOT NULL,
         first_name VARCHAR(25),
         last_name VARCHAR(35),
         city VARCHAR(25),
         year INT NOT NULL)
      ORDER BY last_name
      PARTITION BY year;

2. Insert data into the customer_info table:

   => INSERT INTO customer_info VALUES (1, 'Joe', 'Smith', 'Denver', 2008);
   => INSERT INTO customer_info VALUES (2, 'Bob', 'Jones', 'Boston', 2008);
   => INSERT INTO customer_info VALUES (3, 'Silke', 'Muller', 'Frankfurt', 2007);
   => INSERT INTO customer_info VALUES (4, 'Simone', 'Bernard', 'Paris', 2014);
   => INSERT INTO customer_info VALUES (5, 'Vijay', 'Kumar', 'New Delhi', 2010);

3. View the table data:

   => SELECT * FROM customer_info;
    customer_id | first_name | last_name |   city    | year
   -------------+------------+-----------+-----------+------
              1 | Joe        | Smith     | Denver    | 2008
              2 | Bob        | Jones     | Boston    | 2008
              3 | Silke      | Muller    | Frankfurt | 2007
              4 | Simone     | Bernard   | Paris     | 2014
              5 | Vijay      | Kumar     | New Delhi | 2010

4. Create a second table, member_info, that has the same definition as customer_info:

   => CREATE TABLE member_info (
         customer_id INT PRIMARY KEY NOT NULL,
         first_name VARCHAR(25),
         last_name VARCHAR(35),
         city VARCHAR(25),
         year INT NOT NULL)
      ORDER BY last_name
      PARTITION BY year;

5. Insert data into the member_info table:

   => INSERT INTO member_info VALUES (1, 'Jane', 'Doe', 'Miami', 2001);
   => INSERT INTO member_info VALUES (2, 'Mike', 'Brown', 'Chicago', 2014);
   => INSERT INTO member_info VALUES (3, 'Patrick', 'OMalley', 'Dublin', 2008);
   => INSERT INTO member_info VALUES (4, 'Ana', 'Lopez', 'Madrid', 2009);
   => INSERT INTO member_info VALUES (5, 'Mike', 'Green', 'New York', 2008);

6. View the data in the member_info table:

   => SELECT * FROM member_info;
    customer_id | first_name | last_name |   city   | year
   -------------+------------+-----------+----------+------
              1 | Jane       | Doe       | Miami    | 2001
              2 | Mike       | Brown     | Chicago  | 2014
              3 | Patrick    | OMalley   | Dublin   | 2008
              4 | Ana        | Lopez     | Madrid   | 2009
              5 | Mike       | Green     | New York | 2008

7. To swap the partitions, run the SWAP_PARTITIONS_BETWEEN_TABLES function:

   => SELECT SWAP_PARTITIONS_BETWEEN_TABLES('customer_info',2008,2009,'member_info');

8. To verify that the partitions have been swapped, query the contents of both tables, and confirm the following results:

   • After the swap, the rows in both tables whose year values are 2008 or 2009 have moved to the other table.
   • The partitions that contain the rows for Ana Lopez, Mike Green, and Patrick OMalley moved from member_info to customer_info.
   • The partition that contains the rows for Joe Smith and Bob Jones moved from customer_info to member_info.

   => SELECT * FROM customer_info;
    customer_id | first_name | last_name |   city    | year
   -------------+------------+-----------+-----------+------
              4 | Simone     | Bernard   | Paris     | 2014
              5 | Vijay      | Kumar     | New Delhi | 2010
              3 | Silke      | Muller    | Frankfurt | 2007
              4 | Ana        | Lopez     | Madrid    | 2009
              5 | Mike       | Green     | New York  | 2008
              3 | Patrick    | OMalley   | Dublin    | 2008

   => SELECT * FROM member_info;
    customer_id | first_name | last_name |  city   | year
   -------------+------------+-----------+---------+------
              2 | Bob        | Jones     | Boston  | 2008
              1 | Joe        | Smith     | Denver  | 2008
              2 | Mike       | Brown     | Chicago | 2014
              1 | Jane       | Doe       | Miami   | 2001
Restoring Archived Partitions

You can restore partitions that you previously moved to an intermediate table, archived as an object-level backup, and then dropped.

Note: Restoring an archived partition requires that the original table definition has not changed since the partition was archived and dropped. If you have changed the table definition, you can only restore an archived partition using INSERT...SELECT statements, which are not described here.

These are the steps to restoring archived partitions:

1. Restore the backup of the intermediate table you saved when you moved one or more partitions to archive (see Archiving Partitions).
2. Move the restored partitions from the intermediate table to the original table, as in the sketch below.
3. Drop the intermediate table.
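Step 2 is another MOVE_PARTITIONS_TO_TABLE call, this time with the staging table as the source. A sketch using the names from the archiving example:

=> SELECT MOVE_PARTITIONS_TO_TABLE (
          'partn_backup.trades_200801',
          '200801',
          '200801',
          'prod_trades');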
About Constraints

Constraints specify rules on what values can go into a column. Some examples of constraints are:

• Primary or foreign keys
• Uniqueness
• Not NULL
• Default values
• Automatically incrementing values
• Values generated by the database

Using constraints can help you maintain data integrity in one or more columns. Do not define constraints on columns unless you expect to keep the data consistent. Vertica can use constraints to perform optimizations (such as the optimized MERGE) that assume the data is consistent.
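As a preview of the sections that follow, here is a sketch of a hypothetical table that combines several of these constraint types in one definition (assuming Vertica's AUTO_INCREMENT column constraint, described under CREATE TABLE):

=> CREATE TABLE orders (
      order_id AUTO_INCREMENT,            -- automatically incrementing values
      customer_key INTEGER NOT NULL,      -- not NULL
      status VARCHAR(10) DEFAULT 'open',  -- default value
      PRIMARY KEY (order_id)              -- primary key (implies unique and NOT NULL)
   );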
Adding Constraints

Add constraints on one or more table columns using the following SQL commands:

• CREATE TABLE: Add a constraint on one or more columns.
• ALTER TABLE: Add or drop a constraint on one or more columns.

There are two syntax definitions you can use to add or change a constraint:

• Column-Constraint: Use this syntax when you add a constraint on a column definition in a CREATE TABLE statement.
• Table-Constraint: Use this syntax when you add a constraint after a column definition in a CREATE TABLE statement, or when you add, alter, or drop a constraint on a column using ALTER TABLE.

Vertica recommends naming constraints, but doing so is optional. If you specify the CONSTRAINT keyword, you must give the constraint a name.

The examples that follow illustrate several ways of adding constraints. For additional details, see:

• Primary Key Constraints
• Foreign Key Constraints
• Unique Constraints
• Not NULL Constraints

Adding Column Constraints with CREATE TABLE

There are several ways to add a constraint on a column using CREATE TABLE:

• On the column definition, using the CONSTRAINT keyword, which requires that you assign a constraint name, in this example, dim1PK:

  CREATE TABLE dim1 (
     c1 INTEGER CONSTRAINT dim1PK PRIMARY KEY,
     c2 INTEGER
  );

• On the column definition, omitting the CONSTRAINT keyword. When you omit the
  CONSTRAINT keyword, you cannot specify a constraint name:

  CREATE TABLE dim1 (
     c1 INTEGER PRIMARY KEY,
     c2 INTEGER
  );

• After the column definition, using the CONSTRAINT keyword and assigning a name, in this example, dim1pk:

  CREATE TABLE dim1 (
     c1 INTEGER,
     c2 INTEGER,
     CONSTRAINT dim1pk PRIMARY KEY(c1)
  );

• After the column definition, omitting the CONSTRAINT keyword:

  CREATE TABLE dim1 (
     c1 INTEGER,
     c2 INTEGER,
     PRIMARY KEY(c1)
  );

Adding Two Constraints on a Column

To add more than one constraint on a column, specify the constraints one after another when you create the table column. For example, the following statement enforces both not NULL and unique constraints on the id column, indicating that the column values cannot be NULL and must be unique:

CREATE TABLE test1 (
   id INTEGER NOT NULL UNIQUE,
   ...
);

Adding a Foreign Key Constraint on a Column

There are four ways to add a foreign key constraint on a column using CREATE TABLE. The FOREIGN KEY keywords are not valid on the column definition, only after the column definition:

• On the column definition, use the CONSTRAINT and REFERENCES keywords and name the constraint, in this example, fact1dim1FK. This example creates a column with a named foreign key constraint referencing the table (dim1) with the primary key (c1):
  CREATE TABLE fact1 (
     c1 INTEGER CONSTRAINT fact1dim1FK REFERENCES dim1(c1),
     c2 INTEGER
  );

• On the column definition, omit the CONSTRAINT keyword and use the REFERENCES keyword with the table name and column:

  CREATE TABLE fact1 (
     c1 INTEGER REFERENCES dim1(c1),
     c2 INTEGER
  );

• After the column definition, use the CONSTRAINT, FOREIGN KEY, and REFERENCES keywords and name the constraint:

  CREATE TABLE fact1 (
     c1 INTEGER,
     c2 INTEGER,
     CONSTRAINT fk1 FOREIGN KEY(c1) REFERENCES dim1(c1)
  );

• After the column definition, omitting the CONSTRAINT keyword:

  CREATE TABLE fact1 (
     c1 INTEGER,
     c2 INTEGER,
     FOREIGN KEY(c1) REFERENCES dim1(c1)
  );

Each of the following ALTER TABLE statements adds a foreign key constraint on an existing column, with and without the CONSTRAINT keyword:

ALTER TABLE fact2 ADD CONSTRAINT fk1 FOREIGN KEY (c1) REFERENCES dim2(c1);

or

ALTER TABLE fact2 ADD FOREIGN KEY (c1) REFERENCES dim2(c1);

For additional details, see Foreign Key Constraints.

Adding Multicolumn Constraints

The following example defines a primary key constraint on multiple columns by first defining the table columns (c1 and c2), and then specifying both columns in a PRIMARY KEY clause:

CREATE TABLE dim (
   c1 INTEGER,
   c2 INTEGER,
   PRIMARY KEY (c1, c2)
);

To specify a multicolumn (compound) primary key after table creation, the following example uses CREATE TABLE to define the columns. After creating the table, ALTER TABLE defines the compound primary key and names it dim2PK:

CREATE TABLE dim2 (
   c1 INTEGER,
   c2 INTEGER,
   c3 INTEGER NOT NULL,
   c4 INTEGER UNIQUE
);
ALTER TABLE dim2 ADD CONSTRAINT dim2PK PRIMARY KEY (c1, c2);

In the next example, you define a compound primary key as part of the CREATE TABLE statement. Then you specify the matching foreign key constraint to table dim2 using CREATE TABLE and ALTER TABLE:

CREATE TABLE dim2 (
   c1 INTEGER,
   c2 INTEGER,
   c3 INTEGER NOT NULL,
   c4 INTEGER UNIQUE,
   PRIMARY KEY (c1, c2)
);

CREATE TABLE fact2 (
   c1 INTEGER,
   c2 INTEGER,
   c3 INTEGER NOT NULL,
   c4 INTEGER UNIQUE
);
ALTER TABLE fact2 ADD CONSTRAINT fact2FK FOREIGN KEY (c1, c2) REFERENCES dim2(c1, c2);

You can specify a foreign key constraint simply by referencing the table that contains the primary key; in the ADD CONSTRAINT clause, the REFERENCES column names are optional. The following ALTER TABLE statement is equivalent to the previous one:

ALTER TABLE fact2 ADD CONSTRAINT fact2FK FOREIGN KEY (c1, c2) REFERENCES dim2;

Adding Constraints on Tables with Existing Data

When you add a constraint on a column with existing data, Vertica:

• Verifies the validity of the column values only if you are adding a PRIMARY or UNIQUE key enabled for automatic enforcement.
• Does not verify the validity of column values for other constraint types.

If your data does not conform to the declared constraints, your queries could yield unexpected results. Use ANALYZE_CONSTRAINTS to check for constraint violations in your columns. If you find violations, use the ALTER COLUMN SET/DROP parameters of the ALTER TABLE statement to apply or remove a constraint on an existing column.

Note: You can configure your system to automatically enforce primary and unique key constraints during DML. For information on automatic enforcement, see Enforcing Primary and Unique Key Constraints Automatically.

Altering Column Constraints

The following example uses ALTER TABLE to add column b, with not NULL and default 5 constraints, to table test6:

CREATE TABLE test6 (a INT);
ALTER TABLE test6 ADD COLUMN b INT DEFAULT 5 NOT NULL;

Use ALTER TABLE with the ALTER COLUMN and SET NOT NULL clauses to add the constraint on column a in table test6:

ALTER TABLE test6 ALTER COLUMN a SET NOT NULL;

Use the SET NOT NULL or DROP NOT NULL clause to add or remove a not NULL column constraint:

=> ALTER TABLE T1 ALTER COLUMN x SET NOT NULL;
=> ALTER TABLE T1 ALTER COLUMN x DROP NOT NULL;

Use these clauses to ensure that a column has the proper constraints after you add or remove a primary key constraint on it. You can also use them any time you want to add or remove the NOT NULL constraint.

Note: A PRIMARY KEY constraint includes a NOT NULL constraint. However, if you drop the PRIMARY KEY constraint on a column, the NOT NULL constraint remains on that column.
Enforcing Constraints

Note: This section assumes you have not configured your system to automatically enforce primary and unique key constraints. For primary and unique keys, you can enforce constraints with or without automatic enforcement. If you prefer to use automatic enforcement for primary and unique keys, see Enforcing Primary and Unique Key Constraints Automatically.

To maximize query performance, Vertica checks for primary key and foreign key violations when loading into the fact table of a pre-join projection. For more details, see Enforcing Primary Key and Foreign Key Constraints.

Vertica checks for not NULL constraint violations when loading data, but it does not check for unique constraint violations.

To enforce constraints, load data without committing it by using COPY with the NO COMMIT option. Then perform a post-load check using the ANALYZE_CONSTRAINTS function, as in the sketch below. If constraint violations are found, you can roll back the load because you have not committed it. For more details, see Detecting Constraint Violations with ANALYZE_CONSTRAINTS.
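A minimal sketch of this load, check, and roll back pattern, assuming a hypothetical customer table and data file:

=> COPY customer FROM '/data/customer.dat' NO COMMIT;  -- hypothetical path
=> SELECT ANALYZE_CONSTRAINTS('customer');             -- reports any violations
=> ROLLBACK;                                           -- discard the uncommitted load if violations are found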
See Also

• ALTER TABLE
• CREATE TABLE
• COPY
• ANALYZE_CONSTRAINTS

Primary Key Constraints

A primary key (PK) is a single column, or a combination of columns (called a compound key), that uniquely identifies each row in a table. A primary key constraint contains unique, non-null values.

When you apply the primary key constraint, the NOT NULL and unique constraints are added implicitly. You do not need to specify them when you create the column. However, if you remove the primary key constraint, the NOT NULL constraint continues to apply to the column. To remove the NOT NULL constraint after removing the primary key constraint, use the ALTER COLUMN DROP NOT NULL parameter of the ALTER TABLE statement (see Dropping Constraints).

The following example shows how you can add a primary key constraint on the employee_id field:

CREATE TABLE employees (
   employee_id INTEGER PRIMARY KEY
);

Alternatively, you can add a primary key constraint after the column is created:

CREATE TABLE employees (
   employee_id INTEGER
);
ALTER TABLE employees ADD PRIMARY KEY (employee_id);

Note: If you specify a primary key constraint using ALTER TABLE, the system returns the following message, which is informational only. The primary key constraint is added to the designated column.

WARNING 2623: Column "employee_id" definition changed to NOT NULL

You can also use primary keys to constrain more than one column:

CREATE TABLE employees (
   employee_id INTEGER,
   employee_gender CHAR(1),
   PRIMARY KEY (employee_id, employee_gender)
);

When you enable automatic enforcement of primary or unique key constraints, Vertica applies enforcement for:

• INSERT
• UPDATE
• MERGE
• COPY
• COPY_PARTITIONS_TO_TABLE
• MOVE_PARTITIONS_TO_TABLE
• SWAP_PARTITIONS_BETWEEN_TABLES
Alternatively, rather than relying on automatic enforcement, you can use ANALYZE_CONSTRAINTS to validate primary and unique key constraints after issuing these statements. For more information on enabling and disabling primary key constraints, see Enforcing Primary and Unique Key Constraints Automatically.

Foreign Key Constraints

A foreign key (FK) is a column that is used to join a table to other tables to ensure referential integrity of the data. A foreign key constraint requires that a column contain only values from the primary key column of a specific dimension table.

You can create a foreign key constraint in the CREATE TABLE statement, or you can define one using ALTER TABLE.

A column with a foreign key constraint can contain NULL values if it does not also have a not NULL constraint, even though a NULL value never appears in the PRIMARY KEY column of the dimension table. This allows rows to be inserted into the table even if the foreign key is not yet known.

In Vertica, the fact table's join columns are required to have foreign key constraints in order to participate in pre-join projections. If the fact table join column has a foreign key constraint, outer join queries produce the same result set as inner join queries.

You can add a FOREIGN KEY constraint solely by referencing the table that contains the primary key; the columns in the referenced table do not need to be specified explicitly.

Examples

Create a table called inventory to store inventory data:

CREATE TABLE inventory (
   date_key INTEGER NOT NULL,
   product_key INTEGER NOT NULL,
   warehouse_key INTEGER NOT NULL,
   ...
);

Create a table called warehouse to store warehouse information:

CREATE TABLE warehouse (
   warehouse_key INTEGER NOT NULL PRIMARY KEY,
   warehouse_name VARCHAR(20),
   ...
);
To ensure referential integrity between the inventory and warehouse tables, define a foreign key constraint called fk_inventory_warehouse on the inventory table that references the warehouse table:

ALTER TABLE inventory
   ADD CONSTRAINT fk_inventory_warehouse FOREIGN KEY(warehouse_key)
   REFERENCES warehouse(warehouse_key);

In this example, the inventory table is the referencing table and the warehouse table is the referenced table.

You can also create the foreign key constraint in the CREATE TABLE statement that creates the inventory table, eliminating the need for the ALTER TABLE statement. If you do not specify one or more columns, the PRIMARY KEY of the referenced table is used:

CREATE TABLE inventory (
   date_key INTEGER NOT NULL,
   product_key INTEGER NOT NULL,
   warehouse_key INTEGER NOT NULL REFERENCES warehouse (warehouse_key),
   ...
);

A foreign key can also constrain and reference multiple columns. The following example uses CREATE TABLE to add a foreign key constraint to a pair of columns:

CREATE TABLE t1 (
   c1 INTEGER PRIMARY KEY,
   c2 INTEGER,
   c3 INTEGER,
   FOREIGN KEY (c2, c3) REFERENCES other_table (c1, c2)
);

The following two examples use ALTER TABLE to add a foreign key constraint to a pair of columns. When you use the CONSTRAINT keyword, you must specify a constraint name:

ALTER TABLE t ADD FOREIGN KEY (a, b) REFERENCES other_table(c, d);
ALTER TABLE t ADD CONSTRAINT fk_cname FOREIGN KEY (a, b) REFERENCES other_table(c, d);

Note: The FOREIGN KEY keywords are valid only after the column definition, not on the column definition.

Unique Constraints

Unique constraints ensure that the data contained in a column, or a group of columns, is unique with respect to all rows in the table.
How to Verify Unique Constraints

Vertica allows you to add a (non-enabled) unique constraint to a column. You can then insert data into that column even if the values are not unique with respect to existing values in that column. If your data does not conform to the declared constraints, your queries could yield unexpected results. Use ANALYZE_CONSTRAINTS to check for constraint violations, or enable automatic enforcement of unique key constraints. For more information on enabling and disabling unique key constraints, refer to Enforcing Primary and Unique Key Constraints Automatically.

Add Unique Column Constraints

There are several ways to add a unique constraint on a column. If you use the CONSTRAINT keyword, you must specify a constraint name. The following example adds a UNIQUE constraint on the product_key column and names it product_key_UK:

CREATE TABLE product (
    product_key INTEGER NOT NULL CONSTRAINT product_key_UK UNIQUE,
    ...
);

Vertica recommends naming constraints, but it is optional:

CREATE TABLE product (
    product_key INTEGER NOT NULL UNIQUE,
    ...
);

You can specify the constraint after the column definition, with or without naming it:

CREATE TABLE product (
    product_key INTEGER NOT NULL,
    ...,
    CONSTRAINT product_key_uk UNIQUE (product_key)
);

CREATE TABLE product (
    product_key INTEGER NOT NULL,
    ...,
    UNIQUE (product_key)
);

You can also use ALTER TABLE to specify a unique constraint. This example names the constraint product_key_UK:

ALTER TABLE product ADD CONSTRAINT product_key_UK UNIQUE (product_key);

You can use CREATE TABLE and ALTER TABLE to specify unique constraints on multiple columns. If a unique constraint refers to a group of columns, separate the column names using commas.
The column listing specifies that the combination of values in the indicated columns is unique across the whole table, though any one of the columns need not be (and ordinarily is not) unique:

CREATE TABLE dim1 (
    c1 INTEGER,
    c2 INTEGER,
    c3 INTEGER,
    UNIQUE (c1, c2)
);

Not NULL Constraints

A NOT NULL constraint specifies that a column cannot contain a null value; new rows cannot be inserted or updated unless you specify a value for the column. You can apply the NOT NULL constraint when you create a column in a new table, and when you add a column to an existing table (ALTER TABLE..ADD COLUMN). You can also add or drop the NOT NULL constraint on an existing column:

- ALTER TABLE t ALTER COLUMN x SET NOT NULL
- ALTER TABLE t ALTER COLUMN x DROP NOT NULL

Important: Using the [SET | DROP] NOT NULL clause does not validate whether column data conforms to the NOT NULL constraint. Use ANALYZE_CONSTRAINTS to check for constraint violations in a table.

The NOT NULL constraint is implicitly applied to a column when you add the PRIMARY KEY (PK) constraint, so when you designate a column as a primary key, you do not need to specify NOT NULL. However, if you remove the primary key constraint, the NOT NULL constraint still applies to the column. Use the ALTER COLUMN..DROP NOT NULL clause of the ALTER TABLE statement to drop the NOT NULL constraint after dropping the primary key constraint.

The following statement enforces a NOT NULL constraint on the customer_key column, specifying that the column cannot accept NULL values:

CREATE TABLE customer (
    customer_key INTEGER NOT NULL,
    ...
);
Dropping Constraints

To drop named constraints, use the ALTER TABLE command. The following example drops the constraint fact2fk:

=> ALTER TABLE fact2 DROP CONSTRAINT fact2fk;

To drop constraints that you did not assign a name to, query the system table TABLE_CONSTRAINTS, which returns both system-generated and user-named constraint names:

=> SELECT * FROM TABLE_CONSTRAINTS;

If you do not specify a constraint name, Vertica assigns a constraint name that is unique to that table. In the following output, note the system-generated constraint name C_PRIMARY and the user-defined constraint name fk_inventory_date:

-[ RECORD 1 ]--------+--------------------------
constraint_id        | 45035996273707984
constraint_name      | C_PRIMARY
constraint_schema_id | 45035996273704966
constraint_key_count | 1
foreign_key_count    | 0
table_id             | 45035996273707982
foreign_table_id     | 0
constraint_type      | p
-[ ... ]-------------+--------------------------
-[ RECORD 9 ]--------+--------------------------
constraint_id        | 45035996273708016
constraint_name      | fk_inventory_date
constraint_schema_id | 0
constraint_key_count | 1
foreign_key_count    | 1
table_id             | 45035996273708014
foreign_table_id     | 45035996273707994
constraint_type      | f

Once you know the name of the constraint, you can drop it using the ALTER TABLE command. (If you do not know the table name, use table_id to retrieve table_name from the ALL_TABLES table.)
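For example, a minimal sketch of mapping each constraint to its table name by joining TABLE_CONSTRAINTS to ALL_TABLES on table_id (both tables live in the v_catalog schema):

=> SELECT tc.constraint_name, tc.constraint_type, t.table_name
   FROM v_catalog.table_constraints tc
   JOIN v_catalog.all_tables t USING (table_id);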
Remove NOT NULL Constraints

When a column is a primary key and you drop the primary key constraint, the column retains the NOT NULL constraint. To specify that the column can now contain NULL values, remove the NOT NULL constraint on the column using the DROP NOT NULL clause:

ALTER TABLE T1 ALTER COLUMN x DROP NOT NULL;

Important: Using the [SET | DROP] NOT NULL clause does not validate whether the column data conforms to the NOT NULL constraint. Use ANALYZE_CONSTRAINTS to check for constraint violations in a table.

Limitations of Dropping Constraints

You cannot drop a primary key constraint if another table has a foreign key constraint that references the primary key. You cannot drop a foreign key constraint if there are any pre-join projections on the table.

If you drop a primary or foreign key constraint, the system does not automatically drop the NOT NULL constraint on the column. You need to manually drop this constraint if you no longer want it.

If you drop an enabled PRIMARY or UNIQUE key constraint, the system drops the associated projection if one was automatically created.

See Also

- ALTER TABLE
Enforcing Primary Key and Foreign Key Constraints

Enforcing (Non-Enabled) Primary Key Constraints

Unless you enable enforcement of primary key constraints, Vertica does not enforce the uniqueness of primary key values when they are loaded into a table. Thus, a key enforcement error can occur, unless exactly one dimension row matches each foreign key value, when:

- Data is loaded into a table with a pre-joined dimension.
- The table is joined to a dimension table during a query.

Note: Consider using sequences or auto-incrementing columns for primary key columns, which guarantee uniqueness and avoid the constraint enforcement problem and associated overhead. For more information, see Using Sequences.

For information on automatic enforcement of primary key constraints during DML, see Enforcing Primary and Unique Key Constraints Automatically.

Enforcing Foreign Key Constraints

A table's foreign key constraints are enforced during data load only if there is a pre-join projection that has that table as its anchor table. If no such pre-join projection exists, it is possible to load data that causes a constraint violation. Subsequently, a constraint violation error can occur when:

- An inner join query is processed.
- An outer join is treated as an inner join due to the presence of a foreign key.
- A new pre-join projection anchored on the table with the foreign key constraint is refreshed.

Detecting Constraint Violations Before You Commit Data

To detect constraint violations, you can load data without committing it using the COPY statement with the NO COMMIT option, and then perform a post-load check using the ANALYZE_CONSTRAINTS function. If constraint violations exist, you can roll back the load because you have not committed it. For more details, see Detecting Constraint Violations with ANALYZE_CONSTRAINTS.
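A minimal sketch of this load-check-rollback pattern follows; the table name and data file path are placeholders:

=> COPY inventory FROM '/tmp/inventory.dat' NO COMMIT;
=> SELECT ANALYZE_CONSTRAINTS('inventory');
-- If violations are reported, discard the uncommitted load:
=> ROLLBACK;
-- Otherwise, make the load permanent:
=> COMMIT;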
You can also configure your system to automatically enforce primary and unique key constraints during DML. For information on automatic enforcement, see Enforcing Primary and Unique Key Constraints Automatically.

Enforcing Primary and Unique Key Constraints Automatically

When you create a new constraint with CREATE TABLE or ALTER TABLE, you can specify whether the constraint is automatically enforced. You can also alter a constraint with ALTER TABLE (using the ALTER CONSTRAINT parameter) and specify whether it is automatically enforced. You enable or disable individual constraints using the ENABLED or DISABLED options. In addition, you can create multi-column constraints with CREATE TABLE or ALTER TABLE. All constraints are defined at the table level.

By checking any system table with an is_enabled column, you can confirm whether a PRIMARY or UNIQUE key constraint is currently enabled. The system tables that include the is_enabled column are CONSTRAINT_COLUMNS, TABLE_CONSTRAINTS, and PRIMARY_KEYS.

Automatic enforcement applies to current table content and to content you later add to the table.

- Enabling a Constraint on an Empty Table — If you create an enabled constraint on an empty table, the constraint is enforced on any content you later add to that table.
- Enabling a Constraint on a Populated Table — If you use ALTER TABLE to either enable an existing constraint or add a new constraint that is enabled, the constraint is immediately enforced for the current content, and for content you subsequently add to the table.

Important: If validation of the current content fails, Vertica completely rolls back the ALTER TABLE statement that caused the failure.

If you do not specify the ENABLED or DISABLED option when you create a constraint, the system relies on the settings of the configuration parameters EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault. If, for example, you create a new primary key constraint but do not enable or disable it, the system relies on the value of the parameter EnableNewPrimaryKeysByDefault.
If the parameter is set to 1 (enabled), the constraint you created is automatically enforced even though you did not specifically enable it when you created it.

- Enable or Disable a Constraint When Creating or Altering — You can explicitly enable or disable a constraint when you create or alter it. If you do so, the constraint remains enabled or disabled regardless of the settings of the parameters EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault.
- Creating or Altering a Constraint Without Enabling — You can also create or alter a constraint without explicitly enabling or disabling it through the ENABLED or DISABLED options. If you do so, Vertica looks at the settings of the parameters at the moment you create or alter the constraint to determine whether that constraint is enabled or disabled.

Important: When creating or altering a constraint without enabling it, Vertica uses the EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault settings that are in effect at the time of the creation or alteration.

(The original documentation includes a figure summarizing primary and unique key constraint enablement.)
Enabling or Disabling Automatic Enforcement of Individual Constraints

To enable or disable individual constraints, use the CREATE TABLE or ALTER TABLE statement with the ENABLED or DISABLED options, as shown in the following examples.

The following sample uses ALTER TABLE to create and enable a PRIMARY KEY constraint on a sample table called mytable:

ALTER TABLE mytable ADD CONSTRAINT primarysample PRIMARY KEY (id) ENABLED;

The following sample explicitly disables the constraint:

ALTER TABLE mytable ALTER CONSTRAINT primarysample DISABLED;

The following sample uses CREATE TABLE to create a PRIMARY KEY constraint without explicitly enabling it. In this case, the constraint is enforced only if EnableNewPrimaryKeysByDefault is set to 1 (enabled); if the parameter is at its default setting of 0 (disabled), the constraint is not enforced:

CREATE TABLE mytable (id INT PRIMARY KEY);

The following sample uses CREATE TABLE to create a PRIMARY KEY constraint and enable it. This statement enables the constraint regardless of how you set the parameter EnableNewPrimaryKeysByDefault:

CREATE TABLE mytable (id INT PRIMARY KEY ENABLED);

Checking Whether Constraints Are Enabled

Use the SELECT statement to list constraints and confirm whether they are enabled or disabled. The following example query lists all tables that have primary key or unique constraints, shows the constraint type, and indicates whether each constraint is enabled or disabled:
select table_name, constraint_name, constraint_type, is_enabled
from v_catalog.constraint_columns
where constraint_type in ('p', 'u')
order by table_name;

The following output shows the results of this query. The constraint_type column indicates whether the constraint is a primary key or unique constraint (p or u, respectively). The is_enabled column indicates whether the constraint is enabled or disabled (t or f, respectively).

 table_name | constraint_name | constraint_type | is_enabled
------------+-----------------+-----------------+------------
 table01    | pksample        | p               | t
 table02    | uniquesample    | u               | f
(2 rows)

The following example is similar but lists the associated columns instead of the tables. (You could include both tables and columns in the same query, if you want.)

select column_name, constraint_name, constraint_type, is_enabled
from v_catalog.constraint_columns
where constraint_type in ('p', 'u')
order by column_name;

Sample output follows.

 column_name | constraint_name | constraint_type | is_enabled
-------------+-----------------+-----------------+------------
 col1_key    | pksample        | p               | t
 vendor_key  | uniquesample    | u               | f
(2 rows)

The following example statement shows how to create a sample table with a multi-column constraint:

create table table09 (column1 int, column2 int, CONSTRAINT multicsample PRIMARY KEY (column1, column2) ENABLED);

Here is the output listing the associated columns:

 column_name | constraint_name | constraint_type | is_enabled
-------------+-----------------+-----------------+------------
 column1     | multicsample    | p               | t
 column2     | multicsample    | p               | t
(2 rows)
Choosing Default Enforcement for Newly Declared or Modified Constraints

The EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault parameter settings govern automatic enforcement of PRIMARY and UNIQUE key constraints.

Important: If these parameters are disabled (the default), the PRIMARY or UNIQUE key constraints you create or modify are not enforced unless you specifically enable them using CREATE TABLE or ALTER TABLE.

You do not need to restart your database after setting these parameters.

- To control enforcement of newly created PRIMARY keys, set the parameter EnableNewPrimaryKeysByDefault. To disable, keep the default setting of 0; to enable, set it to 1:

  ALTER DATABASE VMart SET EnableNewPrimaryKeysByDefault = 1;

- To control enforcement of newly created UNIQUE keys, set the parameter EnableNewUniqueKeysByDefault. To disable, keep the default setting of 0; to enable, set it to 1:

  ALTER DATABASE VMart SET EnableNewUniqueKeysByDefault = 1;

When you upgrade to Vertica 7.0.x, the primary and unique key constraints in any tables you carry over are disabled; existing constraints are not automatically enforced. To enable existing constraints and make them automatically enforceable, manually enable each constraint using the ALTER TABLE ALTER CONSTRAINT statement. This statement triggers constraint enforcement for the existing table contents, and rolls back if one or more violations occur.
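For example, a sketch of enabling one carried-over constraint (the table name is a placeholder; C_PRIMARY is the naming pattern Vertica uses for system-generated primary key constraints, as shown in Dropping Constraints above):

=> ALTER TABLE customer_dim ALTER CONSTRAINT C_PRIMARY ENABLED;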
How Enabled Primary and Unique Key Constraints Affect Locks

If you enable automatic constraint enforcement, Vertica uses an Insert-Validate (IV) lock. The IV lock is needed for operations where the system performs constraint validation for enabled PRIMARY or UNIQUE key constraints. Such operations include INSERT, COPY, MERGE, UPDATE, and MOVE_PARTITIONS_TO_TABLE.

How DML Operates with Constraints

With enforced PRIMARY or UNIQUE key constraints, DML operates in two stages:

- First stage. The first stage is the same as it would be for an unenforced constraint, taking, for example, an I lock. (Several sessions can do this concurrently while loading the same table.)
- Second stage. The second stage takes the IV lock to make sure that the data does not violate your constraint. After performing this check, Vertica can commit the data.

Delays in Bulk Loading Caused by Constraint Validation

In bulk load situations, some transactions can be temporarily blocked while PRIMARY or UNIQUE key constraints are validated. For example, suppose three sessions each hold an I lock for a concurrent bulk load of the same table. Session 1 takes an IV lock to validate constraints. Only one session can hold an IV lock on a given table; the other sessions can continue loading the table while holding I locks. Sessions 2 and 3 wait for session 1 to validate constraints and commit, which releases the IV lock. (If session 1 fails, its statement rolls back and the next session can obtain the IV lock. While sessions can load the table in parallel, they take turns obtaining the IV lock for the final stage of constraint validation.)

For information on lock modes and compatibility and conversion matrices, see Lock Modes in Vertica Concepts. See also the LOCKS and LOCK_USAGE sections in the SQL Reference Manual.
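While a bulk load is in its validation stage, you can observe the I and IV locks by querying the LOCKS system table; a minimal sketch (the column selection here is illustrative):

=> SELECT object_name, lock_mode, lock_scope FROM v_monitor.locks;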
Projections for Enabled Primary and Unique Key Constraints

To enforce PRIMARY and UNIQUE key constraints, Vertica creates special key projections as needed in response to DML or DDL, and checks them for constraint violations. If a constraint violation occurs, Vertica rolls back the statement and any special key projection it created. The system returns an error specifying the UNIQUE or PRIMARY KEY constraint that was violated.

If you add a constraint to an empty table, Vertica does not immediately create a special key projection for that constraint. Vertica defers creation of the special key projection until the first row of data is added to the table using a DML or COPY statement. If you add a constraint to a populated table, Vertica chooses an existing projection to enforce the constraint, if possible. If none of the existing projections are sufficient to validate the constraint, Vertica creates a new projection for the enabled constraint.

You can check PRIMARY and UNIQUE key constraint projections by querying the PROJECTIONS system table under the V_CATALOG schema; key constraint projections are flagged in the IS_KEY_CONSTRAINT_PROJECTION column (see the example query at the end of this section).

If you drop an enabled PRIMARY or UNIQUE key constraint, the system may drop an associated projection if one was automatically created. You can drop a specific projection even if a key constraint is enabled:

- If you drop a specific projection without including the CASCADE option in your DROP statement, Vertica issues a warning about dropping a projection for an enabled constraint.
- If you drop a specific projection and include the CASCADE option in your DROP statement, Vertica drops the projection without issuing the warning.

In either case, the next time Vertica needs to enforce the constraint for DML, the system creates a new special key projection, unless an existing projection can enforce the same enabled constraint. The time it takes to regenerate a key projection depends upon the volume of the table.

Note: If you subsequently use ANALYZE_CONSTRAINTS on a table that has enabled PRIMARY or UNIQUE key constraints (and thus their associated projections), ANALYZE_CONSTRAINTS can leverage the projections previously created for enforcement, resulting in a performance improvement for ANALYZE_CONSTRAINTS.
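The following sketch lists each projection together with the key-constraint flag; it assumes the IS_KEY_CONSTRAINT_PROJECTION column described above:

=> SELECT projection_name, anchor_table_name, is_key_constraint_projection
   FROM v_catalog.projections;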
Deciding Whether to Enable Primary and Unique Key Constraints

You have the option of automatically enforcing primary and unique key constraints. Depending on your specific scenario, you can either enable this feature or check constraints using ANALYZE_CONSTRAINTS. Consider these factors:

- Benefits of Enabling Primary and Unique Key Constraints
- Considerations Before Enabling Primary and Unique Key Constraints
- Where Constraints Are Enforced
- Impact of Floating Point Values in Primary Keys When Using Automatic Enforcement
- What Constraint Enforcement Does Not Control

For more information on using ANALYZE_CONSTRAINTS, see the Administrator's Guide section, Detecting Constraint Violations with ANALYZE_CONSTRAINTS.

Benefits of Enabling Primary and Unique Key Constraints

When you enable primary and unique key constraints, Vertica validates data before it is inserted. Because you do not need to check data using ANALYZE_CONSTRAINTS after insertion, query speed improves. Enabled key constraints, particularly on primary keys, can also help the optimizer produce faster query plans, particularly for joins: when a table has an enabled primary key constraint, the optimizer can assume that the table has no rows with duplicate values across the key set. Vertica automatically creates special-purpose projections, if necessary, to enforce enabled key constraints; in some cases Vertica can use an existing projection instead.

Considerations Before Enabling Primary and Unique Key Constraints

Multiple factors affect performance. The enforcement process can slow DML and bulk loading. If you are doing bulk loads, consider the size of your tables and the number of columns in your keys. You could decide to disable automatic enforcement for fact tables, which tend to be larger, but enable enforcement for dimension tables. For fact tables, you could choose manual key constraint validation using ANALYZE_CONSTRAINTS, and avoid the load-time overhead of automatic validation.

When you enable automatic enforcement of primary or unique key constraints, statement rollbacks occur if validation fails during DML; Vertica completely rolls back the statement causing the failure. When deciding whether to enable automatic enforcement, consider the impact of statements rolling back on violations. For example, suppose you issue ten insert statements, none of which have committed. If the sixth statement introduces a duplicate, that statement is rolled back; the other statements, which do not introduce duplicates, can commit.
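A minimal sketch of this statement-level behavior (the table and values are hypothetical):

=> CREATE TABLE t10 (id INT PRIMARY KEY ENABLED);
=> INSERT INTO t10 VALUES (1);  -- succeeds
=> INSERT INTO t10 VALUES (1);  -- violates the enabled primary key; only this statement rolls back
=> INSERT INTO t10 VALUES (2);  -- succeeds; the earlier successful insert is unaffected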
Note: Vertica performs primary and unique key constraint enforcement at the SQL statement level rather than the transaction level. You cannot defer primary or unique key enforcement until transaction commit.

Where Constraints Are Enforced

Automatic enforcement of PRIMARY and UNIQUE key constraints occurs in:

- INSERT statements — Both single-row insertions and INSERT statements that include a SELECT clause.
- Bulk loads — Bulk loads that use the COPY statement.
- UPDATE or MERGE statements — All UPDATE and MERGE statements.
- Meta-functions — COPY_PARTITIONS_TO_TABLE, MOVE_PARTITIONS_TO_TABLE, and SWAP_PARTITIONS_BETWEEN_TABLES.
- ALTER TABLE statements — Statements that include either the ADD CONSTRAINT or ALTER CONSTRAINT parameters, where you are enabling a constraint and the table has existing data.

Impact of Floating Point Values in Primary Keys When Using Automatic Enforcement

Vertica allows NaN, +Inf, and -Inf values in a FLOAT type column, even if the column is part of a primary key. Because FLOAT types provide imprecise arithmetic, Vertica recommends that you not use columns with floating point values within primary keys. If you do decide to use a FLOAT type within a primary key, note the following about primary key enforcement. (This behavior is the same regardless of whether you enable an automatic constraint or check constraints manually with ANALYZE_CONSTRAINTS.)

- For the purpose of enforcing key constraints, Vertica considers two NaN (or two +Inf, or two -Inf) values to be equal.
- If a table has an enabled single-column primary key constraint of type FLOAT, only one tuple can have a NaN value for the column; otherwise, the constraint is violated. This is also true for +Inf and -Inf values. Note that this differs from the IEEE 754 standard, which specifies that multiple NaN values are different from each other.
- A join on a single FLOAT column fails if the table that includes the primary key contains multiple tuples with NaN (or +Inf, or -Inf) values.

For information on the floating point type, see DOUBLE PRECISION (FLOAT).

What Constraint Enforcement Does Not Control

You can enable or disable automatic enforcement only for primary or unique keys. Vertica does not support automatic enforcement of foreign keys and referential integrity, except where the table includes a pre-join projection. You can manually validate foreign key constraints using the ANALYZE_CONSTRAINTS meta-function. Vertica does not support automatic enforcement of primary or unique keys on external tables.

Limitations on Using Automatic Enforcement for Local and Global Temporary Tables

This section describes limitations and related notes on using automatic enforcement of primary and unique key constraints with local and global temporary tables. For general information on temporary tables, refer to About Temporary Tables.

Limitations for Local and Global Temporary Tables

Vertica displays an error message if you add an enabled constraint to a local or global temporary table that contains data, because it cannot create projections for enabled constraints on a temporary table that is already populated.

Limitations Specific to Global Temporary Tables

You cannot use ALTER TABLE to add a new, or enable an existing, primary or unique key constraint on a global temporary table. Use CREATE TABLE to enable a constraint on a global temporary table (see the sketch at the end of this section). You can, however, use ALTER TABLE to add or enable a primary or unique key constraint on a local temporary table if the local temporary table is empty.

Note: You can use ALTER TABLE to disable an already enabled primary or unique key constraint on a global temporary table.
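A minimal sketch of enabling a primary key on a global temporary table at creation time (the table and column names are hypothetical, and ON COMMIT PRESERVE ROWS is just one possible row-retention policy):

=> CREATE GLOBAL TEMPORARY TABLE session_stage (
       id INT PRIMARY KEY ENABLED,
       payload VARCHAR(100))
   ON COMMIT PRESERVE ROWS;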
Detecting Constraint Violations with ANALYZE_CONSTRAINTS

Use the ANALYZE_CONSTRAINTS function to manually validate table constraints.

Ways to Use ANALYZE_CONSTRAINTS

You can use ANALYZE_CONSTRAINTS instead of, or as a supplement to, automatic PRIMARY and UNIQUE key enforcement. For information on automatic enforcement of PRIMARY and UNIQUE keys, see Enforcing Primary and Unique Key Constraints Automatically.

If you do enable PRIMARY or UNIQUE key constraints, note that ANALYZE_CONSTRAINTS does not check whether constraints are disabled or enabled. You can use ANALYZE_CONSTRAINTS where:

- PRIMARY or UNIQUE key constraints are disabled.
- Enabled and disabled constraints are mixed.

You can also use ANALYZE_CONSTRAINTS to validate the referential integrity of foreign keys. Vertica does not support automatic enforcement of foreign keys, except in cases where there is a pre-join projection.

How to Use ANALYZE_CONSTRAINTS to Detect Violations

The ANALYZE_CONSTRAINTS function analyzes and reports on constraint violations within the current schema search path. To check for constraint violations:

- Pass an empty argument to check for violations on all tables within the current schema.
- Pass a single table argument to check for violations on the specified table.
- Pass two arguments, a table name and a column or list of columns, to check for violations in those columns.

See the examples in ANALYZE_CONSTRAINTS for more information.
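The three invocation forms look like this (the table and column names are placeholders):

=> SELECT ANALYZE_CONSTRAINTS('');                          -- all tables in the search path
=> SELECT ANALYZE_CONSTRAINTS('inventory');                 -- one table
=> SELECT ANALYZE_CONSTRAINTS('inventory', 'product_key');  -- specific column(s)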
Impact of Floating Point Values in Primary Keys When Using ANALYZE_CONSTRAINTS

Vertica allows NaN, +Inf, and -Inf values in a FLOAT type column, even if the column is part of a primary key. Because FLOAT types provide imprecise arithmetic, Vertica recommends that you not use columns with floating point values within primary keys. If you do decide to use a FLOAT type within a primary key, note the following about primary key enforcement. (This behavior is the same regardless of whether you enable an automatic constraint or check constraints manually with ANALYZE_CONSTRAINTS.)

- For the purpose of enforcing key constraints, Vertica considers two NaN (or two +Inf, or two -Inf) values to be equal.
- If a table has an enabled single-column primary key constraint of type FLOAT, only one tuple can have a NaN value for the column; otherwise, the constraint is violated. This is also true for +Inf and -Inf values. Note that this differs from the IEEE 754 standard, which specifies that multiple NaN values are different from each other.
- A join on a single FLOAT column fails if the table that includes the primary key contains multiple tuples with NaN (or +Inf, or -Inf) values.

For information on the floating point type, see DOUBLE PRECISION (FLOAT).

Fixing Constraint Violations

When Vertica finds duplicate primary key or unique values at run time, use the DISABLE_DUPLICATE_KEY_ERROR function to suppress error messaging. Queries execute as though no constraints are defined on the schema, and the effects are session scoped.

Caution: When called, DISABLE_DUPLICATE_KEY_ERROR suppresses data integrity checking and can lead to incorrect query results. Use this function only after you insert duplicate primary keys into a dimension table in the presence of a pre-join projection. Correct the violations and re-enable integrity checking with REENABLE_DUPLICATE_KEY_ERROR.

The following series of commands creates a table named dim and the corresponding projection:

CREATE TABLE dim (pk INTEGER PRIMARY KEY, x INTEGER);
CREATE PROJECTION dim_p (pk, x) AS SELECT * FROM dim ORDER BY x UNSEGMENTED ALL NODES;

The next two statements create a table named fact and the pre-join projection that joins fact to dim:
CREATE TABLE fact (fk INTEGER REFERENCES dim(pk));
CREATE PROJECTION prejoin_p (fk, pk, x) AS SELECT * FROM fact, dim WHERE pk=fk ORDER BY x;

The following statements load values into table dim. The last statement inserts a duplicate primary key value of 1:

INSERT INTO dim VALUES (1,1);
INSERT INTO dim VALUES (2,2);
INSERT INTO dim VALUES (1,2); -- Constraint violation
COMMIT;

Table dim now contains duplicate primary key values, but you cannot delete the violating row because of the presence of the pre-join projection. Any attempt to delete the record results in the following error message:

ROLLBACK: Duplicate primary key detected in FK-PK join Hash-Join (x dim_p), value 1

To remove the constraint violation (pk=1), use the following sequence of commands, which puts the database back into the state it was in just before the duplicate primary key was added.

To remove the violation:

1. Save the original dim rows that match the duplicated primary key:

   CREATE TEMP TABLE dim_temp (pk INTEGER, x INTEGER);
   INSERT INTO dim_temp SELECT * FROM dim WHERE pk=1 AND x=1; -- original dim row

2. Temporarily disable error messaging on duplicate constraint values:

   SELECT DISABLE_DUPLICATE_KEY_ERROR();

   Caution: Remember that running the DISABLE_DUPLICATE_KEY_ERROR function suppresses the enforcement of data integrity checking.

3. Remove the original row that contains duplicate values:

   DELETE FROM dim WHERE pk=1;

4. Allow the database to resume data integrity checking:
   SELECT REENABLE_DUPLICATE_KEY_ERROR();

5. Reinsert the original values back into the dimension table:

   INSERT INTO dim SELECT * FROM dim_temp;
   COMMIT;

6. Validate your dimension and fact tables.

If you receive the following error message, it means that the duplicate records you want to delete are not identical; that is, the records contain values that differ in at least one column that is not a primary key, for example, (1,1) and (1,2):

ROLLBACK: Delete: could not find a data row to delete (data integrity violation?)

The difference between this message and the rollback message in the previous example is that a fact row contains a foreign key that matches the duplicated primary key that was inserted. A row with values from the fact and dimension table is now in the pre-join projection. For the DELETE statement (step 4 in the following example) to complete successfully, extra predicates are required to identify the original dimension table values (the values that are in the pre-join projection).

This example is nearly identical to the previous example, except that an additional INSERT statement joins the fact table to the dimension table by a primary key value of 1:

INSERT INTO dim VALUES (1,1);
INSERT INTO dim VALUES (2,2);
INSERT INTO fact VALUES (1);  -- New insert statement joins fact with dim on primary key value=1
INSERT INTO dim VALUES (1,2); -- Duplicate primary key value=1
COMMIT;

To remove the violation:

1. Save the original dim and fact rows that match the duplicated primary key:

   CREATE TEMP TABLE dim_temp (pk INTEGER, x INTEGER);
   CREATE TEMP TABLE fact_temp (fk INTEGER);
   INSERT INTO dim_temp SELECT * FROM dim WHERE pk=1 AND x=1; -- original dim row
   INSERT INTO fact_temp SELECT * FROM fact WHERE fk=1;

2. Temporarily suppress the enforcement of data integrity checking:

   SELECT DISABLE_DUPLICATE_KEY_ERROR();

3. Remove the duplicate primary keys. The following steps also implicitly remove all fact rows with the matching foreign key.

4. Remove the original row that contains duplicate values:

   DELETE FROM dim WHERE pk=1 AND x=1;

   Note: The extra predicate (x=1) specifies removal of the original (1,1) row, rather than the newly inserted (1,2) values that caused the violation.

5. Remove all remaining rows:

   DELETE FROM dim WHERE pk=1;

6. Re-enable integrity checking:

   SELECT REENABLE_DUPLICATE_KEY_ERROR();

7. Reinsert the original values back into the fact and dimension tables:

   INSERT INTO dim SELECT * FROM dim_temp;
   INSERT INTO fact SELECT * FROM fact_temp;
   COMMIT;

8. Validate your dimension and fact tables.

Reenabling Error Reporting

If you ran DISABLE_DUPLICATE_KEY_ERROR to suppress error reporting while fixing duplicate key violations, you can get incorrect query results going forward. As soon as you fix the violations, run the REENABLE_DUPLICATE_KEY_ERROR function to restore the default behavior of error reporting. The effects of this function are session scoped.
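The validation called for in the final step of the procedures above can be done with ANALYZE_CONSTRAINTS; a minimal sketch, where an empty result set indicates no violations:

=> SELECT ANALYZE_CONSTRAINTS('dim');
=> SELECT ANALYZE_CONSTRAINTS('fact');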
Using Text Search

Text Search allows you to quickly search the contents of a single CHAR, VARCHAR, LONG VARCHAR, VARBINARY, or LONG VARBINARY field within a table to locate a specific keyword. Currently, only American English is supported.

You can use this feature on columns that are queried repeatedly regarding their contents. After you create the text index, DML operations become slightly slower on the source table. This performance change results from syncing the text index and source table: any time an operation is performed on the source table, the text index updates in the background. Regular queries on the source table are not affected.

The text index contains all of the words from the source table's text field, as well as any other additional columns you included during index creation. Additional columns are not indexed; their values are just passed through to the text index. The text index is like any other table in the HPE Vertica Analytics Platform, except that it is linked to the source table internally.

You first create a text index on the table you plan to search. Then, after you have indexed your table, you can run a query against the text index for a specific keyword. This query returns a doc_id for each instance of the keyword. After querying the text index, joining the text index back to the source table should give a significant performance improvement over directly querying the source table about the contents of its text field.

Important: Do not alter the contents or definitions of the text index. If the contents or definitions of the text index are altered, the results do not appropriately match the source table.

Creating a Text Index

In the following example, you perform a text search using a source table called t_log. This source table has two columns:

- One column containing the table's primary key
- Another column containing log file information
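A minimal sketch of such a source table, together with a suitable projection (the projection requirement is described next); the column names and types here are assumptions based on the sample output later in this section:

=> CREATE TABLE t_log (id INT NOT NULL PRIMARY KEY, date DATE, text VARCHAR(2000));
=> CREATE PROJECTION t_log_p (id, date, text) AS
       SELECT id, date, text FROM t_log
       ORDER BY id SEGMENTED BY HASH(id) ALL NODES;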
You must associate a projection with the source table. Use a projection that is sorted by the primary key and either segmented by hash(id) or unsegmented. You can define this projection on the source table, along with any other existing projections.

Create a text index on the table for which you want to perform a text search:

=> CREATE TEXT INDEX text_index ON t_log (id, text);

The text index contains two columns:

- doc_id uses the unique identifier from the source table.
- token is populated with text strings from the designated column in the source table. The token column results from tokenizing and stemming the words found in the text column.

If your table is partitioned, your text index also contains a third column named partition:

=> SELECT * FROM text_index;
   token   | doc_id | partition
-----------+--------+-----------
 <info>    |      6 |      2014
 <warning> |      2 |      2014
 <warning> |      3 |      2014
 <warning> |      4 |      2014
 <warning> |      5 |      2014
 database  |      6 |      2014
 execute:  |      6 |      2014
 object    |      4 |      2014
 object    |      5 |      2014
 [catalog] |      4 |      2014
 [catalog] |      5 |      2014

You create a text index on a source table only once; you do not have to re-create it each time the source table is updated or changed. Your text index stays synchronized with the contents of the source table through any operation that is run on the source table. These operations include, but are not limited to:

- COPY
- INSERT
- UPDATE
- DELETE
- DROP PARTITION
- MOVE_PARTITION_TO_TABLE

When you move or swap partitions in a source table that is indexed, verify that the destination table already exists and is indexed in the same way.

Creating a Text Index on a Flex Table

In the following example, you create a text index on a flex table. The example assumes that you have created a flex table called mountains. See Getting Started in Using Flex Tables to create the flex table used in this example.

Before you can create a text index on your flex table, add a primary key constraint to the flex table:

=> ALTER TABLE mountains ADD PRIMARY KEY (__identity__);

Create a text index on the table for which you want to perform a text search. Tokenize the __raw__ column with the FlexTokenizer and specify the data type as LONG VARBINARY. It is important to use the FlexTokenizer when creating text indexes on flex tables because the data type of the __raw__ column differs from what the default StringTokenizer expects:

=> CREATE TEXT INDEX flex_text_index ON mountains (__identity__, __raw__) TOKENIZER public.FlexTokenizer(long varbinary);

The text index contains two columns:

- doc_id uses the unique identifier from the source table.
- token is populated with text strings from the designated column in the source table. The token column results from tokenizing and stemming the words found in the text column.

If your table is partitioned, your text index also contains a third column named partition.

=> SELECT * FROM flex_text_index;
    token    | doc_id
-------------+--------
 50.6        |      5
 Mt          |      5
 Washington  |      5
 mountain    |      5
 12.2        |      3
 15.4        |      2
 17000       |      3
 29029       |      2
 Denali      |      3
 Helen       |      2
 Mt          |      2
 St          |      2
 mountain    |      3
 volcano     |      2
 29029       |      1
 34.1        |      1
 Everest     |      1
 mountain    |      1
 14000       |      4
 Kilimanjaro |      4
 mountain    |      4
(21 rows)

You create a text index on a source table only once; you do not have to re-create it each time the source table is updated or changed. Your text index stays synchronized with the contents of the source table through any operation that is run on the source table. These operations include, but are not limited to:

- COPY
- INSERT
- UPDATE
- DELETE
- DROP PARTITION
- MOVE_PARTITION_TO_TABLE

When you move or swap partitions in a source table that is indexed, verify that the destination table already exists and is indexed in the same way.

Searching a Text Index

After you create a text index, write a query to run against the index to search for a specific keyword. In the following example, you use a WHERE clause to search for the keyword <WARNING> in the text index. The WHERE clause should use the stemmer you used
to create the text index. When you use the STEMMER keyword, it stems the keyword to match the keywords in your text index. If you did not use the STEMMER keyword, the default stemmer is v_txtindex.StemmerCaseInsensitive. If you used STEMMER NONE, you can omit the STEMMER keyword from the WHERE clause.

=> SELECT * FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('<WARNING>');
   token   | doc_id
-----------+--------
 <warning> |      2
 <warning> |      3
 <warning> |      4
 <warning> |      5
(4 rows)

Next, write a query to display the full contents of the source table that match the keyword you searched for in the text index:

=> SELECT * FROM t_log WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('<WARNING>'));
 id |    date    | text
----+------------+-----------------------------------------------------------------------------------------------
  4 | 2014-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45035968
  5 | 2014-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45030
  2 | 2013-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 4503
  3 | 2013-06-04 | 11:00:49.568 unknown:0x7f9207607700 [Catalog] <WARNING> validateDependencies: Object 45066
(4 rows)

Use the doc_id to find the exact location of the keyword in the source table. The doc_id matches the unique identifier from the source table, which allows you to quickly find the instance of the keyword in your table.

Performing a Case-Sensitive and Case-Insensitive Text Search Query

Your text index is optimized to match all instances of words, depending upon your stemmer. By default, the case-insensitive stemmer is applied to all text indexes that do not specify a stemmer. Therefore, if the queries you plan to write against your text index are case sensitive, Hewlett Packard Enterprise recommends that you use a case-sensitive stemmer to build your text index.

The following examples show queries that match case-sensitive and case-insensitive words that you can use when performing a text search.

This query finds case-insensitive records in a case-insensitive text index:
=> SELECT * FROM t_log WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseInsensitive('warning'));

This query finds case-sensitive records in a case-sensitive text index:

=> SELECT * FROM t_log_case_sensitive WHERE id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('Warning'));

Including and Excluding Keywords in a Text Search Query

Your text index also allows you to perform more detailed queries to find multiple keywords or to omit results containing other keywords. The following example shows a more detailed query that you can use when performing a text search. In this example, t_log is the source table and text_index is the text index. The query finds records that either contain:

- Both the words '<WARNING>' and 'validate'
- Only the word '[Log]', and do not contain 'validateDependencies'

SELECT * FROM t_log WHERE (
    id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('<WARNING>'))
    AND (id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('validate'))
         OR id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('[Log]')))
    AND NOT id IN (SELECT doc_id FROM text_index WHERE token = v_txtindex.StemmerCaseSensitive('validateDependencies')));

This query returns the following results:

 id |    date    | text
----+------------+------------------------------------------------------------------------------------------------
 11 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607702 [Log] <WARNING> validate: Object 4503 via fld num_all_roles
 13 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607706 [Log] <WARNING> validate: Object 45035 refers to root_i3
 14 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607708 [Log] <WARNING> validate: Object 4503 refers to int_2
 17 | 2014-05-04 | 11:00:49.568 unknown:0x7f9207607700 [Txn] <WARNING> Begin validate Txn: fff0ed17 catalog editor
(4 rows)

Dropping a Text Index

Dropping a text index removes the specified text index from the database. You can drop a text index when:
- It is no longer queried frequently.
- An administrative task needs to be performed on the source table and requires the text index to be dropped.

Dropping the text index does not drop the source table associated with it. However, if you drop the source table associated with a text index, that text index is also dropped; Vertica considers the text index a dependent object.

The following example illustrates how to drop a text index named text_index:

=> DROP TEXT INDEX text_index;
DROP INDEX

Stemmers and Tokenizers

Vertica provides default stemmers and tokenizers. You can also create your own custom stemmers and tokenizers. The following topics explain the default stemmers and tokenizers, and the requirements for creating custom stemmers and tokenizers in Vertica:

- Vertica Stemmers
- Vertica Tokenizers
- Configuring a Tokenizer
- Requirements for Custom Stemmers and Tokenizers

Vertica Stemmers

Vertica stemmers use the Porter stemming algorithm to find words derived from the same base/root word. For example, if you perform a search on a text index for the keyword database, you might also want results containing the word databases. To achieve this type of matching, Vertica stores words in their stemmed form when using any of the v_txtindex stemmers.

The HPE Vertica Analytics Platform provides the following stemmers:

- v_txtindex.Stemmer(long varchar): Not sensitive to case; outputs lowercase words. Stems strings from a Vertica table. Alias of StemmerCaseInsensitive.
- v_txtindex.StemmerCaseSensitive(long varchar): Sensitive to case. Stems strings from a Vertica table.
- v_txtindex.StemmerCaseInsensitive(long varchar): Default stemmer used if no stemmer is specified when creating a text index. Not sensitive to case; outputs lowercase words. Stems strings from a Vertica table.
- v_txtindex.caseInsensitiveNoStemming(long varchar): Not sensitive to case; outputs lowercase words. Does not use the Porter stemming algorithm.

Examples

The following examples show how to use a stemmer when creating a text index.

Create a text index using the StemmerCaseInsensitive stemmer:

=> CREATE TEXT INDEX idx_100 ON top_100 (id, feedback) STEMMER v_txtindex.StemmerCaseInsensitive(long varchar) TOKENIZER v_txtindex.StringTokenizer(long varchar);

Create a text index using the StemmerCaseSensitive stemmer:

=> CREATE TEXT INDEX idx_unstruc ON unstruc_data (__identity__, __raw__) STEMMER v_txtindex.StemmerCaseSensitive(long varchar) TOKENIZER public.FlexTokenizer(long varbinary);

Create a text index without using a stemmer:

=> CREATE TEXT INDEX idx_logs FROM sys_logs ON (id, message) STEMMER NONE TOKENIZER v_txtindex.StringTokenizer(long varchar);

Vertica Tokenizers

The HPE Vertica Analytics Platform provides the following pre-configured tokenizers:

- public.FlexTokenizer(long varbinary): Splits the values in your flex table by white space.
- v_txtindex.StringTokenizer(long varchar): Splits the string into words by splitting on white space.
- v_txtindex.AdvancedLogTokenizer: Uses the default parameters for all tokenizer parameters. For more information, see Advanced Log Tokenizer.
- v_txtindex.BasicLogTokenizer: Uses the default values for all tokenizer parameters except minorseparator, which is set to an empty list. For more information, see Basic Log Tokenizer.
- v_txtindex.WhitespaceLogTokenizer: Uses default values for tokenizer parameters, except for majorseparators, which uses E' \t\n\f\r', and minorseparator, which uses an empty list. For more information, see Whitespace Log Tokenizer.

Examples

The following examples show how you can use a default tokenizer when creating a text index.

Use the StringTokenizer to create an index on the top_100 table:

=> CREATE TEXT INDEX idx_100 ON top_100 (id, feedback) TOKENIZER v_txtindex.StringTokenizer(long varchar);