HAWQ Architecture and Implementation

Timeline: 4 Days

Prerequisites

  • HAWQ Introduction

Topics

  • Master, Segment Hosts, Segments
  • Process Components
  • Interconnect
  • Redundancy and Failover
  • Environment Variables
  • Essential configuration
  • Session configuration variables
  • psql
  • SQL Lexicon and Expressions
  • Schemas, tables, and data loading
  • Joins
  • Arrays and Array Aggregates
  • Window Functions
  • Other Functions
  • UDFs, UDAs
  • Databases, Schemas, Tables, and Objects
  • Templates
  • Creating tables
  • Customizing table definition
  • HDFS Overview
  • HDFS and HAWQ table storage
  • HDFS Admin highlights
  • Data Loading overview
  • Loading data internal to HDFS
  • Loading data external to HDFS
  • Unloading Data
  • Best practices
  • What is PXF
  • Load, query, unload
  • Features and differentiators
  • Steps
  • Components
  • Profiles
  • Troubleshooting
  • Running Queries in HAWQ
  • Query Planning and Dispatch
  • HAWQ SQL Query caveats
  • HAWQ Query Plans
  • Parallel Query Execution
  • Query Workflow
  • Unsupported Features
  • Distribution behavior
  • Types of Data Distribution
  • Impact on motions
  • JOIN keys and data types
  • Data Skew
  • Query Execution Skew
  • Practical data loading
  • Table Partitioning objectives
  • Multi-level partitioning
  • Loading data into Partitioned tables
  • Exchanging Partitions
  • pg_partitions view
  • Partition elimination
  • Practical data loading
  • Reading a query plan
  • EXPLAIN and EXPLAIN ANALYZE
  • Query plan operators
  • Optimizing queries, joins, sorts & aggregation
  • Optimizing Query Memory and Spill Files
  • Optimizer behavior
  • Handling sub-queries
  • Handling Common Table Expressions
  • DML and CREATE TABLE AS
  • Statistics Collection