Hadoop Administration

Timeline: 4 Days

Topics

  • A Brief History of Hadoop
  • Core Hadoop Components
  • Fundamental Concepts
  • General Planning Considerations
  • Choosing Hardware
  • Network Considerations
  • Configuring Nodes
  • Planning for Cluster Management
  • HDFS Features
  • Writing and Reading Files
  • NameNode Considerations
  • HDFS Security
  • NameNode Web UI
  • Hadoop File Shell
  • Pulling Data from External Sources with Flume
  • Importing Data from Relational Databases with Sqoop
  • REST Interfaces
  • Best Practices
  • MapReduce Overview
  • Features of MapReduce
  • Architectural Overview
  • YARN: MapReduce Version 2
  • Failure Recovery
  • The JobTracker Web UI
  • Configuration and Deployment Types
  • Installing Hadoop
  • Specifying the Hadoop Configuration
  • Initial HDFS and MapReduce Configuration
  • Log Files
  • What is a Hadoop Client?
  • Installing and Configuring Hadoop Clients
  • Installing and Configuring Hue
  • Hue Authentication and Configuration
  • Advanced Configuration Parameters
  • Configuring Hadoop Ports
  • Explicitly Including and Excluding Hosts
  • Configuring HDFS for Rack Awareness and HDFS High Availability
  • Why Hadoop Security is Important
  • Hadoop’s Security System Concepts
  • What Kerberos Is and How It Works
  • Securing a Hadoop Cluster with Kerberos
  • Managing Running Jobs
  • Scheduling Hadoop Jobs
  • Configuring the FairScheduler
  • Checking HDFS Status
  • Copying Data Between Clusters
  • Adding/Removing Cluster Nodes
  • Rebalancing the Cluster
  • NameNode Metadata Backup
  • Cluster Upgrades
  • General System Monitoring
  • Managing Hadoop’s Log Files
  • Monitoring the Clusters
  • Common Troubleshooting Issues