Introduction to Big Data Training

BD-350

The quickest way to get introduced to NoSQL and Big Data offerings. Learn and experience Big Data solutions including Hadoop HDFS, Map Reduce, and NoSQL DBs: Document Based DB (Mongo DB), Column Based DB (Cassandra DB, HBase) and In Memory Data Grids (XAP and Apache Spark).

This training introduces the popular current NoSQL and Hadoop solutions and provides basic hands-on experience with each of them.

Audience

Target Audience:
System Administrators, Operations, Database Administrators, Support, DevOps, Developers

Prerequisites:
Relational Database SQL experience

Course Topics

Day 1: Lesson 1: Course Introduction
– Course Introduction
– Courseware walkthrough
– Documentation
– Lab

Day 1: Lesson 2: Introduction to Big Data 
– What is Big Data?
– Big Data challenges and complexity
– General concepts
– Architecture considerations
– Presenting use cases of internet companies (e.g. Facebook)
– The Data Scientist
– RDBMS: Advantages and disadvantages / Impedance Mismatch
– No-SQL vs. Traditional Enterprise Relational Data:
– CAP theorem vs. ACID / Dynamic schema, sharding, replications and caching / Performance
– Availability vs Consistency
– No-SQL types and use cases
– When (not) to use No-SQL?

Day 1+2: Lesson 3: Introduction to Hadoop and HDFS
– Hadoop Pseudo-Distributed Mode installation
– Lab: Hadoop Installation
– HDFS Assumptions and Goals
– Scale and Feature requirements
– Architecture
– Data Replication
– Robustness
– Data Organization
– Accessibility
– Space Reclamation
– Lab (Install Configure and Experience)

Day 2: Lesson 4: Hadoop Map Reduce
– Map Reduce Basics
– Inputs and Outputs
– Map Reduce Code Example
– Map Reduce Additional Details
– Map Reduce V1 vs. V2
– YARN: Yet Another Resource Negotiator
– Lab (Install Configure and Experience)
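
As a rough illustration of the map, shuffle and reduce steps covered in this lesson, here is a minimal word-count sketch in plain Python. It is not the courseware's own example and does not use the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and everything runs in a single process, whereas Hadoop would distribute the same steps across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling word_count with a list of text lines returns a dictionary that maps each lowercased word to the number of times it occurs.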

Day 3: Lesson 5: Introduction to In Memory Data Grids
– Why in Memory Data Grid?
– IMDG Terminology Comparison to Common Platforms and Servers
– IMDG (XAP) Runtime Environment
– XAP Application Components
– XAP Space Topologies
– Configuring your Environment
– XAP Web Dashboard
– XAP Management Center (gs-ui)
– Lab (Install Configure and Experience)

Day 3: Lesson 6: Introduction to Apache SPARK 
– Apache SPARK Introduction
– Getting Started
– SPARK architecture
– SPARK processing
– Map Reduce
– Example
– Lab (Install Configure and Experience)

Day 4: Lesson 7: Introduction to Document Databases – Mongo DB
– Mongo DB Introduction
– Getting Started – Installation
– Mongo DB basic commands
– Aggregation
– Replication
– Sharding
– Sharded Cluster Requirements and configuration
– Shard Cluster Deployment
– Other considerations
– Lab (Install Configure and Experience)

Day 4: Lesson 8: Introduction to Column Based Databases – Apache Cassandra DB 
– Cassandra DB Introduction
– Getting Started
– Google Big Table
– Amazon Dynamo
– Cassandra Query Language Shell – CQLSH
– Replication & Partitioning
– Basic administration
– Cluster configuration
– Lab (Install Configure and Experience)

Day 5: Lesson 9: Introduction to Column Based Databases – Hadoop HBase DB + NOSQL Comparison 
– HBase DB Introduction
– Getting Started
– HBase Data Model
– HBase Shell and Basic Commands
– Physical Model
– Architecture
– Cluster configuration
– Lab (Install Configure and Experience)
– NO SQL DBs General Comparison
– Performance Comparison
– When to use which?
– Lab (Install Configure and Experience)

Day 5: Lesson 10: Other Hadoop Components in Short
– HIVE
– PIG
– HUE – UI
– Sqoop and Flume – Data integration
– Oozie – Workflow
– ZooKeeper – distributed coordination service
– Connectors
– Final Lab Session (Combine all the pieces)
– Summary

Day 5: Lesson 11: The Data Architect + Summary
– The Data Architect
– Every Data element should be analyzed
– Data Characteristics (Read rarely, Read once, Read many, Write once, Write many (updated after insert), Read if exists)
– Final Lab Session (Combine all the pieces)
– Summary
– Wrap Up

© Copyright - Skilit