Course Introduction
Getting intellectuals ready to become Big Data Experts!
This seven-day hands-on training, led by industry experts, aims to open up advanced career opportunities for attendees as SQL developers, data analysts, business intelligence specialists, developers, system architects, and database administrators. Attendees get extensive hands-on practice with advanced Big Data tools and technologies such as Hadoop, Cloudera, Hive, and Sqoop.
This course will teach students how to:
- Use Extract, Transform, Load (ETL) processes to prepare data from a MySQL database and load it into HDFS using Sqoop.
- Use Data Definition Language (DDL) statements to create or alter structures in the metastore for use by Hive and Impala.
- Use query language statements in Hive and Impala to analyze data on a cluster.
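To make the three objectives above concrete, here is a minimal sketch of the same flow: extract and load, define a table with DDL, then query it. It is not the course's tooling. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, Sqoop, and Hive/Impala so that it runs anywhere, and the table names (`orders`, `orders_clean`) and sample rows are hypothetical.

```python
import sqlite3


def run_pipeline():
    # Source database: a stand-in for the MySQL system that Sqoop would read from.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 4.5)],
    )

    # Extract: pull the rows out of the source system.
    rows = source.execute("SELECT id, customer, amount FROM orders").fetchall()
    source.close()

    # Transform: normalise customer names to upper case.
    cleaned = [(order_id, customer.upper(), amount)
               for order_id, customer, amount in rows]

    # Target database: a stand-in for tables registered in the metastore.
    warehouse = sqlite3.connect(":memory:")

    # DDL: create the structure that will hold the loaded data.
    warehouse.execute(
        "CREATE TABLE orders_clean (id INTEGER, customer TEXT, amount REAL)"
    )

    # Load: write the transformed rows into the target table.
    warehouse.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", cleaned)

    # Query: analyze the loaded data (total spend per customer).
    totals = warehouse.execute(
        "SELECT customer, SUM(amount) FROM orders_clean "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    warehouse.close()
    return totals


if __name__ == "__main__":
    print(run_pipeline())
```

On a real cluster, the extract and load steps would be handled by a Sqoop import, and the DDL and query statements would be written in HiveQL or Impala SQL instead of SQLite's dialect.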
Course Audience
This course is designed for:
- Newcomers to the field, recent graduates, and third-year and final-year students in the Computer Science, IT, and Software Engineering disciplines.
- Professionals from the computer science domain who want to move into Big Data, e.g. business intelligence experts, data scientists, and data analysts.
- Executives who want to build initial knowledge of how the Big Data ecosystem affects their organization's growth.
Course Schedule
Syllabus - What you will learn from this course
Introduction to big data
- Introduction to Analytics & Architecture
- What is High Performance Computing?
- What is streaming data?
- What is visualization?
- What is Big Data?
- Your first Big Data application on AWS
Introduction to data analysis, storage & processing solutions
- Data analytics and data analysis concepts
- Introduction to the challenges of data analytics
- Introduction to Amazon S3
- Introduction to data lakes
Storage & processing solutions
- Introduction to data storage methods
- Introduction to data processing methods
- Introduction to batch data processing
- Introduction to stream data processing
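The difference between the batch and stream processing topics listed above can be illustrated with a small sketch. It assumes a hypothetical list of sensor readings and computes the same average in two ways: once over the complete data set (batch), and once incrementally as each value arrives (stream). The function names are illustrative only and do not come from any course material.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    # Batch: the whole data set is available before processing starts.
    return sum(readings) / len(readings)


def stream_averages(readings: Iterable[float]) -> Iterator[float]:
    # Stream: values arrive one at a time, so only running state is kept
    # and an updated result is emitted after every new value.
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0]
    print(batch_average(data))
    print(list(stream_averages(data)))
```

The batch version returns a single answer once all data is collected, while the stream version yields an updated answer after each reading without storing the full history.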
Data structure and types
- Introduction to source data storage
- Introduction to structured data stores
- Introduction to semi-structured and unstructured data stores
- Understanding data integrity
- Understanding database consistency
- Introduction to ETL process
- Introduction to analyzing data
- Introduction to visualizing data
Big Data analytics & architecture
- Big Data Analytics on Amazon Web Services (AWS)
- Introduction to Amazon EMR
- Getting started with real-time data analytics on AWS
- Getting started with real-time streaming data in under 5 minutes
- Big Data on AWS – structured, unstructured, and streaming data
- Evolving your Big Data use cases from batch to real-time
- Building Big Data solutions with Amazon EMR and Amazon Redshift
- Defining Big Data
- Example Big Data Stacks
- Big Data Framework | Hadoop Tutorial for Beginners
- Big Data Architectural Patterns and Best Practices on AWS
- Architectural Patterns for Big Data on AWS
Big Data, HPC & Streaming
- High Performance Computing (HPC) with Amazon Web Services
- High Performance Computing in the Cloud with AWS and Cycle Computing
- Large Scale Processing and Huge Data sets
- What is a Data Stream?
- What is Streaming Data?
- What Is Amazon Kinesis Data Streams?
- Perform Basic Stream Operations
- Creating a Stream
Software development & Platform Technologies
- Architecture
- DevOps
- Programming Languages
- Scripting Languages
- Mobile Applications
- Web Development
- Software architecture
- Software development processes and methodologies: definition
- Software architecture and design
- Introduction to programming
- Overview of main programming languages
- Introduction to C, C#, C++, .NET, Java, Python and others
- Introduction to operating systems and virtualization
- How is virtualization used in the cloud?