Item number: 132047804

Big Data Concept + Tools + Techniques Training

€398,00 excl. tax (€481,58 incl. tax)

Big Data Concept + Tools + Techniques e-learning: online training with certified teachers, quizzes, assessments, a test exam, live labs, tips and tricks, and a certificate.

Brand:
Big Data
Discounts:
  • Buy 2 for €390,04 each and save 2%
  • Buy 3 for €386,06 each and save 3%
  • Buy 4 for €382,08 each and save 4%
  • Buy 5 for €378,10 each and save 5%
  • Buy 10 for €358,20 each and save 10%
  • Buy 25 for €338,30 each and save 15%
  • Buy 50 for €318,40 each and save 20%
Availability:
In stock
Delivery time:
Ordered before 5 p.m.! Start today.
  • Award Winning E-learning
  • Lowest price guarantee
  • Personalized service by our expert team
  • Pay safely online or by invoice
  • Order and start within 24 hours

Big Data Concept + Tools + Techniques E-Learning

In the modern world, data is being generated at an exponential rate. Business data generation is increasing at a similarly rapid rate. Only a small percentage of business data is structured data in rows and columns of databases. This data proliferation requires a rethinking of traditional techniques for capture, storage, and processing. Big data is a term that describes data sets so big they can’t be managed with traditional database systems. Big Data is also a collection of tools and techniques aimed at solving these problems.

Learning Kits are structured learning paths, mainly within the Emerging Tech area. A Learning Kit keeps the student working toward an overall goal, helping them achieve their career aspirations. Each part takes the student step by step through a diverse set of topic areas. Learning Kits are made up of required tracks, which contain all of the available learning resources, such as Assessments (Final Exams), a Mentor, Practice Labs and, of course, e-learning. All resources are accessible for 365 days from first activation.

This Learning Kit with more than 25 hours of learning is divided into three tracks:

Course content

Big Data Infrastructures

In this learning, the focus will be on big data concepts, non-relational data, and big data analytics.

Courses (7 hours +)

The Big Data Technology Wave

Big Data in Perspective

Course: 17 Minutes

  • Course Introduction
  • Introducing Big Data
  • The Biggest Wave Yet
  • Emerging Technologies

Global Data

Course: 14 Minutes

  • Defining Big Data
  • Key Terms for Data
  • Sizing Big Data

The Key Contributors

Course: 10 Minutes

  • The Original Key Contributors
  • The Distro Companies

The Apache Software Foundation

Course: 10 Minutes

  • Apache Software Foundation
  • Apache Projects
  • Other Apache Projects
  • Other Open Source Projects

Big Data Stack

Course: 13 Minutes

  • The Big Data Stack
  • Big Data Components
  • NoSQL Databases

Hadoop in Detail

Course: 31 Minutes

  • Distributed Computing
  • Design Principles of Hadoop
  • Functional View of Hadoop
  • HDFS in Action
  • YARN in Action
  • MapReduce in Action
  • Spark in Action

Practice: Big Data elements and functions

Course: 15 Minutes

  • Exercise: Working with Big Data Elements

Big Data Opportunities and Challenges

Big Data Teams

Course: 28 Minutes

  • Course Introduction
  • The Big Data Team
  • Business Team Members
  • Analytics Team Members
  • Data Solutions Team Members
  • Cluster Team Members
  • Big Data Impacting IT

Big Data Projects

Course: 25 Minutes

  • DIY Supercomputing
  • Hadoop in the Clouds
  • Big Data and Data Warehouses
  • Business Case for Big Data
  • Big Data and RDBMS
  • Data Center Projects

Big Data Use Cases

Course: 20 Minutes

  • Data Analytics
  • Big Data Engines
  • Common Analytics Use Cases
  • Big Data Impacting the Globe

Opportunities and Challenges

Course: 32 Minutes

  • Global Increasing Digital Volume
  • The Big Companies
  • Big Data Opportunity
  • Big Data Challenges
  • Challenges of Security and Privacy
  • Planning for Big Data
  • Big Data Impacting Business
  • Practice: Challenges and Opportunities of Big Data
  • Exercise: Challenges and Opportunities of Big Data

Big Data Concepts: Getting to Know Big Data

Course: 43 Minutes

  • Course Overview
  • What Is Big Data?
  • Sources of Big Data
  • Characteristics of Big Data
  • Structured and Unstructured Data
  • Big Data Analytics
  • Advantages of Big Data Analytics
  • Big Data Analytics: Domain Use Cases
  • Big Data Analytics: Netflix Use Case
  • Big Data Analytics: Amazon Use Case
  • Major Challenges in Big Data
  • Course Summary

Big Data Concepts: Big Data Essentials

Course: 46 Minutes

  • Course Overview
  • Raw Data and Big Data
  • Data Warehousing and Big Data
  • Big Data Computing Systems
  • Horizontal and Vertical Scaling
  • Features, Benefits, and Use Cases of Hadoop
  • Hadoop: Components
  • Hadoop: Migration to the Cloud
  • Hadoop and Cloud Computing
  • Features of Big Data Storage Systems
  • In-memory Storage Systems
  • Course Summary

Non-relational Data: Non-relational Databases

Course: 52 Minutes

  • Course Overview
  • Non-relational Databases
  • The NoSQL Approach
  • Benefits of NoSQL
  • Document Databases
  • Key-value Data Stores
  • Graph Databases
  • Columnar Databases
  • HBase Architecture
  • Multi-model Databases
  • Next Generation NewSQL Databases
  • Course Summary

Big Data Analytics: Techniques for Big Data Analytics

Course: 39 Minutes

  • Course Overview
  • Big Data Analytics Challenges
  • Big Data Analytics Stack Layers
  • Big Data Ingestion
  • The Data Processing Layer
  • The Data Storage Layer
  • Pillars of Big Data Architecture
  • Batch Processing and Big Data
  • Stream Processing and Big Data
  • Lambda Architecture and Use Cases
  • Kappa Architecture
  • Course Summary

Big Data Analytics: Spark for High-speed Big Data Analytics

Course: 51 Minutes

  • Course Overview
  • The Core Characteristics of Apache Spark
  • Components of the Apache Spark Architecture
  • Apache Spark Use Case: Uber Using Spark
  • Apache Spark Use Case: Alibaba Using Spark
  • Apache Spark Use Case: The Healthcare Industry
  • Apache Spark vs. Hadoop
  • Top Apache Spark Use Cases
  • Apache Spark's Main Features
  • Apache Spark Performance Optimization Techniques
  • Apache Spark Best Practices
  • Course Summary

Harnessing Data Volume & Velocity: Big Data to Smart Data

Course: 39 Minutes

  • Course Overview
  • Comparing Big Data and Smart Data
  • Smart Data and Edge Technologies
  • Big Data to Smart Data Formation
  • Smart Data and Smart Processes
  • Smart Data Use Cases
  • Smart Data Life Cycle
  • Big Data to Smart Data Using k-NN
  • Smart Data Frameworks
  • Smart Data to Business
  • Clustering Smart Data
  • Smart Data Integration
  • Exercise: Transform Big Data to Smart Data

Securing Big Data Streams

Course: 1 Hour, 3 Minutes

  • Course Overview
  • Big Data Security Concerns
  • Streaming Data Security Concerns
  • NoSQL Database Security Concerns
  • Distributed Processing Security Risks
  • Data Mining and Analytics Privacy Flaws
  • End-Point Device Tampering Risks
  • Secure Big Data
  • Secure Data Streams
  • Secure Data In Motion
  • End-Point Input Validation and Filtering
  • Secure Data at Rest with Symmetric Ciphers
  • Exercise: Securing Big Data Streams

Assessment:

  • Big Data Infrastructures

Emerging New Age Architectures

In this learning, the focus will be on cloud data platforms, data lakes, and modern warehouses.

Courses (5 hours +)

Cloud Data Platforms: Cloud Computing

Course: 52 Minutes

  • Course Overview
  • Cloud Computing and Its Characteristics
  • Cloud Computing: Use Cases and Benefits
  • Cloud Computing Services: Storage and Compute Power
  • Types of Cloud Compute Power
  • Types of Cloud Storage
  • Cloud Computing Models: PaaS, IaaS, SaaS, and FaaS
  • Cloud Computing Model Comparison
  • Components of Cloud Computing Architectures
  • Cloud Service Provider Comparison
  • Cloud Elasticity and Scalability
  • Course Summary

Cloud Data Platforms: Cloud-based Applications & Storage

Course: 53 Minutes

  • Course Overview
  • Deploying Applications on Cloud Platforms
  • Characteristics of Cloud-ready Applications
  • Types of Cloud Deployment Models
  • Cloud Deployment Tools
  • Considerations for Cloud Application Deployment
  • CPU Virtualization, Memory, and I/O Devices
  • Cloud Storage Platforms
  • Cloud Storage Technologies
  • HDFS and Amazon S
  • Types of Data Centers
  • Course Summary

Cloud Data Platforms: AWS, Azure, & GCP Comparison

Course: 56 Minutes

  • Course Overview
  • Cloud Data Platforms: Amazon Web Services
  • Cloud Data Platforms: Microsoft Azure
  • Cloud Data Platforms: Google Cloud Platform
  • Cloud Analytics
  • Popular Cloud Analytics Tools
  • Cloud Computing Challenges: Security
  • Cloud Computing Challenges: Compliance
  • Cloud Computing Challenges: Cost Management
  • Cloud Computing Challenges: Governance
  • Future of Cloud Computing
  • Course Summary

Data Lakes and Modern Data Warehouses: Data Lakes

Course: 1 Hour, 19 Minutes

  • Course Overview
  • Data Lake Evolution
  • Modern Data Lake Architecture
  • Data Lakes: Key Concepts
  • Data Lake Maturity Stages
  • Data Swamps
  • Data Lake Platforms
  • Governed Data Lakes
  • Data Lakes: Risks and Challenges
  • Data Lakes vs. Data Warehouses
  • Course Summary

Data Lakes and Modern Data Warehouses: Modern Data Warehouses

Course: 1 Hour, 10 Minutes

  • Course Overview
  • Data Warehouses and Their Characteristics
  • Modern Data Warehouses: Key Concepts and Stages
  • Amazon Redshift
  • Google BigQuery
  • Modern Data Warehouses: Architecture and Processes
  • Modern Data Warehouses: Techniques
  • Data Warehouse Solutions: Batch Processing
  • Data Warehouse Solutions: Real-time Processing
  • Data Warehouse Solutions: Streaming Analytics
  • Hybrid Modern Data Warehouse
  • Course Summary

Data Lakes and Modern Data Warehouses: Azure Databricks & Data Pipelines

Course: 1 Hour, 2 Minutes

  • Course Overview
  • Azure Databricks: Features and Architecture
  • Azure Databricks: Pros and Cons
  • Snowflake Data Warehouses: Features and Architecture
  • Snowflake Data Warehouses: Pros and Cons
  • Data Pipelines
  • Components of a Data Pipeline
  • Advantages of a Data Pipeline
  • Types of Data Pipeline Tools
  • Comparing Data Pipeline Tools
  • Building a Data Pipeline
  • Course Summary

Assessment:

  • Emerging New Age Architectures

Apache Spark

Explore the basics of Apache Spark, an analytics engine used for big data processing.

Courses

Accessing Data with Spark (3 hours+)

Accessing Data with Spark: An Introduction to Spark

Course: 1 Hour, 7 Minutes

  • Course Overview
  • Introduction to Spark and Hadoop
  • Resilient Distributed Datasets (RDDs)
  • RDD Operations
  • Spark DataFrames
  • Spark Architecture
  • Spark Installation
  • Working with RDDs
  • Creating DataFrames from RDDs
  • Contents of a DataFrame
  • The SQLContext
  • The map() Function of an RDD
  • Accessing the Contents of a DataFrame
  • DataFrames in Spark and Pandas
  • Exercise: Working with Spark

Accessing Data with Spark: Data Analysis Using the Spark DataFrame API

Course: 1 Hour, 12 Minutes

  • Course Overview
  • Performance Improvements in Spark
  • Broadcast Variables and Accumulators
  • Loading Data into a DataFrame
  • Sampling the Contents of a DataFrame
  • Grouping and Aggregations
  • Visualizing Data in a DataFrame
  • Trimming and Cleaning Data
  • User-Defined Functions and DataFrames
  • Combining Filters, Aggregations, and Sorting
  • Using Broadcast Variables
  • Using Accumulators
  • Exporting DataFrame Contents
  • Custom Accumulators
  • Join Operations
  • Exercise: Data Analysis Using the DataFrame API

Accessing Data with Spark: Data Analysis using Spark SQL

Course: 55 Minutes

  • Course Overview
  • The Spark Catalyst Optimizer
  • Introduction to Spark SQL
  • Preparing Data for Analysis
  • Running SQL Queries
  • Inferred and Explicit Schemas
  • Windowing in Spark
  • Applying Window Functions
  • Exercise: Data Analysis Using Spark SQL

Big Data Development with Apache Spark (5 hours+)

Introduction to Apache Spark

Course: 1 Hour, 2 Minutes

  • Course Introduction
  • Overview of Apache Spark
  • Downloading and Installing Apache Spark
  • Downloading and Installing Apache Spark on Mac OS
  • Building Spark
  • Working with Spark Shell
  • Linking to Spark
  • Spark Configuration
  • Initializing Apache Spark
  • Running Spark on Clusters

Apache Spark SQL

Course: 1 Hour, 10 Minutes

  • Course Introduction
  • Apache Spark SQL Overview
  • SparkSession
  • DataFrames
  • Aggregations
  • SQL Queries
  • Temporary View
  • Datasets
  • JSON Datasets
  • Load/Save Functions
  • Specifying a Data Source
  • Querying with SQL
  • SaveMode
  • Parquet Files
  • Persistent Tables
  • Partitioning

Structured Streaming

Course: 1 Hour, 13 Minutes

  • Course Introduction
  • Structured Streaming Overview
  • Stream Input
  • Stream Output
  • Windowing
  • Continuous Applications
  • Deduplication
  • File Sinks
  • Streaming Query
  • Streaming Query Manager
  • Checkpointing
  • Word Count

Spark Monitoring and Tuning

Course: 59 Minutes

Monitoring Spark Applications

Course: 17 Minutes

  • Course Introduction
  • Web UI
  • Environment Configuration
  • REST API
  • Memory Allocation

Tuning Spark Applications

Course: 38 Minutes

  • Speculation
  • Serialization
  • Memory Tuning
  • Executor Memory
  • Garbage Collection Tuning
  • Parallelism
  • Broadcast Functionality
  • Explain Query Execution
  • Data Compression

Practice: Monitoring Spark Applications

Course: 4 Minutes

  • Exercise: Monitor Spark Applications

Spark Security

Course: 36 Minutes

  • Course Introduction
  • Spark UI
  • Secure Event Logs
  • SSL Settings
  • Shared Secret
  • YARN Deployments
  • SASL Encryption
  • Network Security

Practice: Configuring Spark Security

Course: 3 Minutes

  • Exercise: Configure Spark Security

Practice Lab: 
Developing with Apache Spark (5 hours)

Practice developing with Apache Spark by performing tasks with Spark SQL, Spark Streaming, and GraphX. Then create a classification system using MLlib and work with MLlib Regression.

Apache Hadoop
Apache Hadoop is an open-source framework for the storage and processing of big data.

Courses

Getting Started with Hadoop (5 hours+)

Working with Hadoop HDFS (3 hours+)

Hadoop HDFS: Introduction

Course: 1 Hour, 15 Minutes

  • Course Overview
  • Scaling Datasets
  • Horizontal Scaling for Big Data
  • Distributed Clusters and Horizontal Scaling
  • Overview of HDFS
  • HDFS Architectures
  • MapReduce for HDFS
  • YARN for HDFS
  • The Mechanism of Resource Allocation in Hadoop
  • Apache Zookeeper for HDFS
  • The Hadoop Ecosystem
  • Exercise: An Introduction to HDFS

Hadoop HDFS: Introduction to the Shell

Course: 53 Minutes

  • Course Overview
  • Creating a Hadoop Cluster on the Google Cloud
  • Exploring Hadoop Clusters
  • The YARN Cluster Manager UI
  • The HDFS NameNode UIs
  • Browsing the Packaged Hadoop Tools
  • Configuring HDFS
  • The HDFS Shells
  • Exercise: Introduction to the HDFS Shell

Hadoop HDFS: Working with Files

Course: 48 Minutes

  • Course Overview
  • Basic Directory Commands in HDFS
  • Using the copyFromLocal Command in HDFS
  • Using the put Command in HDFS
  • Using the copyToLocal Command in HDFS
  • Retrieving files from HDFS
  • Append and Delete Operations in HDFS
  • Exercise: Working with Files on HDFS

Hadoop HDFS: File Permissions

Course: 49 Minutes

  • Course Overview
  • The HDFS count and du Commands
  • Viewing and Setting File Permissions in HDFS
  • Applying Permissions Recursively in HDFS
  • An Introduction to Bash Scripting
  • Scripting HDFS Operations
  • Exploring the HDFS NameNode UI
  • Cleanup Operations in HDFS

Data Warehousing with Hadoop (4 hours+)

Data Warehousing with Hadoop: Managing Big Data Using HDInsight Hadoop

Course: 1 Hour, 6 Minutes

  • Features of HDInsight
  • Fundamentals and Types of Clusters in HDInsight
  • Essential Open-source Components of HDInsight
  • Setting Up Hadoop Clusters on Azure HDInsight
  • HDInsight Clusters with Resource Manager Template
  • HDInsight Services and Storage Types
  • Azure Management Console
  • Creating and Managing HDInsight Clusters
  • Setting Up HDInsight Emulator
  • Programming in HDInsight
  • Developing and Executing MapReduce Program
  • Exercise: Working with HDInsight and MapReduce

Data Warehousing with Hadoop: Microsoft Analytics Platform System and Hive

Course: 1 Hour, 29 Minutes

  • Microsoft Analytics Platform System
  • Understanding PolyBase
  • Parallel Data Warehouse Architecture
  • Data Exploration Architectures
  • Hive Introduction
  • Hive Architecture in HDInsight
  • Setting up the Development Environment for Hive
  • Connect and Submit Queries
  • Hive QL
  • Using Azure PowerShell and Beeline
  • Creating a Database and Tables and Loading Data
  • Partition Tables and Data Formats
  • Hue Installation and Hive Query Management
  • Using Microsoft BI and Hive
  • Hive as ETL
  • HBase and Hive
  • Exercise: Creating and Loading Data into Hive Tables

Data Warehousing with Hadoop: HDInsight and Retail Sales Implementation Using Hive

Course: 46 Minutes

  • Data Modeling
  • Dimensional Design Process
  • Dimensional Design Steps
  • Retail Business Use Cases
  • Dimension Tables
  • Fact Tables
  • Data Loading in Dimension and Fact Tables
  • Essential Queries
  • Creating and Executing Queries
  • Hive and Power BI for Visualization

Data Warehousing with Hadoop: Spark, HDInsight and Cluster Management

Course: 56 Minutes

  • Spark Introduction
  • Data Representation in Spark
  • Create Spark Clusters Using PowerShell
  • Spark SQL and Hive
  • Spark SQL Data Sources and DataFrames
  • Customizing HDInsight Cluster
  • Application Installation on HDInsight
  • Ambari User Management
  • HDInsight Management Using Azure CLI
  • Troubleshooting HDInsight
  • Monitoring HDInsight Hadoop
  • Exercise: Working with Spark and Ambari
Language: English
Qualifications of the Instructor: Certified
Course Format and Length: Teaching videos with subtitles, interactive elements, assignments, and tests
Lesson duration: 25 Hours
Assessments: The assessment tests your knowledge and application skills of the topics in the learning pathway. It is available for 365 days after activation.
Online virtual labs: Receive 12 months of access to virtual labs corresponding to the traditional course configuration. Active for 365 days after activation; availability varies by training.
Online mentor: You will have 24/7 access to an online mentor for all your specific technical questions on the study topic. The online mentor is available for 365 days after activation, depending on the chosen Learning Kit.
Progress monitoring: Yes
Access to Material: 365 days
Technical Requirements: Computer or mobile device, a stable internet connection, and a web browser such as Chrome, Firefox, Safari, or Edge.
Support or Assistance: Helpdesk and online knowledge base, 24/7
Certification: Certificate of participation in PDF format
Price and costs: Course price at no extra cost
Cancellation policy and money-back guarantee: We assess this on a case-by-case basis
Award Winning E-learning: Yes
Tip! Provide a quiet learning environment, time and motivation, audio equipment such as headphones or speakers, and account information such as login details to access the e-learning platform.

There are no reviews written yet about this product.

OEM Office Elearning Menu Nominated for 'Best Training Provider in the Netherlands'

OEM Office Elearning Menu is proud to be nominated for the title 'Best Training Provider in the Netherlands' by Springest, part of Archipel. This recognition confirms our quality and dedication. Many thanks to all our students.

25,000+

Participants trained

Springest: 9.1 - Edubookers 8.9

Average rating

3,500+

Companies trained

20+

Years of experience
