Item number: 107752732

Apache Spark Advanced Topics Training

€149,00 (€180,29 incl. tax)

Order the Apache Spark Advanced Topics e-learning course online and get 1 year of 24/7 access to rich interactive videos and tests.

Discounts:
  • Buy 2 for €146,02 each and save 2%
  • Buy 3 for €144,53 each and save 3%
  • Buy 5 for €138,57 each and save 7%
  • Buy 10 for €134,10 each and save 10%
  • Buy 25 for €126,65 each and save 15%
  • Buy 50 for €116,22 each and save 22%
  • Buy 100 for €104,30 each and save 30%
  • Buy 200 for €74,50 each and save 50%
Availability:
In stock
Delivery time:
Order before 5 p.m. and start today.
  • Award Winning E-learning
  • Lowest price guarantee
  • Personalized service by our expert team
  • Pay safely online or by invoice
  • Order and start within 24 hours

Apache Spark Advanced Topics E-Learning

Order the Apache Spark Advanced Topics e-learning course online and get 1 year of 24/7 access to rich interactive videos with voice-over, progress monitoring through reports, and tests.
Explore Apache Spark, the open-source cluster computing framework that provides a fault-tolerant programming interface for clusters.

Apache Spark is an open-source big data processing framework built around speed, ease of use, and sophisticated analytics. In this learning path, you will learn about the more advanced features of Spark Core, Spark Streaming, Spark SQL, MLlib, GraphX, and SparkR.

Course content

Spark RDDs

Course: 1 Hour, 13 Minutes

  • Course Introduction
  • Review of Spark Stack
  • Defining Lazy Evaluation
  • Examining RDD Lineage
  • Pre-partitioning RDDs
  • Storing RDDs in Serialized Form
  • Performing Numeric Operations
  • Creating Custom Accumulators
  • Optimizing Broadcasts
  • Piping to External Applications
  • Tuning Garbage Collection
  • Performing Batch Importing
  • Determining Memory Consumption
  • Tuning Data Structures
  • Minimizing Memory Usage of Reduce Tasks
  • Setting the Levels of Parallelism

DataFrames and Spark SQL

Course: 43 Minutes

  • Creating DataFrames
  • Interoperating with RDDs
  • Examining the Load and Save Functions
  • Reading and Writing Parquet Files
  • Using JSON Dataset as a DataFrame
  • Reading and Writing Data in Hive Tables
  • Reading and Writing Data Using JDBC
  • Running Thrift JDBC/ODBC Server

Practice: Tuning Spark

Course: 9 Minutes

  • Exercise: Tuning Spark

Streaming Analytics

Course: 54 Minutes

  • Course Introduction
  • Examining Discretized Streams
  • Ingesting TCP Socket Input Streams
  • Reading File Input Streams
  • Receiving Akka Actor Input Streams
  • Consuming Kafka Input Streams
  • Ingesting Flume Input Streams
  • Setting Up Kinesis Input Streams
  • Configuring Twitter Input Streams
  • Implementing Custom Input Streams
  • Describing Receiver Reliability

Transformations on DStreams

Course: 1 Hour, 19 Minutes

  • Using UpdateStateByKey Operations
  • Performing Transform Operations
  • Performing Window Operations
  • Performing Join Operations
  • Using Output Operations on DStreams
  • Using DataFrames and SQL Operations
  • Using Learning Algorithms with MLlib
  • Persisting Stream Data in Memory
  • Enabling and Configuring Checkpointing
  • Deploying Applications
  • Monitoring Applications
  • Reducing Batch Processing Times

Performance Tuning

Course: 19 Minutes

  • Setting Batch Intervals
  • Tuning Memory Usage
  • Examining the Semantics of Fault Tolerance

Practice: Transformations on DStreams

Course: 6 Minutes

  • Exercise: Perform Transformations on DStreams

Machine Learning with MLlib

Course: 1 Hour, 12 Minutes

  • Course Introduction
  • Describing Data Types
  • Examining Basic Statistics
  • Exploring Linear SVMs
  • Performing Logistic Regression
  • Using Naive Bayes
  • Creating Decision Trees
  • Using Collaborative Filtering with ALS
  • Clustering with K-means
  • Clustering with Latent Dirichlet Allocation (LDA)
  • Analyzing with Frequent Pattern Mining

GraphX

Course: 57 Minutes

  • Examining the Property Graph
  • Exploring the Graph Operators
  • Performing Analytics with Neighborhood Aggregation
  • Messaging with Pregel API
  • Building Graphs
  • Examining Vertex and Edge RDDs
  • Optimizing Representation Through Partitioning
  • Measuring Vertices with PageRank

R and Spark

Course: 37 Minutes

  • Installing SparkR
  • Running SparkR
  • Using Existing R Packages
  • Exposing RDDs as Distributed Lists
  • Interoperating with DataFrames
  • Using Parquet Files
  • Running on a Cluster

Practice: Use MLlib

Course: 10 Minutes

  • Exercise: Use MLlib

Language: English
Qualifications of the Instructor: Certified
Course Format and Length: Teaching videos with subtitles, interactive elements, assignments, and tests
Lesson duration: 7:42 hours
Progress monitoring: Yes
Access to Material: 365 days
Technical Requirements: Computer or mobile device, stable internet connection, and a web browser such as Chrome, Firefox, Safari, or Edge
Support or Assistance: Helpdesk and online knowledge base, 24/7
Certification: Certificate of participation in PDF format
Price and costs: Course price at no extra cost
Cancellation policy and money-back guarantee: We assess this on a case-by-case basis
Award Winning E-learning: Yes
Tip: Provide a quiet learning environment, time and motivation, audio equipment such as headphones or speakers, and account information such as login details to access the e-learning platform.

There are no reviews written yet about this product.

OEM Office Elearning Menu Nominated for 'Best Training Provider in the Netherlands'

OEM Office Elearning Menu is proud to be nominated for the title 'Best Training Provider in the Netherlands' by Springest, part of Archipel. This recognition confirms our quality and dedication. Many thanks to all our students.

Springest: 9.1 - Edubookers 9.0
