Apache Spark Programming (Spark 105): 5 half-day Live-Online Public Class (Europe)

San Francisco, California
Monday, August 06, 2018
Databricks
Monday, August 06, 2018 9:00 AM to Friday, August 10, 2018 1:00 PM (GMT)

Databricks Inc.
160 Spear Street, 13th Floor
San Francisco, California 94105
United States

Overview

This online course runs as five half-day sessions, from Monday, August 06, to Friday, August 10, 2018, from 9:00am to 1:00pm BST each day.

This course is designed for data engineers, analysts, architects, software engineers, IT operations staff, and technical managers interested in a thorough, hands-on overview of Apache Spark. It covers the same material as our three-day Apache Spark Programming course.

The course covers the core APIs for using Spark, fundamental mechanisms and basic internals of the framework, SQL and other high-level data access tools, as well as Spark’s streaming capabilities and machine learning APIs.

Each topic includes slide and lecture content along with hands-on use of Spark through an elegant web-based notebook environment. Inspired by tools like IPython/Jupyter, notebooks allow attendees to code jobs, data analysis queries, and visualizations using their own Spark cluster, accessed through a web browser. All class code is directly usable with pure open-source Spark or any commercial Spark distribution.

Objectives

After taking this class you will be able to:

  • Describe Spark’s fundamental mechanics
  • Use the core Spark APIs to operate on data
  • Articulate and implement typical use cases for Spark
  • Build data pipelines with SparkSQL and DataFrames
  • Analyze Spark jobs using the UIs and logs
  • Create Streaming and Machine Learning jobs

Modules

  • Spark Overview
  • RDD Fundamentals
  • SparkSQL and DataFrames
  • Spark Job Execution
  • Cluster Architectures for Spark
  • Intro to Spark Streaming
  • Machine Learning Basics

Cost

$2500 per person. Refunds will not be issued except in the case of class cancellation.

Requirements

All participants will need a laptop with an up-to-date version of Chrome or Firefox (Internet Explorer and Safari are not supported) and an internet connection that can support GoToTraining, the platform on which the class will be delivered. More information can be found at: https://support.logmeininc.com/gotomeeting/get-ready. Prior to class, each registrant will receive GoToTraining log-in instructions.

Before registering, please confirm your computer can run GoToTraining at: https://support.logmeininc.com/gotomeeting/get-ready

About Databricks

Databricks’ vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 40,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a virtual analytics platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact info@databricks.com.

 
