Apache Spark Programming (Spark 105): 3 day Instructor Led Public Class (Austin, TX)

Austin, Texas
Wednesday, January 23, 2019
Databricks
Apache Spark Programming (Spark 105): 3 day Instructor Led Public Class (Austin, TX)
Wednesday, January 23, 2019 9:00 AM -
Friday, January 25, 2019 5:00 PM (Central Time)

WeWork
1 (210) 984-1983
600 Congress Avenue
Austin, Texas 78701
United States

Map and Directions
Overview


This ​three-day, ​onsite instructor-led course ​​will ​be ​delivered ​on: ​Weds, January 23 - Friday January 25, 9am - 5pm CST, with an hour break each day for lunch.


Location: Austin, TX


This ​course ​is ​designed ​for ​data ​engineers, ​analysts, ​architects; ​software ​engineers; ​IT ​operations; ​and ​technical ​managers ​interested ​in ​a ​thorough, ​hands-on ​overview ​of ​Apache ​Spark. ​ ​This ​course ​covers ​the ​same ​material ​as ​our ​three-day ​Apache ​Spark ​Programming ​course. 

The ​course ​covers ​the ​core ​APIs ​for ​using ​Spark, ​fundamental ​mechanisms ​and ​basic ​internals ​of ​the ​framework, ​SQL ​and ​other ​high-level ​data ​access ​tools, ​as ​well ​as ​Spark’s ​streaming ​capabilities ​and ​machine ​learning ​APIs. 

Each ​topic ​includes ​slide ​and ​lecture ​content ​along ​with ​hands-on ​use ​of ​Spark ​through ​an ​elegant ​web-based ​notebook ​environment. ​Inspired ​by ​tools ​like ​IPython/Jupyter, ​notebooks ​allow ​attendees ​to ​code ​jobs, ​data ​analysis ​queries, ​and ​visualizations ​using ​their ​own ​Spark ​cluster, ​accessed ​through ​a ​web ​browser. ​All ​class ​code ​is ​directly ​usable ​with ​pure ​open-source ​Spark ​or ​any ​commercial ​Spark ​distribution. 

Objectives ​- ​after ​taking ​this ​class ​you ​will ​be ​able ​to: 

Describe ​Spark’s ​fundamental ​mechanics 
Use ​the ​core ​Spark ​APIs ​to ​operate ​on ​data 
Articulate ​and ​implement ​typical ​use ​cases ​for ​Spark 
Build ​data ​pipelines ​with ​SparkSQL ​and ​DataFrames 
Analyze ​Spark ​jobs ​using ​the ​UIs ​and ​logs 
Create ​Streaming ​and ​Machine ​Learning ​jobs 

Modules 

Spark ​Overview 
RDD ​Fundamentals 
SparkSQL ​and ​DataFrames 
Spark ​Job ​Execution 
Cluster ​Architectures ​for ​Spark 
Intro ​to ​Spark ​Streaming 
Machine ​Learning ​Basics 

Cost: ​$2500 ​per ​person 

Requirements 

All ​participants ​will ​need ​a ​laptop ​with ​updated ​versions ​of ​Chrome ​or ​Firefox ​(Internet ​Explorer ​and ​Safari ​are ​not ​supported) ​

About ​Databricks: 

Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Databricks’ founders started the Spark research project at UC Berkeley that later became Apache Spark. Databricks provides a Unified Analytics Platform powered by Apache Spark for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP. For more information, visit www.databricks.com.

Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.

 

Contact Information

© 2018
Quick, easy and affordable online event registration and event management software for all event sizes.