This is a follow-up to my post from last year, Apache Zeppelin on OSX – Ultra Quick Start, but without building from source. Today I tested the latest version of Zeppelin (0.5.6) and, using their distributed binaries, was instantly able to launch Zeppelin and run both Scala and Python jobs on my MacBook. This was with zero configuration, […]
You are browsing archives for
Tag: scala
Spark Analysis of Global Place Names (GeoNames)
Spark Analysis on a Large File: GeoNames.org has free gazetteer data by country or for the world, provided in tab-separated text files. In this post I show you how to do some simple analysis using DataFrames in Spark. The global file is 280 MB compressed and 1.2 GB uncompressed. This size of file makes it difficult to […]
Common Zeppelin Errors
A few different errors have popped up during my initiation into Apache Zeppelin; here are a few of them, summarised with workarounds if you need them. Tutorial Failure Due To Spark Versions: the default Zeppelin comes with Spark 1.1 (though it may be updated by the time you read this). The current Zeppelin tutorial assumes Spark 1.3 or greater […]