This is a follow-up to my post from last year, Apache Zeppelin on OSX – Ultra Quick Start, but this time without building from source. Today I tested the latest version of Zeppelin (0.5.6) and, using the distributed binaries, was instantly able to launch Zeppelin and run both Scala and Python jobs on my MacBook. This was with zero configuration, […]
You are browsing archives for
Tag: python
Common Zeppelin Errors
A few different errors have popped up during my initiation into Apache Zeppelin. Here are a few of them, summarised with workarounds if you need them. Tutorial Failure Due To Spark Versions: Default Zeppelin comes with Spark 1.1 (though it may be updated by the time you read this). The current Zeppelin tutorial assumes Spark 1.3 or greater […]
Python Spark SQL – Zeppelin Tutorial – No Scala
My latest notebook aims to mimic the original Scala-based Spark SQL tutorial with one that uses Python instead. Above you can see the two parallel translations side-by-side. Python Spark SQL Tutorial Code: Here is the resulting Python data loading code. The SQL code is identical to the Tutorial notebook, so copy and paste if you need it. I […]
Apache Zeppelin on OSX – Ultra Quick Start
The Zeppelin project provides a powerful web-based notebook platform for data analysis and discovery. Behind the scenes it supports Spark distributed contexts as well as other language bindings on top of Spark. This post is a very simple introduction to show the first few steps to get started. You’ll find all you need to know […]
Kafka Consumer – Simple Python Script and Tips
[UPDATE: Check out the Kafka Web Console that allows you to manage topics and see traffic going through your topics – all in a browser!] When you’re pushing data into a Kafka topic, it’s always helpful to monitor the traffic using a simple Kafka consumer script. Here’s a simple script I’ve been using that […]
Kafka Topic Clearing after Producing Messages
[UPDATE: Check out the Kafka Web Console to more easily administer your Kafka topics] This week I’ve been working with the Kafka messaging system in a project. Basic C# Methods for Kafka Producer: To publish to Kafka I built a C# app that uses the Kafka4n libraries – it doesn’t get much simpler than this: using Kafka.Client; Connector […]