A few different errors have popped up during my initiation into Apache Zeppelin. Here are some of them, summarised with workarounds in case you need them.
Tutorial Failure Due To Spark Versions
Default Zeppelin comes with Spark 1.1 (though it may be updated by the time you read this). The current Zeppelin tutorial assumes Spark 1.3 or greater in one place.
If you’d rather not update to a newer Spark version (which you should really do), you can make a simple change to the tutorial instead, removing the use of the toDF() function.
The error you’d receive is:
:55: error: value toDF is not a member of org.apache.spark.rdd.RDD[Bank]
possible cause: maybe a semicolon is missing before 'value toDF'?
).toDF()
  ^
To work around it, change these lines in the tutorial:

val bank = bankText.map(...).toDF()
bank.registerTempTable("bank")

to:

val bank = bankText.map(...)
bank.registerTempTable("bank")
Python with Spark Failure
Running even a trivial PySpark paragraph fails with an import error:

%pyspark
print "test"

pyspark is not responding
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark.py", line 20, in
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway
You might think you just need the Py4J module for Python. That’s a start…
sudo pip install py4j
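To confirm the module ended up in the interpreter Zeppelin will actually use, a check along these lines may help. This is only a minimal sketch: it assumes Zeppelin launches the python found on your PATH, and py_has_module is a made-up helper name, not part of Zeppelin or Py4J.

```shell
#!/bin/sh
# Sketch only: assumes Zeppelin launches the `python` found on PATH.
# `py_has_module` is a hypothetical helper name.

# Succeeds when interpreter $1 can import module $2.
py_has_module() {
    "$1" -c "import $2" >/dev/null 2>&1
}

# Report whether the module from the traceback above can be imported.
if py_has_module python py4j.java_gateway; then
    echo "py4j is importable"
else
    echo "py4j is missing for this interpreter"
fi
```

If it reports the module as missing even after the pip install, pip probably installed into a different Python than the one on your PATH.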
But you’ll still get an error because pyspark requires Spark 1.2 or higher:
pyspark is not responding
Python 1.1.1 is not supported
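Both this failure and the toDF one come down to the Spark version, so it can be handy to check a version string against the two thresholds mentioned in this post. The following is a rough sketch under a few assumptions: the version is typed in by hand (running sc.version in a Scala paragraph should show what your notebook is using), sort -V from GNU coreutils is available, and version_ge and check_spark are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; `sort -V` is assumed available.

# Succeeds when dotted version $1 is greater than or equal to $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints whether pyspark and toDF() should work for Spark version $1,
# using the thresholds from this post (1.2 for pyspark, 1.3 for toDF).
check_spark() {
    spark_version="$1"
    if version_ge "$spark_version" 1.2; then
        echo "pyspark: ok"
    else
        echo "pyspark: needs Spark 1.2 or higher"
    fi
    if version_ge "$spark_version" 1.3; then
        echo "toDF: ok"
    else
        echo "toDF: needs Spark 1.3 or higher"
    fi
}

check_spark 1.1.1
```

With 1.1.1 as input, both lines report that a newer Spark is needed, which matches the two errors described so far.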
Download Spark 1.3, build or get binaries as desired. Then ALSO rebuild Zeppelin with the option -Pspark-1.3:
mvn clean install -DskipTests -Pspark-1.3
Restart Zeppelin, then edit the Interpreter settings for Spark and set the spark.home option to point to your specific Spark install.
Be sure to restart Zeppelin afterwards. Sometimes it may be necessary to restart it from the command line.
Build required sudo access to install bower
I have to confirm this again to get the error clearly recorded, but when I started trying to build, I got some error from bower:
Failed to run task: 'bower --allow-root install' failed
As the user account I was using was not set up for sudo access, I tried installing bower as root first, but that didn’t seem to help. Instead I gave my user sudo access and the problem went away. Yes, a bit of a hacky workaround, but I don’t think it will go unnoticed very long.
Rebuild Failure Due To Unapproved License Files
You’ve run Zeppelin via Scala and then gone to rebuild Zeppelin, but get this vague error:
[ERROR] Failed to execute goal org.apache.rat:apache-rat-plugin:0.11:check (verify.rat) on project zeppelin: Too many files with unapproved license: 3 See RAT report in: /home/demo/Downloads/incubator-zeppelin-master/target/rat.txt -> [Help 1]
More or less, this is because you have files in the build folder that the build process doesn’t like. If you haven’t done anything else to break it, then it’s probably the data folder which the tutorial notebook downloaded for you during testing. Simply remove it, then build and restart again.
rm -r data
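If you end up doing this before every rebuild, the cleanup can be wrapped in a small guard so it does not complain when the folder is already gone. This is a sketch only: remove_tutorial_data is a hypothetical helper name, and the argument is assumed to be the Zeppelin source directory you build from.

```shell
#!/bin/sh
# Sketch only: `remove_tutorial_data` is a hypothetical helper name.

# Removes the `data` folder under directory $1 if it exists,
# and reports what it did.
remove_tutorial_data() {
    if [ -d "$1/data" ]; then
        rm -r "$1/data"
        echo "removed $1/data"
    else
        echo "no data folder in $1"
    fi
}
```

From the source directory you would call remove_tutorial_data . and then run the same mvn clean install command as before.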
Thanks for the tip on the “Too many files with unapproved license” error. Just ran into that when trying to re-install Zeppelin with Spark 1.3. Tried your advice and it appears to be proceeding now! (at least it is past the point where I got that error)
Or you can re-run the mvn clean install with the skip rat parameter: “-Drat.skip=true”