Apache Superset

Straight from their site.

Superset is fast, lightweight, intuitive, and loaded with options that make it easy for users of all skill sets to explore and visualize their data, from simple line charts to highly detailed geospatial charts.

This installation is on Ubuntu 20.04 LTS. It is not meant for production use, but it will get you up and running for a demo. As you go through this guide, there will be references to how you turn this into a production environment.

The install document from Superset, Apache Superset Install, is pretty good, but a few things are missing and some errors have crept in along the way.

There is also an expectation that you understand the whole Python environment, including Python virtual environments. This can feel strange if you do not come from a Python background.

So let’s start. You should have a fresh installation of Ubuntu 20.04 LTS.

Running everything as root is bad practice, but I find it avoids some issues with software installation, especially when you are new to it. So, for now:

Elevate your user (controversial, I know)

sudo su
apt-get -y install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev default-libmysqlclient-dev libpq-dev python-is-python3 python3.8-venv
mkdir -p /opt/superset
cd /opt/superset

Now for the creation of the virtual environment. If you want to read more about this, see Creation of a virtual environment.

python3 -m venv venv 
. venv/bin/activate

Your command prompt should look something like this. Note the (venv) prefix: whenever you are doing any configuration, you need to see it on the command prompt.

(venv) root@superset-02:/opt/superset#
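If you want a check that does not rely on the prompt, the short Python sketch below reports whether the interpreter is running inside a virtual environment. It assumes the environment was created with python3 -m venv as above; the function name in_virtualenv is just an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when this interpreter is running inside a venv."""
    # Inside an environment created with `python3 -m venv`, sys.prefix points
    # at the venv directory while sys.base_prefix still points at the system
    # Python. Outside a venv the two values are identical.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run it with python3 while (venv) is showing. With the layout used in this guide, the prefix should be /opt/superset/venv.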

You may get an error that the command cannot be run. This will be down to python3.8-venv not being installed:

apt-get -y install python3.8-venv
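To see in advance whether the system Python can create virtual environments, a rough check is to look for the venv and ensurepip modules. This is a sketch resting on an assumption: as far as I know, Ubuntu ships ensurepip in the python3.8-venv package, so a missing module usually means that package is not installed. The function name is illustrative.

```python
import importlib.util


def venv_support_available() -> bool:
    """Return True when both modules needed by `python3 -m venv` are importable."""
    # Assumption: on Debian/Ubuntu the ensurepip module is packaged separately
    # (python3.8-venv on Ubuntu 20.04), so it can be absent even though the
    # rest of the standard library is present.
    required = ("venv", "ensurepip")
    return all(importlib.util.find_spec(name) is not None for name in required)


if __name__ == "__main__":
    if venv_support_available():
        print("venv support looks complete")
    else:
        print("venv support is incomplete; try installing python3.8-venv")
```

Run this with the system python3, before the virtual environment exists.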

I have done this install many times and found that installing wheel first stops some of the spurious messages about legacy systems and so on. I am not an expert in Python, so I am not sure this is strictly necessary, but I do like clean installs. (I think it actually is needed: when I installed mysqlclient using pip I got some errors. It did appear to install, but errors still appeared.)

pip3 install wheel

This is needed when you do a production install with MySQL. See Apache Superset – Production, as you need to install some bits first. Only install MySQL; do not do any configuration.

pip3 install mysqlclient

If you want to pip3 install mariadb, there are some issues. See Issue installing python Maridb connector for how to get around them.

pip3 install apache-superset --no-warn-script-location

You may encounter some errors.

ERROR – 1

Successfully built apache-superset cron-descriptor func-timeout holidays pgsanity python-geohash wtforms-json pymeeus
ERROR: flask-caching 2.0.1 has requirement cachelib>=0.9.0, but you’ll have cachelib 0.4.1 which is incompatible.

When trying to resolve this issue with the

Okay, I picked this suggestion up from a site and forgot to give credit, but when I find it again, I will tag it.

The main issue is when to run this. I have installed this many times now and I am still trying to figure out the correct order. Please note the error below when installing 2.0.3; I have included it so that people searching Google for this error can find it.

pip3 install --upgrade pip --no-warn-script-location
pip3 install werkzeug==2.0.3 --no-warn-script-location
pip3 install --upgrade pip --no-warn-script-location
pip3 install wtforms==2.3.0 --no-warn-script-location
pip3 install --upgrade pyopenssl --no-warn-script-location
pip3 install psycopg2-binary pillow gunicorn gevent --no-warn-script-location

Note that these versions could change over time. The error below occurred, and sometimes a package needs an upgrade. Please also note that you can uninstall a package as well.

e.g. pip3 install flask==2.0.3

This error happened

ERROR: pip’s dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
flask 2.2.2 requires Werkzeug>=2.2.2, but you have werkzeug 2.0.3 which is incompatible.
flask-caching 2.0.1 requires cachelib>=0.9.0, but you have cachelib 0.4.1 which is incompatible.
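When there are several of these conflict lines, it can help to pull out which package wants which version. The sketch below parses lines in the format shown above; the regular expression and the parse_conflicts name are my own illustration and may need adjusting if pip changes its wording.

```python
import re

# Matches lines such as:
#   flask 2.2.2 requires Werkzeug>=2.2.2, but you have werkzeug 2.0.3 which is incompatible.
CONFLICT = re.compile(
    r"^(?P<pkg>\S+) (?P<ver>\S+) requires (?P<req>.+?), "
    r"but you have (?P<have>\S+) (?P<have_ver>\S+) which is incompatible\.$"
)


def parse_conflicts(text):
    """Return (package, requirement, installed_version) for each conflict line."""
    conflicts = []
    for line in text.splitlines():
        match = CONFLICT.match(line.strip())
        if match:
            conflicts.append((match["pkg"], match["req"], match["have_ver"]))
    return conflicts


sample = """\
flask 2.2.2 requires Werkzeug>=2.2.2, but you have werkzeug 2.0.3 which is incompatible.
flask-caching 2.0.1 requires cachelib>=0.9.0, but you have cachelib 0.4.1 which is incompatible.
"""

for pkg, req, have in parse_conflicts(sample):
    print(f"{pkg} needs {req} (installed: {have})")
```

pip also has a built-in pip check command that reports incompatible installed packages, which is worth running after each round of pinning.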

So this is a question of uninstalling and reinstalling some bits:

pip3 install flask==2.2.2 --no-warn-script-location

pip3 install werkzeug==2.2.2 --no-warn-script-location

Please note that this is very contradictory to the production install. If you follow the production install notes, though, you will get this working: Apache Superset – Production

The notes below are for people who have encountered some of the issues I have faced and have used Google to search for the errors.

ERROR – 2

Running setup.py install for python-geohash … error
error: subprocess-exited-with-error

× Running setup.py install for python-geohash did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.8
copying geohash.py -> build/lib.linux-x86_64-3.8
copying quadtree.py -> build/lib.linux-x86_64-3.8
copying jpgrid.py -> build/lib.linux-x86_64-3.8
copying jpiarea.py -> build/lib.linux-x86_64-3.8
running build_ext
building ‘_geohash’ extension
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/src
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DPYTHON_MODULE=1 -I/opt/superset/venv/include -I/usr/include/python3.8 -c src/geohash.cpp -o build/temp.linux-x86_64-3.8/src/geohash.o
src/geohash.cpp:538:10: fatal error: Python.h: No such file or directory
538 | #include <Python.h>
| ^~~~~~
compilation terminated.
error: command ‘x86_64-linux-gnu-gcc’ failed with exit status 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> python-geohash

pip3 install wheel
pip3 install python-geohash --no-warn-script-location

There was still an error when trying to install this. It turned out that python3-dev had not been installed properly:

apt install python3-dev
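The "Python.h: No such file or directory" error means the compiler could not find the Python development headers. The sketch below asks Python where it expects its headers to live and checks for Python.h there; the function name python_header_path is illustrative only.

```python
import sysconfig
from pathlib import Path


def python_header_path() -> Path:
    """Return the location where this interpreter expects Python.h to be."""
    # sysconfig reports the include directory used when building C extensions,
    # for example /usr/include/python3.8 on the system in this guide.
    include_dir = sysconfig.get_paths()["include"]
    return Path(include_dir) / "Python.h"


if __name__ == "__main__":
    header = python_header_path()
    status = "found" if header.exists() else "missing (python3-dev is probably not installed)"
    print(f"{header}: {status}")
```

If the header is reported missing, installing python3-dev and retrying the python-geohash build is the first thing to try.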

Once you have everything installed and there are no errors, finish with:

pip3 install apache-superset --no-warn-script-location
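To confirm what actually ended up in the virtual environment, a small sketch like the following lists the installed versions of the packages that caused trouble above. It uses importlib.metadata from the standard library (available from Python 3.8); the list of package names is simply taken from the errors in this post.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Package names taken from the install steps and error messages in this guide.
for name in ("apache-superset", "flask", "werkzeug", "cachelib", "wtforms"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it with (venv) showing on the prompt, otherwise it reports on the system Python instead.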

Okay, now you have a decision to make. Continue with the Non Production instructions if you just want to play with Superset to see what it can do, or move to Production to get this ready to go.


			
