As of CentOS 7, the default Python version remains Python 2.7, and Python 3 is not available in the base repositories. If you need Python 3 for a PySpark application, there are several ways to install it on CentOS.
Method 1) Build and Install Python3 from the Source
Step 1) Install the following packages:
sudo yum install -y gcc python-devel yum-utils
Step 2) Then use yum-builddep to set up the build environment for Python and install any missing build dependencies:
sudo yum-builddep python
Step 3) Download the Python 3 source tarball (e.g., Python 3.5) from https://www.python.org/ftp/python/
wget https://www.python.org/ftp/python/3.5.0/Python-3.5.0.tgz
Step 4) Build and install Python 3. The default installation directory is /usr/local; if you want to install somewhere else, pass "--prefix=/alternative/path" to configure before running make.
tar xf Python-3.5.0.tgz
cd Python-3.5.0
./configure    # add --prefix=/alternative/path here to change the install directory
make
sudo make install
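If you would rather not shadow the system Python binaries under /usr/local, a common variant is to pick a dedicated prefix and run "make altinstall" instead of "make install"; altinstall installs only version-suffixed binaries such as python3.5, leaving any existing python3 symlink untouched. A sketch (the /opt/python3.5 prefix is a hypothetical example, not from the original post):

```shell
# Build as before, but install into a dedicated, hypothetical prefix.
./configure --prefix=/opt/python3.5
make
# altinstall creates python3.5 / pip3.5 but no plain "python3" symlink,
# so nothing already on the system gets overwritten.
sudo make altinstall
```

With this layout, point PYSPARK_PYTHON at /opt/python3.5/bin/python3.5 in the Spark configuration below instead of /usr/local/bin/python3.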
Step 5) This installs python3, pip3, and setuptools, along with the Python 3 standard library, on your CentOS system. Check the installed version to validate:
python3 --version
Python 3.5.0
Now that your Python 3.5 installation is complete, you can configure Spark 2.1.0 to use Python 3.5.
Add the following configuration to $SPARK_HOME/conf/spark-env.sh:
export PYSPARK_PYTHON=/usr/local/bin/python3
export PYSPARK_DRIVER_PYTHON=python3
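Before launching pyspark, a quick sanity check can confirm that the interpreter PYSPARK_PYTHON points at actually exists and is a Python 3.x build. This is a minimal sketch; the default path assumes the /usr/local install location from Step 4:

```shell
# Sanity check: verify the interpreter Spark will use exists and is Python 3.
# /usr/local/bin/python3 is the default install location from Step 4.
PYSPARK_PYTHON="${PYSPARK_PYTHON:-/usr/local/bin/python3}"
if command -v "$PYSPARK_PYTHON" >/dev/null 2>&1; then
    # Python <3.4 printed --version to stderr, hence the 2>&1.
    version=$("$PYSPARK_PYTHON" --version 2>&1)
    case "$version" in
        "Python 3"*) echo "OK: $version" ;;
        *)           echo "WARNING: $PYSPARK_PYTHON is not Python 3 ($version)" ;;
    esac
else
    echo "ERROR: $PYSPARK_PYTHON not found" >&2
fi
```

If the check prints a warning or error, fix the PYSPARK_PYTHON path in spark-env.sh before starting the shell; otherwise pyspark will fail at startup or silently fall back to Python 2.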
Open the pyspark shell to validate
/opt/mapr/spark/spark-2.1.0/bin/pyspark
Python 3.5.0 (default, Oct 26 2017, 14:40:09)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.0-mapr-1707
      /_/

Using Python version 3.5.0 (default, Oct 26 2017 14:40:09)
SparkSession available as 'spark'.
>>>