Download the corresponding packages, place them in the appropriate directory on Linux, then configure the environment variables. The configuration file is as follows:
vim ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH

# java setting
export JAVA_HOME=/home/handoop/app/jdk1.8.0_91
export PATH=$JAVA_HOME/bin:$PATH

# scala setting
export SCALA_HOME=/home/handoop/app/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH

# hadoop setting
export HADOOP_HOME=/home/handoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$PATH

# maven setting
export MAVEN_HOME=/home/handoop/app/apache-maven-3.3.9
export PATH=$MAVEN_HOME/bin:$PATH

# spark setting
export SPARK_HOME=/home/handoop/app/spark-2.3.0-bin-2.6.0-cdh5.7.0
export PATH=$SPARK_HOME/bin:$PATH

# python used when launching pyspark
export PYSPARK_PYTHON=python3

# make pyspark importable from plain .py code
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.6-src.zip
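The last PYTHONPATH line is what lets a plain python3 interpreter import pyspark: it exposes both Spark's python directory and the bundled py4j zip. A minimal sketch of the two entries it contributes (the helper name is illustrative, and the SPARK_HOME value is the one from the profile above):

```python
import os

# Hypothetical helper: compute the two sys.path entries that the
# PYTHONPATH line in .bash_profile contributes for a Spark install.
def spark_python_paths(spark_home):
    return [
        os.path.join(spark_home, "python"),                                # pyspark package
        os.path.join(spark_home, "python", "lib", "py4j-0.10.6-src.zip"),  # bundled py4j
    ]

paths = spark_python_paths("/home/handoop/app/spark-2.3.0-bin-2.6.0-cdh5.7.0")
for p in paths:
    print(p)
```

Note the py4j version in the zip name must match the one shipped under $SPARK_HOME/python/lib, so check that directory if the import fails.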
Using Spark with PyCharm on Windows
Configure environment variables:
1. JAVA_HOME:
JAVA_HOME: C:\Program Files\Java\jdk1.8.0_321 (Java must be 1.8 or above; I ran into problems using 1.5)
System variables:
PATH:
C:\Program Files\Java\jdk1.8.0_321\bin
E:\WooPython\Python_Spark\hadoop-2.6.0\bin
HADOOP_HOME:E:\WooPython\Python_Spark\hadoop-2.6.0
In PyCharm's Edit Configurations dialog, set Environment variables:
PYTHONUNBUFFERED=1;SPARK_HOME=E:\WooPython\Python_Spark\spark-2.3.0-bin-2.6.0-cdh5.7.0;PYTHONPATH=E:\WooPython\Python_Spark\spark-2.3.0-bin-2.6.0-cdh5.7.0\python
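If you would rather not depend on the PyCharm run configuration, the same variables can be set at the top of the script before pyspark is imported. A sketch, with the install locations above hard-coded as an assumption:

```python
import os
import sys

# Assumed install locations, matching the PyCharm settings above.
SPARK_HOME = r"E:\WooPython\Python_Spark\spark-2.3.0-bin-2.6.0-cdh5.7.0"
HADOOP_HOME = r"E:\WooPython\Python_Spark\hadoop-2.6.0"

# Equivalent of the Environment variables field in Edit Configurations.
os.environ["SPARK_HOME"] = SPARK_HOME
os.environ["HADOOP_HOME"] = HADOOP_HOME
os.environ["PYSPARK_PYTHON"] = sys.executable

# Equivalent of the PYTHONPATH entry: make pyspark importable.
sys.path.insert(0, SPARK_HOME + r"\python")

print(os.environ["SPARK_HOME"])
```

This keeps the project runnable from a plain terminal as well, not only from inside PyCharm.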
In File / Settings / Project Structure, use Add Content Root to add:
E:\WooPython\Python_Spark\spark-2.3.0-bin-2.6.0-cdh5.7.0\python\lib\py4j-0.10.6-src.zip
E:\WooPython\Python_Spark\spark-2.3.0-bin-2.6.0-cdh5.7.0\python\lib\pyspark.zip
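Once everything above is in place, a quick way to check the setup is a tiny local job. The snippet below is a sketch that skips itself when pyspark is not importable, so it also runs harmlessly before the paths are configured:

```python
# Guarded smoke test: only exercises Spark when pyspark can be imported.
try:
    from pyspark import SparkContext
except ImportError:
    SparkContext = None

if SparkContext is not None:
    sc = SparkContext("local[2]", "setup-smoke-test")
    total = sc.parallelize(range(10)).sum()  # 0 + 1 + ... + 9 = 45
    print(total)
    sc.stop()
else:
    print("pyspark not importable yet - check SPARK_HOME/PYTHONPATH")
```

If the import fails inside PyCharm, re-check the Environment variables in Edit Configurations and the two content roots added above.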