龙空技术网

Airflow, a powerful ETL scheduler: installing Airflow on CentOS 7

IT知识小课堂

Preface:


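In the previous article we installed Redis as the message queue; this time we install Airflow itself.

This article installs Airflow into a Python virtual environment. The steps are the same without one; I use a virtual environment only because I don't want to disturb the system environment.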

Install the Python virtual environment tool

pip install virtualenv

Set the environment variables

sudo vi /etc/profile

Append the following to the end of the file

export PYTHON_HOME=/usr/local/python3

export PATH=$PATH:$PYTHON_HOME/bin

source /etc/profile

Create the directory that holds the virtual environment

mkdir -p /softwares/pyenv_for_airflow

cd /softwares/pyenv_for_airflow/

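Create the Python virtual environment

# --no-site-packages has been the default since virtualenv 1.7 and was removed in virtualenv 20; drop the flag on newer versions

virtualenv --no-site-packages airflow_env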

Grant execute permission

chmod +x -R *

Activate the virtual environment

cd airflow_env/bin

source ./activate
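
To confirm that the activation took effect, a quick check from Python can help. This is a minimal sketch and not part of the original steps; the helper name `in_virtualenv` is hypothetical. It relies on the fact that inside a virtual environment `sys.prefix` differs from the base interpreter prefix (older virtualenv releases expose the base as `sys.real_prefix`).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # keep the base interpreter location in sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If the first line prints False after `source ./activate`, the shell is probably still picking up the system interpreter.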

Install the dependencies

yum -y install gcc zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel

yum -y install python-devel mysql-devel

yum -y install python3-devel

yum -y install cyrus-sasl cyrus-sasl-devel cyrus-sasl-lib

pip install paramiko

pip install pymysql

pip install sqlalchemy

vi /etc/profile

export AIRFLOW_HOME=/softwares/airflow

export SLUGIFY_USES_TEXT_UNIDECODE=yes

# take effect immediately

source /etc/profile
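
Whether the new variable is actually visible to Python can be checked with a few lines. This is a sketch under one assumption: that Airflow falls back to `~/airflow` when `AIRFLOW_HOME` is unset. The function name `resolve_airflow_home` is hypothetical.

```python
import os


def resolve_airflow_home(environ=None) -> str:
    """Return the directory Airflow would use as its home."""
    env = os.environ if environ is None else environ
    # Assumed fallback: Airflow uses ~/airflow when AIRFLOW_HOME is not set.
    return os.path.expanduser(env.get("AIRFLOW_HOME", "~/airflow"))


if __name__ == "__main__":
    home = resolve_airflow_home()
    print("AIRFLOW_HOME resolves to:", home)
    print("directory exists:", os.path.isdir(home))
```

With the profile settings above sourced, this should report /softwares/airflow; the directory itself only appears after the database has been initialized.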

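Install Airflow with the all extra (full installation)

pip install 'apache-airflow[all]'

# I chose the full installation because when I tried installing only some of the extras, some features ran into bugs.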

Initialize the database

cd /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin

./airflow initdb

Check the files it generated

cd /softwares/airflow/

Create the MySQL backend database

create database airflow_db default charset utf8 collate utf8_general_ci;
create user 'airflow'@'%' identified by 'airflow_db';
create user 'airflow'@'localhost' identified by 'airflow_db';
grant all on airflow_db.* to 'airflow'@'%';
flush privileges;

-- alternative: the same database with the utf8mb4 character set
create database airflow_db default charset utf8mb4 collate utf8mb4_unicode_ci;
create user 'airflow'@'%' identified by 'airflow_db';
create user 'airflow'@'localhost' identified by 'airflow_db';
grant all on airflow_db.* to 'airflow'@'%';
flush privileges;

Configure Airflow to use the LocalExecutor and the MySQL database

vi /softwares/airflow/airflow.cfg

executor = LocalExecutor

sql_alchemy_conn = mysql://root:123456@airflow.mn01:3306/airflow_db

[webserver]

base_url =

web_server_port = 8085

Adjust the time zone

default_timezone = Asia/Shanghai
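
After editing, it can be useful to read the values back rather than trust the editor session. The sketch below parses a sample fragment with Python's standard `configparser`; the fragment mirrors the settings above and assumes that `executor`, `sql_alchemy_conn` and `default_timezone` sit in the `[core]` section. The function name `summarize` is hypothetical.

```python
import configparser
from urllib.parse import urlsplit

# Sample fragment mirroring the settings above; a real airflow.cfg has many more keys.
SAMPLE_CFG = """
[core]
executor = LocalExecutor
sql_alchemy_conn = mysql://root:123456@airflow.mn01:3306/airflow_db
default_timezone = Asia/Shanghai

[webserver]
web_server_port = 8085
"""


def summarize(cfg_text: str) -> dict:
    """Parse airflow.cfg-style text and return the settings this article changes."""
    # interpolation=None keeps literal % characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    url = urlsplit(parser.get("core", "sql_alchemy_conn"))
    return {
        "executor": parser.get("core", "executor"),
        "db_scheme": url.scheme,
        "db_host": url.hostname,
        "db_port": url.port,
        "db_name": url.path.lstrip("/"),
        "timezone": parser.get("core", "default_timezone"),
        "web_port": parser.getint("webserver", "web_server_port"),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_CFG).items():
        print(f"{key}: {value}")
```

To check the real file, the text of /softwares/airflow/airflow.cfg could be passed to `summarize` instead of the sample.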

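Three more files need to be modified

# 1. Change the time shown in the upper-right corner of the webserver page:

# with the virtual environment used above, site-packages is under /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/ rather than ${PYTHON_HOME}/lib/python3.7/

vi ${PYTHON_HOME}/lib/python3.7/site-packages/airflow/www/templates/admin/master.html

Change

var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
$("#clock").clock({
    "dateFormat":"Y-m-d ",
    "timeFormat":"H:i:s %UTC%",
    "timestamp":UTCseconds
}).click(function(){
    alert('{{ hostname }}');
});

to:

var UTCseconds = x.getTime();
$("#clock").clock({
    "dateFormat":"Y-m-d ",
    "timeFormat":"H:i:s",
    "timestamp":UTCseconds
}).click(function(){
    alert('{{ hostname }}');
});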

# 2. Modify airflow/utils/timezone.py

# Add the following below the line utc = pendulum.timezone('UTC') (line 27):
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass

# Modify the utcnow() function (line 69):
# d = dt.datetime.utcnow()
d = dt.datetime.now()

# 3. Modify airflow/utils/sqlalchemy.py

# Add the following below the line utc = pendulum.timezone('UTC') (line 37):
from airflow import configuration as conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
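
What these patches change is only how an instant is rendered, not the instant itself. The sketch below illustrates that with the standard library alone; it assumes a fixed UTC+8 offset as a stand-in for Asia/Shanghai (it does not use pendulum or Airflow), and the name `utc_and_local` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset standing in for Asia/Shanghai (assumption: no daylight saving time).
SHANGHAI = timezone(timedelta(hours=8), "Asia/Shanghai")


def utc_and_local(moment=None):
    """Return the same instant as (UTC, Shanghai) timezone-aware datetimes."""
    utc_moment = moment if moment is not None else datetime.now(timezone.utc)
    return utc_moment, utc_moment.astimezone(SHANGHAI)


if __name__ == "__main__":
    utc_moment, local_moment = utc_and_local()
    print("UTC:     ", utc_moment.strftime("%Y-%m-%d %H:%M:%S"))
    print("Shanghai:", local_moment.strftime("%Y-%m-%d %H:%M:%S"))
```

The two printed wall-clock times differ by eight hours, which is the gap the web page clock shows before the template is changed.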

Re-initialize the database

./airflow initdb

Start the service

cd /softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin

./airflow webserver -D

Possible errors

Error 1: startup may fail with FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn', meaning gunicorn cannot be found. When airflow webserver starts, it calls subprocess.Popen to create a child process; the webserver runs on gunicorn with these startup arguments:

['gunicorn', '-w', '4', '-k', 'sync', '-t', '120', '-b', '0.0.0.0:8080', '-n', 'airflow-webserver', '-p', '/home/admin/airflow/airflow-webserver.pid', '-c', 'airflow.…', '--access-logfile', '-', '--error-logfile', '-', 'airflow.…']

The gunicorn launch fails because the command is not on PATH. Either create a symbolic link to gunicorn:

ln -fs /home/admin/python3.6/bin/gunicorn/bin/gunicorn /bin/gunicorn

or add /usr/local/python3/bin to PATH:

export PATH=$PATH:/usr/local/python3/bin

# take effect immediately

source /etc/profile

Error 2: the webserver may still fail to start. Check the err log; it usually reports that a pid already exists, in which case delete the airflow-webserver-monitor.pid file in the airflow directory.
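
Before starting the webserver, it is possible to check whether gunicorn can be found at all. The following is a minimal sketch built on the standard `shutil.which`; the helper name `find_on_path` is hypothetical, and the virtualenv bin directory is the one used in this article.

```python
import os
import shutil


def find_on_path(command: str, extra_dirs=()):
    """Return the full path of `command`, searching PATH plus any extra directories."""
    candidates = [os.environ.get("PATH", ""), *extra_dirs]
    search_path = os.pathsep.join(part for part in candidates if part)
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    # The virtualenv bin directory from this article, added as a fallback location.
    venv_bin = "/softwares/pyenv_for_airflow/airflow_env/bin"
    location = find_on_path("gunicorn", extra_dirs=[venv_bin])
    print("gunicorn found at:", location or "not found")
```

If gunicorn is found only through the extra directory, that directory is the one to add to PATH or to link from.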

Start the other services

./airflow scheduler -D

./airflow worker -D

# start flower

./airflow flower -D

The default port is 5555; open that port in a browser to reach flower and monitor the Celery message queue.

Set up the services to start at boot

# 1. Create the startup shell script

cd /softwares/

mkdir shellscripts

cd shellscripts/

touch startairflow.sh

vi startairflow.sh

#!/bin/bash
# chkconfig: 2345 10 90
# description: Airflow boot-time startup script

# A leftover pid file makes startup fail, so delete any existing pid files before starting the services
airflow_path="/softwares/airflow/"
airflow_webserver_monitor_name="airflow-webserver-monitor.pid"
airflow_webserver_pid_name="airflow-webserver.pid"
airflow_scheduler_pid_name="airflow-scheduler.pid"
airflow_worker_pid_name="airflow-worker.pid"

if [ -x "$airflow_path" ]; then
    echo "$airflow_path existed"
    cd "$airflow_path"
    if [ -f "$airflow_webserver_monitor_name" ]; then
        echo "$airflow_webserver_monitor_name existed, i can delete it"
        rm -rf "$airflow_webserver_monitor_name"
    fi
    if [ -f "$airflow_webserver_pid_name" ]; then
        echo "$airflow_webserver_pid_name existed, i can delete it"
        rm -rf "$airflow_webserver_pid_name"
    fi
    if [ -f "$airflow_scheduler_pid_name" ]; then
        echo "$airflow_scheduler_pid_name existed, i can delete it"
        rm -rf "$airflow_scheduler_pid_name"
    fi
    if [ -f "$airflow_worker_pid_name" ]; then
        echo "$airflow_worker_pid_name existed, i can delete it"
        rm -rf "$airflow_worker_pid_name"
    fi
fi

# enter the Python virtual environment
cd /softwares/pyenv_for_airflow/airflow_env/bin

# activate the virtual environment
source ./activate

# start the Airflow services
/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow webserver -D
/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow scheduler -D
# LocalExecutor mode does not need a worker
#/softwares/pyenv_for_airflow/airflow_env/lib/python3.7/site-packages/airflow/bin/airflow worker -D
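
The script above deletes every pid file unconditionally. A more cautious variant, sketched here in Python, removes a pid file only when the process it names is no longer running. This is an alternative under stated assumptions, not the author's method: it is Unix only, the function names are hypothetical, and the pid file names are the ones used in the script.

```python
import os
from pathlib import Path

# Pid file names taken from the startup script above.
PID_FILE_NAMES = (
    "airflow-webserver-monitor.pid",
    "airflow-webserver.pid",
    "airflow-scheduler.pid",
    "airflow-worker.pid",
)


def process_alive(pid: int) -> bool:
    """Check whether a process with this pid exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence check
    except (ProcessLookupError, OverflowError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_stale_pid_files(airflow_home: str) -> list:
    """Delete pid files whose process is gone; return the names that were removed."""
    removed = []
    for name in PID_FILE_NAMES:
        pid_file = Path(airflow_home) / name
        if not pid_file.is_file():
            continue
        try:
            pid = int(pid_file.read_text().strip())
        except ValueError:
            pid = None  # unreadable content counts as stale
        if pid is None or pid <= 0 or not process_alive(pid):
            pid_file.unlink()
            removed.append(name)
    return removed


if __name__ == "__main__":
    print("removed:", remove_stale_pid_files("/softwares/airflow"))
```

The trade-off is that a pid file pointing at a live process is left in place, so a service that is already running is not started a second time.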

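# 2. Copy the bash script to init.d

sudo cp startairflow.sh /etc/init.d/startairflow

# 3. Register it for automatic startup

# add execute permission

cd /etc/init.d/

sudo chmod +x startairflow

# enable automatic startup

sudo chkconfig startairflow on

# check that it was registered; runlevels 2345 showing on means it is set up correctly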

chkconfig --list

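Add the airflow command to the PATH system variable so you don't have to run it from the airflow bin directory every time

sudo vi /etc/profile

# append the following to the end of the file

export AIRFLOW_CLI_HOME=/usr/local/python3/lib/python3.7/site-packages/airflow/

export PATH=$PATH:$AIRFLOW_CLI_HOME/bin

# take effect immediately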

source /etc/profile
