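[Cloud Native] Apache Livy on k8s: Explanation and Hands-On Practice

I. Overview

Livy is a service that exposes a REST interface for interacting with a Spark cluster. It can submit a Spark job or a snippet of Spark code and return the result synchronously or asynchronously. It also manages SparkContexts, either through the RESTful interface or through an RPC client library. Because applications only need to talk HTTP, Livy simplifies the interaction between Spark and application servers, which makes it possible to use Spark interactively from web and mobile applications.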
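To make the two usage modes concrete, here is a minimal sketch of the requests involved, using only the Python standard library. It builds the requests without sending them. The endpoint `http://localhost:8998`, the session id `0`, the jar path, and the class name are all placeholders assumed for illustration; `/sessions`, `/sessions/{id}/statements`, and `/batches` are the Livy REST endpoints for interactive sessions, code snippets, and batch jobs.

```python
import json
from urllib.request import Request

LIVY_URL = "http://localhost:8998"  # assumption: replace with your Livy endpoint


def livy_request(path, payload):
    """Build (but do not send) a JSON POST request for the Livy REST API."""
    return Request(
        url=f"{LIVY_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Interactive use: create a session, then run a code snippet in it.
# Session id 0 is an assumption; use the id returned by the create call.
create_session = livy_request("/sessions", {"kind": "pyspark"})
run_statement = livy_request(
    "/sessions/0/statements",
    {"code": "sc.parallelize(range(100)).sum()"},
)

# Batch use: submit a packaged application (hypothetical jar path and class).
submit_batch = livy_request(
    "/batches",
    {"file": "hdfs:///path/to/app.jar", "className": "com.example.Main"},
)

for req in (create_session, run_statement, submit_batch):
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing any of these objects to `urllib.request.urlopen` would actually send it; the sketch stops short of that so it can be read and run without a cluster.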


Official site: https://livy.incubator.apache.org/
GitHub: https://github.com/apache/incubator-livy
For more background on Apache Livy, see my other article: Spark Open-Source REST Service: Apache Livy (Spark Client)

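II. Orchestration and Deployment

1. Prepare the deployment package

A prebuilt Livy deployment package is also provided here for anyone who needs it:

Link: https://pan.baidu.com/s/1pPCbe0lUJ6ji8rvQYsVw9A?pwd=qn7i
Extraction code: qn7i

1) Build the image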

Dockerfile

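    FROM myharbor.com/bigdata/centos:7.9.2009

    # Set the container time zone
    RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone

    # ENV persists into the running container; "RUN export" would not
    ENV LANG=zh_CN.UTF-8

    ### install tools (unzip is needed because ADD does not extract .zip archives)
    RUN yum install -y vim tar wget curl less telnet net-tools lsof unzip

    RUN groupadd --system --gid=9999 admin && useradd --system -m -d /home/admin --uid=9999 --gid=admin admin

    RUN mkdir -p /opt/apache

    # ADD auto-extracts local tar archives only, so copy the zip and unpack it explicitly
    COPY apache-livy-0.8.0-incubating-SNAPSHOT-bin.zip /opt/apache/
    RUN cd /opt/apache && unzip -q apache-livy-0.8.0-incubating-SNAPSHOT-bin.zip && rm -f apache-livy-0.8.0-incubating-SNAPSHOT-bin.zip
    # /opt/apache/livy matches the conf mount paths and livy-env.sh used by the Helm chart
    ENV LIVY_HOME=/opt/apache/livy
    RUN ln -s /opt/apache/apache-livy-0.8.0-incubating-SNAPSHOT-bin $LIVY_HOME

    ADD hadoop-3.3.2.tar.gz /opt/apache/
    ENV HADOOP_HOME=/opt/apache/hadoop
    RUN ln -s /opt/apache/hadoop-3.3.2 $HADOOP_HOME
    ENV HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

    ADD spark-3.3.0-bin-hadoop3.tar.gz /opt/apache/
    ENV SPARK_HOME=/opt/apache/spark
    RUN ln -s /opt/apache/spark-3.3.0-bin-hadoop3 $SPARK_HOME

    ENV PATH=${LIVY_HOME}/bin:${HADOOP_HOME}/bin:${SPARK_HOME}/bin:$PATH

    RUN chown -R admin:admin /opt/apache

    WORKDIR $LIVY_HOME

    # Start the server, then follow its log so the container stays in the foreground
    ENTRYPOINT ${LIVY_HOME}/bin/livy-server start;tail -f ${LIVY_HOME}/logs/livy-root-server.out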

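[Note] The Hadoop package baked into the image must contain core-site.xml, hdfs-site.xml, and yarn-site.xml that match your cluster.

Build the image:

    docker build -t myharbor.com/bigdata/livy:0.8.0 . --no-cache

    ### Parameter notes
    # -t: name (and tag) of the image
    # . : build context; the Dockerfile in the current directory is used
    # -f: path to the Dockerfile, if it is not in the build context root
    # --no-cache: build without using the cache

    # Push to Harbor
    docker push myharbor.com/bigdata/livy:0.8.0

2) Create the livy chart template

    helm create livy

3) Modify the YAML manifests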

livy/values.yaml

    replicaCount: 1

    image:
      repository: myharbor.com/bigdata/livy
      pullPolicy: IfNotPresent
      # Overrides the image tag whose default is the chart appVersion.
      tag: "0.8.0"

    securityContext:
      runAsUser: 9999
      runAsGroup: 9999
      privileged: true

    service:
      type: NodePort
      port: 8998
      nodePort: 31998

livy/templates/configmap.yaml

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ include "livy.fullname" . }}
      labels:
        {{- include "livy.labels" . | nindent 4 }}
    data:
      livy.conf: |-
        livy.spark.master = yarn
        livy.spark.deploy-mode = client
        livy.environment = production
        livy.impersonation.enabled = true
        livy.server.csrf-protection.enabled = false
        livy.server.port = {{ .Values.service.port }}
        livy.server.session.timeout = 3600000
        livy.server.recovery.mode = recovery
        livy.server.recovery.state-store = filesystem
        livy.server.recovery.state-store.url = /tmp/livy
        livy.repl.enable-hive-context = true
      livy-env.sh: |-
        export JAVA_HOME=/opt/apache/jdk1.8.0_212
        export HADOOP_HOME=/opt/apache/hadoop
        export HADOOP_CONF_DIR=/opt/apache/hadoop/etc/hadoop
        export SPARK_HOME=/opt/apache/spark
        export SPARK_CONF_DIR=/opt/apache/spark/conf
        export LIVY_LOG_DIR=/opt/apache/livy/logs
        export LIVY_PID_DIR=/opt/apache/livy/pid-dir
        export LIVY_SERVER_JAVA_OPTS="-Xmx512m"
      spark-blacklist.conf: |-
        spark.master
        spark.submit.deployMode

        # Disallow overriding the location of Spark cached jars.
        spark.yarn.jar
        spark.yarn.jars
        spark.yarn.archive

        # Don't allow users to override the RSC timeout.
        livy.rsc.server.idle-timeout
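Since livy.conf is a plain "key = value" file, its contents are easy to sanity-check before they go into the ConfigMap. The following is only an illustrative sketch: `parse_livy_conf` and `check_recovery` are hypothetical helpers, not part of Livy, and the one rule checked (recovery mode needs a state store and a state-store URL) is taken from the settings shown above.

```python
def parse_livy_conf(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def check_recovery(conf):
    """Return a list of problems with the recovery-related settings."""
    problems = []
    if conf.get("livy.server.recovery.mode") == "recovery":
        for key in (
            "livy.server.recovery.state-store",
            "livy.server.recovery.state-store.url",
        ):
            if not conf.get(key):
                problems.append(f"{key} must be set when recovery is enabled")
    return problems


sample = """
livy.spark.master = yarn
livy.server.recovery.mode = recovery
livy.server.recovery.state-store = filesystem
livy.server.recovery.state-store.url = /tmp/livy
"""
conf = parse_livy_conf(sample)
print(check_recovery(conf))  # prints []
```

A check like this could be run against the rendered output of `helm template` before installing, so that a misspelled key shows up as a missing setting rather than as a silent fallback to the default.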

livy/templates/deployment.yaml

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {{ include "livy.fullname" . }}
      labels:
        {{- include "livy.labels" . | nindent 4 }}
    spec:
      {{- if not .Values.autoscaling.enabled }}
      replicas: {{ .Values.replicaCount }}
      {{- end }}
      selector:
        matchLabels:
          {{- include "livy.selectorLabels" . | nindent 6 }}
      template:
        metadata:
          {{- with .Values.podAnnotations }}
          annotations:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          labels:
            {{- include "livy.selectorLabels" . | nindent 8 }}
        spec:
          {{- with .Values.imagePullSecrets }}
          imagePullSecrets:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          serviceAccountName: {{ include "livy.serviceAccountName" . }}
          securityContext:
            {{- toYaml .Values.podSecurityContext | nindent 8 }}
          containers:
            - name: {{ .Chart.Name }}
              # Rendered once from .Values.securityContext (runAsUser, runAsGroup, privileged);
              # a second securityContext key on the same container would be a duplicate.
              securityContext:
                {{- toYaml .Values.securityContext | nindent 12 }}
              image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
              imagePullPolicy: {{ .Values.image.pullPolicy }}
              ports:
                - name: http
                  containerPort: 8998
                  protocol: TCP
              livenessProbe:
                httpGet:
                  path: /
                  port: http
              readinessProbe:
                httpGet:
                  path: /
                  port: http
              resources:
                {{- toYaml .Values.resources | nindent 12 }}
              # Each file is mounted individually from the ConfigMap via subPath
              volumeMounts:
                - name: {{ .Release.Name }}-livy-conf
                  mountPath: /opt/apache/livy/conf/livy.conf
                  subPath: livy.conf
                - name: {{ .Release.Name }}-livy-env
                  mountPath: /opt/apache/livy/conf/livy-env.sh
                  subPath: livy-env.sh
                - name: {{ .Release.Name }}-spark-blacklist-conf
                  mountPath: /opt/apache/livy/conf/spark-blacklist.conf
                  subPath: spark-blacklist.conf
          {{- with .Values.nodeSelector }}
          nodeSelector:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          {{- with .Values.affinity }}
          affinity:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          {{- with .Values.tolerations }}
          tolerations:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          volumes:
            - name: {{ .Release.Name }}-livy-conf
              configMap:
                name: {{ include "livy.fullname" . }}
            - name: {{ .Release.Name }}-livy-env
              configMap:
                name: {{ include "livy.fullname" . }}
            - name: {{ .Release.Name }}-spark-blacklist-conf
              configMap:
                name: {{ include "livy.fullname" . }}

4) Deploy

    helm install livy ./livy -n livy --create-namespace

NOTES

    NOTES:
    1. Get the application URL by running these commands:
      export NODE_PORT=$(kubectl get --namespace livy -o jsonpath="{.spec.ports[0].nodePort}" services livy)
      export NODE_IP=$(kubectl get nodes --namespace livy -o jsonpath="{.items[0].status.addresses[0].address}")
      echo http://$NODE_IP:$NODE_PORT


Check the pods and services:

    kubectl get pods,svc -n livy -owide


Web UI: http://192.168.182.110:31998/ui


5) Test and verify

    curl -s -XPOST -d '{"file":"hdfs://myhdfs/tmp/spark-examples_2.12-3.3.0.jar","className":"org.apache.spark.examples.SparkPi","name":"SparkPi-test"}' -H "Content-Type: application/json" http://local-168-182-110:31998/batches | python -m json.tool
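The curl call only submits the batch; whether the job succeeded has to be read back from Livy afterwards. Below is a rough sketch of doing both steps from Python with the standard library. `http_json`, `wait_for_batch`, and `submit_and_wait` are hypothetical helper names, the base URL is the NodePort address used in this article, and the set of terminal states is my understanding of the states Livy reports for finished batches, so treat it as an assumption to verify against your Livy version.

```python
import json
import time
from urllib.request import Request, urlopen

# Assumption: states after which a Livy batch no longer changes.
TERMINAL_STATES = {"success", "dead", "killed", "error"}


def http_json(url, payload=None):
    """GET url, or POST payload as JSON when given; return the decoded JSON body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = Request(url, data=data, headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_batch(base_url, batch_id, fetch=http_json, interval=5.0, max_polls=120):
    """Poll GET /batches/{id}/state until the batch reaches a terminal state."""
    for _ in range(max_polls):
        state = fetch(f"{base_url}/batches/{batch_id}/state")["state"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)
    raise TimeoutError(f"batch {batch_id} did not finish after {max_polls} polls")


def submit_and_wait(base_url="http://192.168.182.110:31998"):
    """Submit the SparkPi example from this article and wait for its final state."""
    batch = http_json(f"{base_url}/batches", {
        "file": "hdfs://myhdfs/tmp/spark-examples_2.12-3.3.0.jar",
        "className": "org.apache.spark.examples.SparkPi",
        "name": "SparkPi-test",
    })
    return batch["id"], wait_for_batch(base_url, batch["id"])
```

Calling `submit_and_wait()` against a running deployment returns the batch id and its final state. The `fetch` parameter exists so the polling loop can be exercised with a stub instead of a live server.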


6) Uninstall

    helm uninstall livy -n livy

Git repository: https://gitee.com/hadoop-bigdata/livy-on-k8s

Original article: https://www.toutiao.com/article/7163292224204046887/
