龙空技术网

Big Data: Selected Hive Performance Tuning and Troubleshooting

明少三年 100


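Introduction

Several of Hive's own key parameters need to be adjusted, especially in production environments.

Walkthrough

Hive has two main service components, HiveServer2 and the Hive Metastore Server. Both need their limits on network connections adjusted.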

Hive Metastore Server:

[Screenshot: hms2]

HiveServer2:

[Screenshot: hs2]

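As shown above, the metastore can accept up to 10000 connections, and each connection can issue requests. HiveServer2 can accept only 100 connections. To change this limit: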

Edit hive-site.xml

<property>
  <name>hive.server2.thrift.max.worker.threads</name>
  <value>500</value>
  <description>Maximum number of Thrift worker threads</description>
</property>
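To double-check that an edit like this is picked up, it can help to read the property back from the file. The following is a minimal sketch, not part of any Hive tooling: `read_properties` and `SAMPLE_HIVE_SITE` are hypothetical names, and the sample text assumes the usual Hadoop-style layout in which `<property>` elements sit under a `<configuration>` root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the property from this article wrapped in the
# <configuration> root element that Hadoop-style config files use.
SAMPLE_HIVE_SITE = """
<configuration>
  <property>
    <name>hive.server2.thrift.max.worker.threads</name>
    <value>500</value>
    <description>Maximum number of Thrift worker threads</description>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> element in xml_text."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # Values are kept as strings, exactly as written in the file.
            props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    props = read_properties(SAMPLE_HIVE_SITE)
    print(props.get("hive.server2.thrift.max.worker.threads"))
```

In practice the text would come from the real file on the HiveServer2 host (for example via `open(path).read()`); the path depends on the installation and is not given in this article.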

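Here the value is set to 500, which means up to 500 connections can be established. How quickly the requests on these connections are processed is governed by the following two parameters:

hive.server2.async.exec.wait.queue.size sets the length of the wait queue. When HiveServer2 receives a request, it first places the request in the wait queue. If the queue is already full, an exception is thrown.

hive.server2.async.exec.threads sets the number of execution threads. These threads take requests from the wait queue and process them. Not every request is submitted to the cluster; show databases is one example.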

<property>
  <name>hive.server2.async.exec.threads</name>
  <value>100</value>
  <description>Number of threads in the async thread pool for HiveServer2</description>
</property>
<property>
  <name>hive.server2.async.exec.wait.queue.size</name>
  <value>100</value>
  <description>
    Size of the wait queue for async thread pool in HiveServer2.
    After hitting this limit, the async thread pool will reject new requests.
  </description>
</property>
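The interaction between the two settings can be sketched with a small accounting model. This is only a toy, not HiveServer2 code: `AsyncPoolModel`, `PoolRejected`, and `count_accepted` are hypothetical names, and the model assumes the behavior of a Java `ThreadPoolExecutor` with a fixed pool size and a bounded queue, where a request is rejected only when every thread is busy and the queue is full.

```python
class PoolRejected(Exception):
    """Raised by the toy model when all threads are busy and the queue is full."""


class AsyncPoolModel:
    """Counts running and queued requests; no real threads are started."""

    def __init__(self, exec_threads, wait_queue_size):
        self.exec_threads = exec_threads        # models hive.server2.async.exec.threads
        self.wait_queue_size = wait_queue_size  # models hive.server2.async.exec.wait.queue.size
        self.active = 0
        self.queued = 0

    def submit(self):
        """Accept one request, or raise PoolRejected if there is no room."""
        if self.active < self.exec_threads:
            self.active += 1
            return "running"
        if self.queued < self.wait_queue_size:
            self.queued += 1
            return "queued"
        raise PoolRejected("threads busy and wait queue full")

    def finish_one(self):
        """One running request completes; a queued request, if any, takes its thread."""
        if self.active == 0:
            return
        if self.queued > 0:
            self.queued -= 1  # the freed thread immediately picks up a queued request
        else:
            self.active -= 1


def count_accepted(exec_threads, wait_queue_size, burst):
    """Return how many of `burst` back-to-back requests are accepted before a rejection."""
    model = AsyncPoolModel(exec_threads, wait_queue_size)
    accepted = 0
    for _ in range(burst):
        try:
            model.submit()
        except PoolRejected:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    # With 100 threads and a queue of 100, a burst of 250 long-running
    # requests is cut off once threads and queue are both full.
    print(count_accepted(100, 100, 250))
```

Under these assumptions the number of requests that can be in flight at once is the sum of the two settings, so raising either one raises the point at which new requests start to be rejected.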

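If output like the following appears in the log, the async thread pool and its wait queue are both full (here: pool size = 100, active threads = 100, queued tasks = 100). Raising the two parameters above can eliminate the error.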

org.apache.hive.service.cli.HiveSQLException: The background threadpool cannot accept new task for execution, please retry the operation
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:340)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:337)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:439)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:416)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:282)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:503)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@46fa3cef rejected from java.util.concurrent.ThreadPoolExecutor@6b58ea47[Running, pool size = 100, active threads = 100, queued tasks = 100, completed tasks = 10427]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
        at org.apache.hive.service.cli.session.SessionManager.submitBackgroundOperation(SessionManager.java:508)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:335)
        ... 14 more
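When scanning logs for this error, the pool state printed in the "Caused by" line is the most useful part. Below is a minimal sketch of pulling those numbers out with a regular expression; `parse_pool_state` and `is_saturated` are hypothetical helper names, and the pattern assumes the message format shown in the trace above.

```python
import re

# Matches the state that java.util.concurrent.ThreadPoolExecutor prints
# in the rejection message, as it appears in the trace above.
POOL_STATE = re.compile(
    r"pool size = (\d+), active threads = (\d+), "
    r"queued tasks = (\d+), completed tasks = (\d+)"
)


def parse_pool_state(log_text):
    """Return the pool counters from the first match in log_text, or None."""
    match = POOL_STATE.search(log_text)
    if match is None:
        return None
    size, active, queued, completed = (int(g) for g in match.groups())
    return {
        "pool_size": size,
        "active_threads": active,
        "queued_tasks": queued,
        "completed_tasks": completed,
    }


def is_saturated(state, wait_queue_size):
    """True if every thread is busy and the queue has reached the configured size."""
    return (
        state["active_threads"] >= state["pool_size"]
        and state["queued_tasks"] >= wait_queue_size
    )


if __name__ == "__main__":
    line = (
        "Caused by: java.util.concurrent.RejectedExecutionException: Task "
        "java.util.concurrent.FutureTask@46fa3cef rejected from "
        "java.util.concurrent.ThreadPoolExecutor@6b58ea47[Running, pool size = 100, "
        "active threads = 100, queued tasks = 100, completed tasks = 10427]"
    )
    state = parse_pool_state(line)
    print(state, is_saturated(state, wait_queue_size=100))
```

Comparing `pool_size` and `queued_tasks` against the configured hive.server2.async.exec.threads and hive.server2.async.exec.wait.queue.size values shows which limit was hit at the time of the rejection.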
