
[Spark 81] Hive in the Spark Assembly

Published: 2015-03-16   Author: bit1129   Source: reprint

Spark SQL supports most commonly used features of HiveQL. However, different HiveQL statements are executed in different manners:

  1. DDL statements (e.g. CREATE TABLE, DROP TABLE, etc.) and commands (e.g. SET <key> = <value>, ADD FILE, ADD JAR, etc.)

     In most cases, Spark SQL simply delegates these statements to Hive, as they don’t need to issue any distributed jobs and don’t rely on the computation engine (Spark, MR, or Tez).

  2. SELECT queries, CREATE TABLE ... AS SELECT ... statements, and insertions

    These statements are executed using Spark as the execution engine.
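To make the split concrete, here is a hypothetical HiveQL sketch (the table, column, and jar names are invented for illustration) showing which category each kind of statement falls into:

```sql
-- Category 1: DDL and commands. Spark SQL delegates these to Hive;
-- no distributed Spark job is launched.
CREATE TABLE IF NOT EXISTS logs (ts BIGINT, msg STRING)
  STORED AS PARQUET;
SET hive.exec.dynamic.partition = true;
ADD JAR /tmp/my-udfs.jar;

-- Category 2: queries, CTAS statements, and insertions.
-- These are planned and executed by Spark itself.
SELECT msg, COUNT(*) AS cnt FROM logs GROUP BY msg;
CREATE TABLE error_logs AS SELECT * FROM logs WHERE msg LIKE '%ERROR%';
INSERT INTO TABLE logs SELECT ts, msg FROM staging_logs;
```

In the Spark 1.x era this article belongs to, such statements would typically be submitted through `HiveContext.sql(...)`.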

The Hive classes packaged in the assembly jar are used to provide entry points to Hive features, for example:

  1. HiveQL parser
  2. Talking to the Hive metastore to execute DDL statements
  3. Accessing UDFs/UDAFs/UDTFs
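For instance, the UDF entry point means a Hive UDF registered through HiveQL can be called from Spark SQL queries. A hypothetical sketch (the jar path, function name, and implementing class are invented):

```sql
-- The Hive classes bundled in the assembly parse these statements
-- and register the function with the Hive function registry.
ADD JAR /tmp/my-udfs.jar;
CREATE TEMPORARY FUNCTION to_upper AS 'com.example.hive.udf.ToUpper';

-- The query itself is executed by Spark, which invokes the
-- Hive UDF for each row.
SELECT to_upper(msg) FROM logs;
```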

As for the differences between Hive on Spark and Spark SQL’s Hive support, please refer to this article by Reynold: https://databricks.com/blog/2014/07/01/shark-spark-sql-hive-on-spark-and-the-future-of-sql-on-spark.html


