
Q: Oozie Java Action access to HiveServer2 (Kerberized) using delegation token

Currently I am having an issue and really need some help. We are trying to kerberize our Hadoop cluster, including HiveServer2 and Oozie. My Oozie job spins off a Java action on a data node which tries to connect to the kerberized HiveServer2. There is no user Kerberos keytab available for authentication, so I can only use the delegation token passed by Oozie to the Java action to connect to HiveServer2. My question is: is there any way I can use a delegation token in an Oozie Java action to connect to HiveServer2? If so, how can I do it through Hive JDBC? Thanks, Jary

answer1:

When using Oozie in a kerberized cluster...

  • for a "Hive" or "Pig" Action, you must configure <credentials> of type HCat
  • for a "Hive2" Action (just released with V4.2) you must configure <credentials> of type Hive2
  • for a "Java" action opening a custom JDBC connection to HiveServer2, I fear that Oozie cannot help -- unless there is an undocumented hack that would make it possible to reuse this new Hive2 credential?!?

Reference: Oozie documentation about Kerberos credentials
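As a sketch, a workflow using the new Hive2 credential type might look like the following; the JDBC URL, principal, and action names are illustrative, not taken from the original question:

```
<workflow-app name="hive2-cred-demo" xmlns="uri:oozie:workflow:0.5">
  <credentials>
    <credential name="hs2-cred" type="hive2">
      <property>
        <name>hive2.jdbc.url</name>
        <value>jdbc:hive2://hs2.example.com:10000/default</value>
      </property>
      <property>
        <name>hive2.server.principal</name>
        <value>hive/_HOST@EXAMPLE.COM</value>
      </property>
    </credential>
  </credentials>
  <start to="run-query"/>
  <action name="run-query" cred="hs2-cred">
    <hive2 xmlns="uri:oozie:hive2-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <jdbc-url>jdbc:hive2://hs2.example.com:10000/default</jdbc-url>
      <script>query.sql</script>
    </hive2>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>Hive2 action failed</message></kill>
  <end name="end"/>
</workflow-app>
```

Note that the Oozie server, not your code, obtains the HS2 delegation token when the `cred` attribute references the credential; that is exactly what the plain "Java" action does not get.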

AFAIK you cannot use Hadoop delegation tokens with HiveServer2 from a custom client. HS2 uses Thrift for managing client connections, and Thrift supports Kerberos; but Hadoop delegation tokens are something different (Kerberos was never intended for distributed computing, so a workaround was needed)

What you can do is ship a full set of GSSAPI configuration, including a keytab, in your "Java" Action. It works, but there are a number of caveats:
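To make "ship a full set of GSSAPI configuration" concrete, here is a minimal JAAS file sketch, referenced at launch with `-Djava.security.auth.login.config=jaas.conf` (the keytab file name and principal are purely illustrative; you also typically need to point the JVM at a `krb5.conf` via `-Djava.security.krb5.conf`):

```
com.sun.security.jgss.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="etl.keytab"
    principal="etl@EXAMPLE.COM"
    storeKey=true
    doNotPrompt=true;
};
```

The keytab itself would be shipped alongside the action (e.g. via a `<file>` element in the workflow) so that the relative path resolves in the container's working directory.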

  1. the Hadoop Auth library seems to be hard-wired to the local ticket cache in a very lame way; if you must connect to both HDFS and HiveServer2, then do HDFS first, because as soon as JDBC initiates its own ticket based on your custom conf, Hadoop Auth will be broken
  2. Kerberos configuration is tricky, GSSAPI config is worse, and since these are security features the error messages are not very helpful, by design (would be bad taste to tell hackers why their intrusion attempt was rejected)
  3. use OpenJDK if possible; by default the Sun/Oracle JVM has limitations on cryptography (because of silly and obsolete US export policies), so you must download the two "unlimited strength" JCE policy JARs to replace the default ones
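To make caveat 1 concrete, here is a minimal, self-contained Java sketch of the kerberized HS2 JDBC URL and the call ordering. The host name and principals are made up, and the actual Hadoop/JDBC calls are shown only as comments, since they need a live cluster and the Hadoop/Hive client JARs on the classpath:

```java
public class Hs2Url {

    // Builds a "jdbc:hive2://host:port/db;principal=..." URL; the
    // "principal" parameter is the HiveServer2 SERVER principal, not yours.
    public static String hs2Url(String host, int port, String db,
                                String serverPrincipal) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db
                + ";principal=" + serverPrincipal;
    }

    public static void main(String[] args) {
        System.out.println(hs2Url("hs2.example.com", 10000, "default",
                "hive/_HOST@EXAMPLE.COM"));
        // Ordering inside the action, per caveat 1 (hypothetical names):
        //   1. UserGroupInformation.loginUserFromKeytab("etl@EXAMPLE.COM",
        //          "etl.keytab");            // shipped with the action
        //   2. FileSystem.get(conf).exists(new Path("/tmp"));  // HDFS FIRST
        //   3. DriverManager.getConnection(hs2Url(...));       // HS2 last
    }
}
```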

Reference: another StackOverflow post that I found really helpful for setting up "raw" Kerberos authentication when connecting to HiveServer2; plus a link about a very helpful "trace flag" for debugging your GSSAPI config, e.g.

-Djava.security.debug=gssloginconfig,configfile,configparser,logincontext
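In an Oozie "Java" action, such JVM flags can be passed through `<java-opts>`; a sketch, where the main class and file names are hypothetical:

```
<action name="hs2-java">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <main-class>com.example.MyHs2Client</main-class>
    <java-opts>-Djava.security.auth.login.config=jaas.conf -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext</java-opts>
    <file>jaas.conf#jaas.conf</file>
    <file>etl.keytab#etl.keytab</file>
  </java>
  <ok to="end"/>
  <error to="fail"/>
</action>
```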

Final warning: Kerberos is black magic. It will suck your soul away. More prosaically, it will have you lose many man-days to cryptic config issues, and team morale will suffer. We've been there.

hive  token  oozie  delegation