Similar to eycheu's solution, but a little more detailed.

Here is an alternative solution specifically for hive2 that does not require PyHive or installing system-wide packages. I am working in a Linux environment where I do not have root access, so installing the SASL dependencies as mentioned in Tristin's post was not an option for me:

"If you're on Linux, you may need to install SASL separately before running the above. Install the package libsasl2-dev using apt-get or yum or whatever package manager for your distribution."

Specifically, this solution focuses on leveraging the Python package JayDeBeApi. In my experience, installing this one extra package on top of a Python Anaconda 2.7 install was all I needed.

Step 1: Install JayDeBeApi

    pip install jaydebeapi

Step 2: Download appropriate drivers for your environment:

Here is a link to the jars required for an enterprise CDH environment.
Another post that talks about where to find jdbc drivers for Apache Hive.

I will refer to the directory that holds the downloaded .jar files as /path/to/jar/files/.

Step 3: Identify your system's authentication mechanism:

In the pyhive solutions listed I've seen PLAIN listed as the authentication mechanism as well as Kerberos. Note that your jdbc connection URL will depend on the authentication mechanism you are using. I will explain the Kerberos solution without passing a username/password. Here is more information on Kerberos authentication and options.

Create a Kerberos ticket if one is not already created:

    $ kinit

You are now ready to make the connection via Python:

    import glob
    import jaydebeapi

    # Creates a list of jar files in the /path/to/jar/files/ directory
    jar_files = glob.glob('/path/to/jar/files/*.jar')

    # note: your driver will depend on your environment and the drivers you've downloaded
    # this is the driver for my environment (jdbc3, hive2, cloudera enterprise)
    driver = '3.HS2Driver'

    # host, port and database are strings you set for your own HiveServer2 instance
    url = ('jdbc:hive2://' + host + ':' + port + '/' + database +
           ';AuthMech=1;KrbHostFQDN=' + host + ';KrbServiceName=hive')
    conn_hive = jaydebeapi.connect(driver, url, jars=jar_files)
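Because the JDBC URL depends on the authentication mechanism, it can help to build it in one place. The following is a minimal sketch, not part of the original answer: `build_hive_jdbc_url` is a hypothetical helper name, and it assumes the Kerberos settings described above (AuthMech=1, service name hive) with the properties joined by semicolons, as the Cloudera driver expects. The host name in the example is a placeholder.

```python
def build_hive_jdbc_url(host, port, database, krb_service_name="hive"):
    """Return a hive2 JDBC URL for Kerberos authentication (AuthMech=1).

    Hypothetical helper: the property names mirror the connection string
    in the answer above; the function itself is only an illustration.
    """
    properties = [
        "AuthMech=1",                                   # Kerberos, as in the answer
        "KrbHostFQDN={}".format(host),                  # assumes HiveServer2 runs on `host`
        "KrbServiceName={}".format(krb_service_name),
    ]
    base = "jdbc:hive2://{}:{}/{}".format(host, port, database)
    # The base URL and each property are separated by semicolons
    return ";".join([base] + properties)


if __name__ == "__main__":
    # Placeholder values; substitute your own server details
    print(build_hive_jdbc_url("hive.example.com", 10000, "default"))
```

The resulting string can be passed as the second argument to `jaydebeapi.connect`, alongside the driver class name and the list of jar files. If your cluster uses a different authentication mechanism, the property list would need to change accordingly.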
The party system is a system put in place on the Hive that lets you group up with other players in a "party". Every party has a party owner, and being in a party with other players means that wherever the party leader goes (e.g. joins a game lobby), everyone in the party gets pulled with them. It makes playing games with friends easier.

When I add this to /etc/profile:

    export PYTHONPATH=$PYTHONPATH:/usr/lib/hive/lib/py

I can then do the imports as listed in the link, with the exception of from hive import ThriftHive, which actually needs to be:

    from hive_service import ThriftHive

Next, the port in the example was 10000, which when I tried it caused the program to hang. The default Hive Thrift port is 9083, which stopped the hanging. So I set it up like so:

    from thrift import Thrift
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hive_service import ThriftHive

    transport = TSocket.TSocket('localhost', 9083)  # host is assumed; 9083 is the port mentioned above
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = ThriftHive.Client(protocol)
    transport.open()
    client.execute("CREATE TABLE test(c1 int)")

I received the following error:

    Traceback (most recent call last):
      File "/usr/lib/hive/lib/py/hive_service/ThriftHive.py", line 68, in execute
      File "/usr/lib/hive/lib/py/hive_service/ThriftHive.py", line 84, in recv_execute
    : Invalid method name: 'execute'

But inspecting the ThriftHive.py file reveals the method execute within the Client class.

I believe the easiest way is to use PyHive.

To install you'll need these libraries:

    pip install sasl
    pip install PyHive

Please note that although you install the library as PyHive, you import the module as pyhive, all lower-case.

For Windows there are some options on GNU.org; you can download a binary installer. On a Mac, SASL should be available if you've installed the Xcode developer tools (xcode-select --install in Terminal).

After installation, you can connect to Hive like this:

    from pyhive import hive
    conn = hive.Connection(host="YOUR_HIVE_HOST", port=PORT, username="YOU")

Now that you have the Hive connection, you have options for how to use it. You can just straight-up query:

    cursor = conn.cursor()
    cursor.execute("SELECT cool_stuff FROM hive_table")

or use the connection to make a Pandas dataframe:

    import pandas as pd
    df = pd.read_sql("SELECT cool_stuff FROM hive_table", conn)

The examples above are a bit out of date. One new example is here:

    import pyhs2 as hive

    connection = hive.connect(host=DEFAULT_SERVER, port=DEFAULT_PORT, authMechanism='LDAP', user=u + + DEFAULT_DOMAIN, password=s)
    statement = "select * from user_yuti.Temp_CredCard where pir_post_dt = '' limit 100"

In addition to the standard Python program, a few libraries need to be installed to allow Python to build the connection to the Hadoop database.

1. Pyhs2, Python Hive Server 2 Client Driver
3. Thrift, Python bindings for the Apache Thrift RPC system

Remember to change the permission of the executable.
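PyHive connections follow the Python DB-API, so the cursor pattern shown earlier can be wrapped in a small helper that also returns the column names. This is a sketch under assumptions rather than anything from the answers: `run_query` and `demo` are hypothetical names, and the example uses an in-memory sqlite3 database purely as a stand-in so that it runs without a Hive cluster. With PyHive you would pass the object returned by `hive.Connection` instead.

```python
import sqlite3


def run_query(conn, sql):
    """Run `sql` on a DB-API connection and return (column_names, rows)."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        # cursor.description has one entry per column; item 0 is the column name
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        # Release the cursor even if the query fails
        cursor.close()
    return columns, rows


def demo():
    # sqlite3 stands in for Hive here; the table and values are made up
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hive_table (cool_stuff TEXT)")
    conn.executemany("INSERT INTO hive_table VALUES (?)", [("a",), ("b",)])
    result = run_query(conn, "SELECT cool_stuff FROM hive_table")
    conn.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Keeping the cursor handling in one function means the calling code only deals with column names and rows, whichever DB-API driver supplies the connection.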