Prepare a pod for the Hive client. Create a new YAML file locally, for example, hive-client.yaml, and fill in the following content.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hive-client
  namespace: default
  labels:
    app: hive-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hive-client
  template:
    metadata:
      labels:
        app: hive-client
    spec:
      volumes:
        - name: hive-config
          configMap:
            name: hive-server2-context
            defaultMode: 420
        - name: hdfs-config
          configMap:
            name: hdfs-config
            defaultMode: 420
        - name: kerberos-config
          configMap:
            name: krb5-config
            defaultMode: 420
        - name: user-keytab
          persistentVolumeClaim:
            claimName: home-keytab-data-pvc
      containers:
        - name: hive-client
          image: od-registry.linktimecloud.com/ltc-hms:3.1.3-1.17
          command:
            - tail
            - '-f'
            - /dev/null
          env: []
          resources:
            limits:
              cpu: '2'
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 512Mi
          volumeMounts:
            - name: hive-config
              readOnly: true
              mountPath: /opt/hive/conf/hive-site.xml
              subPath: hive-site.xml
            - name: hdfs-config
              readOnly: true
              mountPath: /opt/hive/conf/core-site.xml
              subPath: core-site.xml
            - name: hdfs-config
              readOnly: true
              mountPath: /opt/hive/conf/hdfs-site.xml
              subPath: hdfs-site.xml
            - name: kerberos-config
              readOnly: true
              mountPath: /etc/krb5.conf
              subPath: krb5.conf
            - name: user-keytab
              readOnly: true
              mountPath: /keytab
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      imagePullSecrets:
        - name: devregistry
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - hive-client
              topologyKey: kubernetes.io/hostname
Notes:
- default is the namespace in which the hdfs/hive-metastore/hive-server2 cluster is located.
- hive-server2-context contains the HiveServer2 configuration, including hive-site.xml.
- hdfs-config contains the HDFS configuration, including core-site.xml and hdfs-site.xml.
- krb5-config contains the KDC configuration, including krb5.conf.
- home-keytab-data-pvc contains the user's keytab.
- If Kerberos is not enabled in the cluster, you can remove the volumes/volumeMounts for the KDC configuration and keytab.
Execute kubectl apply -f hive-client.yaml
to create a Hive client pod.
For example, in the default namespace, run the following kubectl command to enter the Hive client pod. Since the pod is managed by a Deployment, its name carries a generated suffix, so target the Deployment directly:
kubectl exec -it deploy/hive-client -n default -- bash
Multiple instances of hive-server2 will register their service addresses to ZooKeeper, and clients can obtain a random address from ZooKeeper.
If Kerberos is enabled, first execute kinit, then connect with Beeline.
kinit -kt /keytab/user1/user1.keytab user1
beeline -u 'jdbc:hive2://zookeeper:2181/;serviceDiscoveryMode=zookeeper;zooKeeperNamespace=default_hiveserver2/server;principal=hive/_HOST@BDOS.CLOUD'
If Kerberos is not enabled, you can connect directly as the root user.
beeline -u 'jdbc:hive2://zookeeper:2181/;serviceDiscoveryMode=zookeeper;zooKeeperNamespace=default_hiveserver2/server' -n root
Note: The value of zooKeeperNamespace in the Beeline connection string above has the form <namespace>_hiveserver2/server. The examples use the default namespace, hence default_hiveserver2/server; if the namespace is admin, replace it with zooKeeperNamespace=admin_hiveserver2/server.
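As a sanity check, the HA connection string can be assembled from the namespace. This is a minimal sketch; the variable names are illustrative, and zookeeper:2181 is the ZooKeeper address used in the examples above.

```shell
# Build the HA Beeline URL for a given Kubernetes namespace (variable names are illustrative).
NS=default            # namespace where hive-server2 runs
ZK=zookeeper:2181     # ZooKeeper address, as in the examples above
URL="jdbc:hive2://${ZK}/;serviceDiscoveryMode=zookeeper;zooKeeperNamespace=${NS}_hiveserver2/server"
echo "$URL"
```

Setting NS=admin yields the admin_hiveserver2/server namespace mentioned above; with Kerberos enabled, append ;principal=hive/_HOST@BDOS.CLOUD as in the earlier example.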
You can also connect to a specific instance of hive-server2.
# With Kerberos enabled:
beeline -u 'jdbc:hive2://hive-server2-0.hive-server2:10000/;principal=hive/_HOST@BDOS.CLOUD'
# Without Kerberos enabled:
beeline -u 'jdbc:hive2://hive-server2-0.hive-server2:10000/' -n root
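The hive-server2-0.hive-server2 address suggests the replicas are pods of a StatefulSet behind a headless service, addressable by ordinal. As a sketch under that assumption, the direct URL for any replica can be parameterized:

```shell
# Direct JDBC URL for the i-th hive-server2 replica (i=0 matches the example above).
i=0
DIRECT_URL="jdbc:hive2://hive-server2-${i}.hive-server2:10000/"
echo "$DIRECT_URL"
```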
Configure project dependencies (hadoop-common and hive-jdbc) in the pom.xml file. The additional project dependencies for this example are as follows.
<dependencies>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>3.1.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.1.1</version>
</dependency>
</dependencies>
Write code to connect to HiveServer2 and manipulate Hive table data. The example code is as follows.
import java.sql.*;

public class App {
    private static final String driverName = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) throws SQLException {
        // Register the Hive JDBC driver explicitly.
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            return;
        }
        // Connect directly to one HiveServer2 instance as user root (no Kerberos);
        // try-with-resources closes the connection, statement, and result set.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:hive2://hive-server2-0.hive-server2:10000", "root", "");
             Statement stmt = con.createStatement();
             ResultSet res = stmt.executeQuery("select * from sample_tbl limit 10")) {
            while (res.next()) {
                System.out.println(res.getString(1) + "\t" + res.getString(2));
            }
        }
    }
}