Oracle Big Data SQL Usage of Kerberos Tickets on Exadata




How does Oracle Big Data SQL V2.0 use Kerberos Tickets on Exadata (Doc ID 2083982.1)

APPLIES TO:

Oracle Big Data SQL – Version 2.0 and later
Linux x86-64

GOAL

To review how Oracle Big Data SQL (BDS) uses Kerberos tickets on Exadata.

SOLUTION

For Exadata to access BDS on a secured (Kerberos-enabled) BDA, Exadata needs a valid Kerberos ticket.

In more detail, Exadata provides query coordination. It needs to access HDFS to get the splits (i.e. chunks of data to process), which are then allocated across the offload servers for scans. Therefore, if Exadata cannot access HDFS, every query will fail. Similarly, it needs a Kerberos ticket to access Hive, which provides the metadata that enables the HDFS file location lookup for ORACLE_HIVE external tables.
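Because every query depends on Exadata holding a valid ticket, it can help to check the credentials cache before troubleshooting further. The following is a minimal sketch, not part of the original note. It assumes MIT Kerberos client tools are installed and that the check runs as the OS user whose credentials cache the database uses; the function name has_valid_ticket and the injectable run parameter are hypothetical, introduced only for illustration.

```python
import subprocess
from typing import Callable


def has_valid_ticket(run: Callable = subprocess.run) -> bool:
    """Return True if the current user's Kerberos credentials cache is usable.

    Relies on `klist -s`, which prints nothing and exits with status 0 when
    the cache can be read and is not expired, and with status 1 otherwise.
    The `run` argument is a hypothetical hook so the check can be exercised
    without a real Kerberos installation.
    """
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # klist is not installed or not on PATH: treat as "no ticket".
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("valid ticket" if has_valid_ticket() else "no valid ticket")
```

The same check would apply on the BDA nodes for the Big Data SQL Servers, again assuming it runs as the user that owns their credentials cache.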

Consider that HDFS has two clients:
1. Exadata
2. Big Data SQL Servers running on BDA

They both need valid Kerberos tickets for queries to execute efficiently.

1. If the BDS Servers do not have a valid ticket (or the query is quarantined), a query originating from Exadata will not fail, provided Exadata itself has a ticket. Exadata will go directly to HDFS to retrieve the data, but this access will be very slow.

2. Offload will fail if the Big Data SQL Servers do not have a valid ticket, i.e. they are not trusted by HDFS. When offload fails, the query can still return results if the database has a valid Kerberos ticket. In this case, as described in 1, queries will not use the Big Data SQL Servers to get data.

To summarize, there are two types of “metadata” that Exadata retrieves:

  • For data files, to retrieve the splits (e.g. block locations) that will be used to drive the execution of parallel scans across the cluster.
  • For Hive, information required to locate and parse the data.

Based on this, there are three possibilities regarding Exadata and Kerberos.

1. Both BDS on BDA and Exadata have a ticket.

In this case, Exadata first gets the metadata as a Hadoop client. It can then offload the query to the BDS Servers on BDA.

2. Exadata has a ticket but BDS on BDA does not.

In this case, Exadata gets both the metadata and the data files as a normal Hadoop client, but access will be slow.

3. Exadata does not have a ticket.

In this case nothing works, whether or not BDS on BDA has a ticket: without the metadata, Exadata cannot discover how to query the data.
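The three possibilities above can be sketched as a small decision function. This is only an illustration of the logic described in this note, not Oracle code; the names Outcome and query_outcome are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    OFFLOAD = "query is offloaded to the BDS Servers on BDA"
    DIRECT_HDFS = "Exadata reads HDFS directly as a Hadoop client (slow)"
    FAIL = "query fails: Exadata cannot retrieve the metadata"


def query_outcome(exadata_has_ticket: bool, bds_has_ticket: bool) -> Outcome:
    """Map the ticket state of the two HDFS clients to the expected behavior."""
    if not exadata_has_ticket:
        # Case 3: without a ticket Exadata cannot get splits or Hive metadata,
        # so the BDS ticket state is irrelevant.
        return Outcome.FAIL
    if bds_has_ticket:
        # Case 1: Exadata gets the metadata, then offloads the scans.
        return Outcome.OFFLOAD
    # Case 2: offload fails, Exadata falls back to reading HDFS itself.
    return Outcome.DIRECT_HDFS


if __name__ == "__main__":
    for exa in (True, False):
        for bds in (True, False):
            print(f"Exadata={exa!s:5} BDS={bds!s:5} -> {query_outcome(exa, bds).value}")
```

The Exadata ticket is tested first because it is the only condition that makes a query fail outright; the BDS ticket only decides between fast (offloaded) and slow (direct) access.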

 

 

Author: admin