Configure Schema and Tables in a Kerberized Environment
This page shows how to configure Pinot schema and table files for ingesting data from a Kerberized Kafka source. It includes the Kafka security settings, the JAAS config, and the command to add the table using pinot-admin.sh.
- Table Configuration File
This example shows a Pinot real-time table config (table-config-stream.json) for ingesting data from a Kerberized Kafka source. It includes stream settings, Kafka broker details, and Kerberos authentication properties.
[root@odp3361103 himanshu]# cat table-config-stream.json
{
  "tableName": "events_kerb",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "events_kerb",
    "timeColumnName": "ts",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "2000",
    "segmentPushType": "APPEND",
    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
    "replication": "2",
    "replicasPerPartition": "2"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "events_kerb",
      "stream.kafka.broker.list": "odp3361103.acceldata.dvl:6667,odp3361101.acceldata.dvl:6667,odp3361102.acceldata.dvl:6667",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
      "realtime.segment.flush.threshold.rows": "0",
      "realtime.segment.flush.threshold.time": "1h",
      "realtime.segment.flush.threshold.segment.size": "50M",
      "security.protocol": "SASL_PLAINTEXT",
      "sasl.mechanism": "GSSAPI",
      "sasl.kerberos.service.name": "kafka",
      "sasl.jaas.config": "com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true keyTab=\"/etc/security/keytabs/pinot.headless.keytab\" storeKey=true useTicketCache=false serviceName=\"kafka\" principal=\"pinot-odp_green@ADSRE.COM\" doNotPrompt=true;"
    }
  },
  "metadata": {
    "customConfigs": {}
  },
  "routing": {
    "instanceSelectorType": "strictReplicaGroup"
  },
  "query": {},
  "quota": {}
}
[root@odp3361103 himanshu]# cat schema-stream.json
{
  "schemaName": "events_kerb",
  "dimensionFieldSpecs": [
    { "name": "uuid", "dataType": "STRING" }
  ],
  "metricFieldSpecs": [
    { "name": "count", "dataType": "INT" }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "ts",
      "dataType": "TIMESTAMP",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
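Because the table config uses KafkaJSONMessageDecoder, each Kafka message is expected to be a JSON object whose keys match the schema fields above (uuid, count, ts). A minimal sample event, with illustrative values, would look like:

```json
{ "uuid": "a1b2c3d4-0000-0000-0000-000000000001", "count": 1, "ts": 1717000000000 }
```

The ts value must be an epoch timestamp in milliseconds, matching the "1:MILLISECONDS:EPOCH" format declared in dateTimeFieldSpecs.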
- Ingestion Command Example
Use the following command to add the schema and table to Pinot in a Kerberized environment. It points KAFKA_OPTS at the JAAS config for secure Kafka access and uses pinot-admin.sh AddTable for ingestion.
env JAVA_HOME=/usr/lib/jvm/java-11-openjdk \
  KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/conf/kafka_client_jaas.conf" \
  /usr/odp/3.3.6.2-1/pinot/bin/pinot-admin.sh AddTable \
  -schemaFile /tmp/himanshu/schema-stream.json \
  -tableConfigFile /tmp/himanshu/table-config-stream.json \
  -controllerHost 10.100.10.83 \
  -controllerPort 9000 \
  -exec
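After the AddTable command succeeds, you can optionally confirm that the schema and table were registered by querying the Pinot controller's REST API. The host and port below are the ones used in the command above; substitute your own controller address.

```shell
# Fetch the registered schema by name
curl http://10.100.10.83:9000/schemas/events_kerb

# Fetch the table config; the realtime config should match table-config-stream.json
curl http://10.100.10.83:9000/tables/events_kerb
```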
- Kafka Client Configuration Example
This example shows the JAAS config (kafka_client_jaas.conf) used for authenticating the Kafka client in a Kerberized environment:
cat /etc/kafka/conf/kafka_client_jaas.conf
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useTicketCache=true
  renewTicket=true
  serviceName="kafka";
};
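Before starting ingestion, it can help to verify that the keytab and principal referenced in the table's sasl.jaas.config actually yield a Kerberos ticket. The keytab path and principal below are the ones from the table config; adjust them for your environment.

```shell
# Obtain a ticket using the same keytab and principal as the table's sasl.jaas.config
kinit -kt /etc/security/keytabs/pinot.headless.keytab pinot-odp_green@ADSRE.COM

# List the ticket cache to confirm authentication succeeded
klist
```

If kinit fails here, the Pinot servers will fail to authenticate to Kafka with the same credentials, so fixing this first saves a debugging round trip.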