Apache | Kafka
This guide walks you through connecting Kafka as a data source in ADOC, configuring observability, and enabling advanced features like concurrency control and freshness policies.
Prerequisites
Ensure the following before connecting Kafka with ADOC:
- A running Kafka cluster with accessible topics that can publish messages in a supported format such as JSON, Avro, or Confluent Avro.
- An active ADOC Data Plane with access to Kafka brokers
- Depending on your Kafka setup, you may need the following security credentials:
- Username and password (for SASL or Basic Auth)
- Certificate Authority (CA) or Server Certificate (SSL) certificates (keystore and truststore)
- Kerberos principal and keytab
Add Kafka as a Data Source
Follow these steps to set up HDFS in ADOC:
Step 1: Start Setup
Select Register from the left main menu.
Select Add Data Source.
Select Kafka from the list of data sources.
On the Data Source Details page:
- Enter a unique name for this data source.
- Optionally, add a brief description to clarify its purpose.
- Ensure the Data Reliability toggle is enabled and select your data plane from the drop-down list.
Select Next to proceed.
Step 2: Add Connection Details
Common Fields (Displayed for All Protocols)
Field | Description |
---|---|
Bootstrap Servers | Kafka broker address (e.g., hostname:port ). Required. |
Schema Registry Server | URL of the Schema Registry, if applicable. Optional. |
Schema Registry Authentication Type | If you provide a Schema Registry URL, you must also configure its authentication details (if required by your Kafka setup). Authentication Type: Select Basic Auth as the authentication type and provide the associated username and password. Username and Password: Enter the credentials used to access the schema registry. |
Protocol-Specific Fields
To understand each protocol, refer to the Apache Kafka security protocol documentation.
Security Protocol | Field | Description |
---|---|---|
Plain Text | – | No additional fields required. |
SSL | Use data plane for connection files | Toggle ON to fetch SSL files from data plane; OFF to upload manually. If Use data plane for connection files is enabled, the Keystore/Truststore file location fields will not appear. These files are fetched automatically from the Data Plane configuration. |
SSL Keystore FileLocation | Upload the Keystore file containing the client certificate. | |
SSL Keystore Password | Password to unlock the Keystore file. | |
SSL Truststore FileLocation | Upload the Truststore file used to validate the server certificate. | |
SSL Truststore Password | Password to unlock the Truststore file. | |
Kafka Additional Properties | Optional key-value pairs for advanced connection configuration. | |
SASL Plain Text | SASL Mechanism | Select the authentication mechanism (e.g., PLAIN , SCRAM-SHA-256 ). |
Kerberos Principal | Kerberos identity used to authenticate (e.g., kafka/node1.example.com@EXAMPLE.COM ). | |
Kerberos Service Name | Kafka service name used during Kerberos authentication (e.g., kafka ). Must match server-side principal. | |
Kerberos KeyTab File Location | Path to the KeyTab file containing credentials. Required for passwordless authentication. | |
Kafka Additional Properties | Optional key-value pairs, including authentication details. | |
SASL_SSL | SASL Mechanism | Select the SASL authentication mechanism. |
SSL Verification Required | Toggle ON to enforce SSL certificate validation. | |
Use data plane for connection files | Toggle ON to fetch SSL files from data plane; OFF to upload manually. | |
SSL Keystore FileLocation | Upload the Keystore file. | |
SSL Keystore Password | Password for the Keystore file. | |
SSL Truststore FileLocation | Upload the Truststore file. | |
SSL Truststore Password | Password for the Truststore file. | |
Kafka Additional Properties | Optional key-value configuration. |
- Select Test Connection. If successful, you’ll see “Connected.” If the test fails, ensure your bootstrap server is reachable, credentials are correct, and that the ADOC Data Plane service (
ad-analysis-standalone
) is running. - Select Next to proceed.
Step 3: Setup Observability
Configure how ADOC will monitor your Kafka topic:
- Asset Name – Descriptive name for this Kafka feed.
- Topic Name – Single topic or topic pattern.
- Topic Format – Choose
JSON
,Avro
, orConfluent Avro
. - Schema Registry URL – Required for Avro formats.
- (Optional) Enable Job Concurrency Control and set Max Slots to manage parallel profiling. For more information, see Control Plane Concurrent Connections and Queueing Mechanism
- Click Submit to finalize setup.
What’s Next
- Crawl data from the configured Kafka topic.
- Profile the data source to fetch metrics like message count, size, and freshness.
- Apply ADOC’s data reliability rules and policies to validate the quality of your Kafka topic.
Was this page helpful?