Structured Streaming

Real-Time Mode (RTM)

Spark 4.1.1 introduces the first official support for Real-Time Mode in Structured Streaming, enabling continuous processing with sub-second latency. For stateless workloads, p99 latency can drop to single-digit milliseconds.

Activation requires no code changes, only configuration.

Spark 4.1.1 RTM support matrix:

Dimension        Supported in 4.1.1
Query types      Stateless, single-stage
Language         Scala
Sources          Kafka
Sinks            Kafka, Foreach
Operators        Stateless ops, Unions, Broadcast Stream-Static Joins
Output mode      Update
Target latency   Sub-second (p99 single-digit ms for stateless)
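
The matrix can be read as a pre-flight checklist for a query. The sketch below encodes it in plain Scala, with no Spark APIs involved. QuerySpec, RtmSupportMatrix and violations are hypothetical names introduced here for illustration, and the Operators row is reduced to a single stateful flag, which is a simplification.

```scala
// Hypothetical description of a streaming query; this is not a Spark type.
final case class QuerySpec(
    language: String,
    source: String,
    sink: String,
    outputMode: String,
    stateful: Boolean,
    stages: Int
)

object RtmSupportMatrix {
  // Values copied from the Spark 4.1.1 support matrix above.
  private val languages   = Set("scala")
  private val sources     = Set("kafka")
  private val sinks       = Set("kafka", "foreach")
  private val outputModes = Set("update")

  private def norm(s: String): String = s.trim.toLowerCase

  /** Returns one message per violated dimension; an empty list means the query fits the matrix. */
  def violations(q: QuerySpec): List[String] = {
    val checks: List[(Boolean, String)] = List(
      (languages.contains(norm(q.language)), s"language '${q.language}' (supported: Scala)"),
      (sources.contains(norm(q.source)), s"source '${q.source}' (supported: Kafka)"),
      (sinks.contains(norm(q.sink)), s"sink '${q.sink}' (supported: Kafka, Foreach)"),
      (outputModes.contains(norm(q.outputMode)), s"output mode '${q.outputMode}' (supported: Update)"),
      (!q.stateful, "stateful operators (supported: stateless only)"),
      (q.stages == 1, s"${q.stages} stages (supported: single-stage)")
    )
    // Keep only the messages whose check failed.
    checks.collect { case (ok, msg) if !ok => s"unsupported: $msg" }
  }
}
```

A Scala query that reads from Kafka, writes to Kafka in Update mode, and stays stateless and single-stage yields an empty list. Changing any one dimension adds one message.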

Arbitrary Stateful Processing V2

Arbitrary Stateful Processing V2 extends Structured Streaming with flexible, custom stateful operations. It supports complex event processing and stateful ML models, and it comes with a State Data Source that reads key-value pairs from checkpoints, which is useful for debugging and testing streaming pipelines.

Example:
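
A minimal sketch of reading state back from a checkpoint with the State Data Source, assuming a stateful query has already written a checkpoint. The object name InspectState, the helper readState and the path /tmp/checkpoints/my-query are hypothetical. The "statestore" format and the operatorId option follow the State Data Source documentation as I understand it. The stateVarName option is likewise an assumption taken from that documentation and should be verified against the Spark version in use.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InspectState {

  /** Reads the state of one stateful operator from a streaming checkpoint as a batch DataFrame. */
  def readState(
      spark: SparkSession,
      checkpointDir: String,
      operatorId: Int = 0,
      stateVarName: Option[String] = None // assumption: only needed for multi-variable processors
  ): DataFrame = {
    val base = spark.read
      .format("statestore") // the State Data Source
      .option("operatorId", operatorId.toString)
    // Add the state variable name only when the caller supplies one.
    val reader = stateVarName.fold(base)(name => base.option("stateVarName", name))
    reader.load(checkpointDir)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical checkpoint location; pass the real one as the first argument.
    val checkpointDir = args.headOption.getOrElse("/tmp/checkpoints/my-query")
    val spark = SparkSession.builder()
      .appName("inspect-state")
      .master("local[*]")
      .getOrCreate()
    try {
      val state = readState(spark, checkpointDir)
      state.printSchema()              // shows how keys and values are laid out
      state.show(20, truncate = false) // prints the first 20 state rows
    } finally {
      spark.stop()
    }
  }
}
```

Because the result is an ordinary batch DataFrame, it can be filtered, joined or compared against expected values in a test, without starting the streaming query itself.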
