Package org.apache.spark.sql.streaming

Class StatefulProcessorWithInitialState<K,I,O,S>

java.lang.Object
  org.apache.spark.sql.streaming.StatefulProcessor<K,I,O>
    org.apache.spark.sql.streaming.StatefulProcessorWithInitialState<K,I,O,S>

All Implemented Interfaces:
  Serializable
Stateful processor with support for specifying initial state. Accepts a user-defined type as the initial state, which is applied in the first batch. This can be used to start a new streaming query with state carried over from a previous streaming query.
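As a sketch of how such a processor might be wired up, the following assumes a hypothetical `CountProcessor` subclass of `StatefulProcessorWithInitialState[String, String, (String, Long), Long]` and a batch `DataFrame` `priorStateDf` of `(key, count)` rows recovered from a previous query's state (for example via a state data source reader). The initial state is passed to `KeyValueGroupedDataset.transformWithState` alongside the processor; names such as `inputStream` and `priorStateDf` are illustrative, not part of this API.

```scala
import org.apache.spark.sql.streaming.{OutputMode, TimeMode}
import org.apache.spark.sql.{Dataset, KeyValueGroupedDataset}

// Assumed to exist: a streaming Dataset[String] of input records
// and a batch DataFrame of previously accumulated (key, count) state.
val keyedInput = inputStream.groupByKey(identity)

// Group the prior state by the same key type as the input.
val initialState: KeyValueGroupedDataset[String, Long] =
  priorStateDf.as[(String, Long)].groupByKey(_._1).mapValues(_._2)

// The overload of transformWithState that takes an initialState argument
// dispatches each initial-state row to handleInitialState in the first batch.
val result: Dataset[(String, Long)] = keyedInput.transformWithState(
  new CountProcessor(),   // extends StatefulProcessorWithInitialState
  TimeMode.None(),
  OutputMode.Update(),
  initialState)
```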
Nested Class Summary

Nested classes/interfaces inherited from class org.apache.spark.sql.streaming.StatefulProcessor:
  StatefulProcessor.implicits, StatefulProcessor.implicits$
Constructor Summary

Constructors:
  StatefulProcessorWithInitialState()
Method Summary

Modifier and Type    Method and Description
abstract void        handleInitialState(K key, S initialState, TimerValues timerValues)
                     Function that will be invoked only in the first batch for users to process initial state.

Methods inherited from class org.apache.spark.sql.streaming.StatefulProcessor:
  close, getHandle, handleExpiredTimer, handleInputRows, implicits, init, setHandle
Constructor Details

StatefulProcessorWithInitialState

public StatefulProcessorWithInitialState()
Method Details

handleInitialState

public abstract void handleInitialState(K key, S initialState, TimerValues timerValues)

Function that will be invoked only in the first batch, so that users can process initial state. The provided initial state can be an arbitrary dataframe whose grouping key schema matches that of the input rows, e.g. a dataframe obtained from a data source reader over an existing streaming query's checkpoint.

Note that in microbatch mode this function can be called one or more times per grouping key. If a grouping key does not appear among the initial state dataframe rows, the function will not be invoked for that key.
Parameters:
  key - grouping key
  initialState - a row in the initial state to be processed
  timerValues - instance of TimerValues that provides access to the current processing/event time, if available
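The contract above can be sketched as a concrete processor. The example below is a minimal, hypothetical implementation (not part of this API) that keeps a per-key running count: `handleInitialState` seeds the count from the prior query's state, and `handleInputRows` continues accumulating from there. It assumes the `StatefulProcessorHandle.getValueState(stateName, encoder, ttlConfig)` overload and `ValueState.getOption`/`update` accessors.

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.streaming._

// Hypothetical processor: key = String, input = String, output = (String, Long),
// initial-state type S = Long (the previously accumulated count for the key).
class CountProcessor
    extends StatefulProcessorWithInitialState[String, String, (String, Long), Long] {

  @transient private var countState: ValueState[Long] = _

  override def init(outputMode: OutputMode, timeMode: TimeMode): Unit = {
    // Register the value state that survives across batches.
    countState = getHandle.getValueState[Long]("count", Encoders.scalaLong, TTLConfig.NONE)
  }

  // Called in the first batch, once per initial-state row for this key:
  // seed the running count from the previous query's state.
  override def handleInitialState(
      key: String,
      initialState: Long,
      timerValues: TimerValues): Unit = {
    countState.update(initialState)
  }

  override def handleInputRows(
      key: String,
      inputRows: Iterator[String],
      timerValues: TimerValues): Iterator[(String, Long)] = {
    // Continue counting from the seeded value (0 if the key had no initial state).
    val updated = countState.getOption().getOrElse(0L) + inputRows.size
    countState.update(updated)
    Iterator.single((key, updated))
  }
}
```

Because `handleInitialState` may be invoked more than once per key in microbatch mode, an implementation that merges rather than overwrites (e.g. summing into the existing state) can be safer when the initial state contains multiple rows per key.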