Internally, Flink’s table runtime is a changelog processor. The concepts page describes how dynamic tables and streams relate to each other.
A StreamTableEnvironment
offers the following methods to expose these change data capture (CDC) functionalities:
fromChangelogStream(DataStream)
: Interprets a stream of changelog entries as a table. The stream record type must be org.apache.flink.types.Row
since its RowKind
flag is evaluated during runtime. Event-time and watermarks are not propagated by default. This method expects a changelog containing all kinds of changes (enumerated in org.apache.flink.types.RowKind
) as the default ChangelogMode
.
fromChangelogStream(DataStream, Schema)
: Allows to define a schema for the DataStream
similar to fromDataStream(DataStream, Schema)
. Otherwise the semantics are equal to fromChangelogStream(DataStream)
.
fromChangelogStream(DataStream, Schema, ChangelogMode)
: Gives full control about how to interpret a stream as a changelog. The passed ChangelogMode
helps the planner to distinguish between insert-only, upsert, or retract behavior.
toChangelogStream(Table)
: Reverse operation of fromChangelogStream(DataStream)
. It produces a stream with instances of org.apache.flink.types.Row
and sets the RowKind
flag for every record at runtime. All kinds of updating tables are supported by this method. If the input table contains a single rowtime column, it will be propagated into a stream record’s timestamp. Watermarks will be propagated as well.
toChangelogStream(Table, Schema)
: Reverse operation of fromChangelogStream(DataStream, Schema)
. The method can enrich the produced column data types. The planner might insert implicit casts if necessary. It is possible to write out the rowtime as a metadata column.
toChangelogStream(Table, Schema, ChangelogMode)
: Gives full control about how to convert a table to a changelog stream. The passed ChangelogMode
helps the planner to distinguish between insert-only, upsert, or retract behavior.
From a Table API’s perspective, converting from and to DataStream API is similar to reading from or writing to a virtual table connector that has been defined using a CREATE TABLE
DDL in SQL.
Because fromChangelogStream
behaves similar to fromDataStream
, we recommend reading the previous section before continuing here.
This virtual connector also supports reading and writing the rowtime
metadata of the stream record.
The virtual table source implements SupportsSourceWatermark
.