Apache Flink 1.12.2 releases: framework and distributed processing engine

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.

Process Unbounded and Bounded Data

Any kind of data is produced as a stream of events. Credit card transactions, sensor measurements, machine logs, or user interactions on a website or mobile application, all of these data are generated as a stream.

Data can be processed as unbounded or bounded streams.

  1. Unbounded streams have a start but no defined end. They do not terminate and provide data as it is generated. Unbounded streams must be continuously processed, i.e., events must be promptly handled after they have been ingested. It is not possible to wait for all input data to arrive because the input is unbounded and will not be complete at any point in time. Processing unbounded data often requires that events are ingested in a specific order, such as the order in which events occurred, to be able to reason about result completeness.
  2. Bounded streams have a defined start and end. Bounded streams can be processed by ingesting all data before performing any computations. Ordered ingestion is not required to process bounded streams because a bounded data set can always be sorted. Processing of bounded streams is also known as batch processing.

Apache Flink excels at processing unbounded and bounded data sets. Precise control of time and state enable Flink’s runtime to run any kind of application on unbounded streams. Bounded streams are internally processed by algorithms and data structures that are specifically designed for fixed sized data sets, yielding excellent performance.

Changelog v1.12.2

Sub-task

  • [FLINK-21070] – Overloaded aggregate functions cause converter errors
  • [FLINK-21486] – Add sanity check when switching from Rocks to Heap timers

Bug

  • [FLINK-12461] – Document binary compatibility situation with Scala beyond 2.12.8
  • [FLINK-16443] – Fix wrong fix for user-code CheckpointExceptions
  • [FLINK-19771] – NullPointerException when accessing null array from postgres in JDBC Connector
  • [FLINK-20309] – UnalignedCheckpointTestBase.execute is failed
  • [FLINK-20462] – MailboxOperatorTest.testAvoidTaskStarvation
  • [FLINK-20500] – UpsertKafkaTableITCase.testTemporalJoin test failed
  • [FLINK-20565] – Fix typo in EXPLAIN Statements docs.
  • [FLINK-20580] – Missing null value handling for SerializedValue’s getByteArray()
  • [FLINK-20654] – Unaligned checkpoint recovery may lead to corrupted data stream
  • [FLINK-20663] – Managed memory may not be released in time when operators use managed memory frequently
  • [FLINK-20675] – Asynchronous checkpoint failure would not fail the job anymore
  • [FLINK-20680] – Fails to call var-arg function with no parameters
  • [FLINK-20798] – Using PVC as high-availability.storageDir could not work
  • [FLINK-20832] – Deliver bootstrap resouces ourselves for website and documentation
  • [FLINK-20848] – Kafka consumer ID is not specified correctly in new KafkaSource
  • [FLINK-20913] – Improve new HiveConf(jobConf, HiveConf.class)
  • [FLINK-20921] – Fix Date/Time/Timestamp in Python DataStream
  • [FLINK-20933] – Config Python Operator Use Managed Memory In Python DataStream
  • [FLINK-20942] – Digest of FLOAT literals throws UnsupportedOperationException
  • [FLINK-20944] – Launching in application mode requesting a ClusterIP rest service type results in an Exception
  • [FLINK-20947] – Idle source doesn’t work when pushing watermark into the source
  • [FLINK-20961] – Flink throws NullPointerException for tables created from DataStream with no assigned timestamps and watermarks
  • [FLINK-20992] – Checkpoint cleanup can kill JobMaster
  • [FLINK-20998] – flink-raw-1.12.jar does not exist
  • [FLINK-21009] – Can not disable certain options in Elasticsearch 7 connector
  • [FLINK-21013] – Blink planner does not ingest timestamp into StreamRecord
  • [FLINK-21024] – Dynamic properties get exposed to job’s main method if user parameters are passed
  • [FLINK-21028] – Streaming application didn’t stop properly
  • [FLINK-21030] – Broken job restart for job with disjoint graph
  • [FLINK-21059] – KafkaSourceEnumerator does not honor consumer properties
  • [FLINK-21069] – Configuration “parallelism.default” doesn’t take effect for TableEnvironment#explainSql
  • [FLINK-21071] – Snapshot branches running against flink-docker dev-master branch
  • [FLINK-21104] – UnalignedCheckpointITCase.execute failed with “IllegalStateException”
  • [FLINK-21132] – BoundedOneInput.endInput is called when taking synchronous savepoint
  • [FLINK-21138] – KvStateServerHandler is not invoked with user code classloader
  • [FLINK-21140] – Extract zip file dependencies before adding to PYTHONPATH
  • [FLINK-21144] – KubernetesResourceManagerDriver#tryResetPodCreationCoolDown causes fatal error
  • [FLINK-21155] – FileSourceTextLinesITCase.testBoundedTextFileSourceWithTaskManagerFailover does not pass
  • [FLINK-21158] – wrong jvm metaspace and overhead size show in taskmanager metric page
  • [FLINK-21163] – Python dependencies specified via CLI should not override the dependencies specified in configuration
  • [FLINK-21169] – Kafka flink-connector-base dependency should be scope compile
  • [FLINK-21208] – pyarrow exception when using window with pandas udaf
  • [FLINK-21213] – e2e test fail with ‘As task is already not running, no longer decline checkpoint’
  • [FLINK-21215] – Checkpoint was declined because one input stream is finished
  • [FLINK-21216] – StreamPandasConversionTests Fails
  • [FLINK-21225] – OverConvertRule does not consider distinct
  • [FLINK-21226] – Reintroduce TableColumn.of for backwards compatibility
  • [FLINK-21274] – At per-job mode, during the exit of the JobManager process, if ioExecutor exits at the end, the System.exit() method will not be executed.
  • [FLINK-21277] – SQLClientSchemaRegistryITCase fails to download testcontainers/ryuk:0.3.0
  • [FLINK-21312] – SavepointITCase.testStopSavepointWithBoundedInputConcurrently is unstable
  • [FLINK-21323] – Stop-with-savepoint is not supported by SourceOperatorStreamTask
  • [FLINK-21351] – Incremental checkpoint data would be lost once a non-stop savepoint completed
  • [FLINK-21361] – FlinkRelMdUniqueKeys matches on AbstractCatalogTable instead of CatalogTable
  • [FLINK-21412] – pyflink DataTypes.DECIMAL is not available
  • [FLINK-21452] – FLIP-27 sources cannot reliably downscale
  • [FLINK-21453] – BoundedOneInput.endInput is NOT called when doing stop with savepoint WITH drain
  • [FLINK-21490] – UnalignedCheckpointITCase fails on azure
  • [FLINK-21492] – ActiveResourceManager swallows exception stack trace

New Feature

  • [FLINK-20359] – Support adding Owner Reference to Job Manager in native kubernetes setup

Improvement

  • [FLINK-9844] – PackagedProgram does not close URLClassLoader
  • [FLINK-20417] – Handle “Too old resource version” exception in Kubernetes watch more gracefully
  • [FLINK-20491] – Support Broadcast Operation in BATCH execution mode
  • [FLINK-20517] – Support mixed keyed/non-keyed operations in BATCH execution mode
  • [FLINK-20770] – Incorrect description for config option kubernetes.rest-service.exposed.type
  • [FLINK-20907] – Table API documentation promotes deprecated syntax
  • [FLINK-21020] – Bump Jackson to 20.10.5[.1] / 2.12.1
  • [FLINK-21034] – Rework jemalloc switch to use an environment variable
  • [FLINK-21035] – Deduplicate copy_plugins_if_required calls
  • [FLINK-21036] – Consider removing automatic configuration fo number of slots from docker
  • [FLINK-21037] – Deduplicate configuration logic in docker entrypoint
  • [FLINK-21042] – Fix code example in “Aggregate Functions” section in Table UDF page
  • [FLINK-21048] – Refactor documentation related to switch memory allocator
  • [FLINK-21123] – Upgrade Beanutils 1.9.x to 1.9.4
  • [FLINK-21164] – Jar handlers don’t cleanup temporarily extracted jars
  • [FLINK-21210] – ApplicationClusterEntryPoints should explicitly close PackagedProgram
  • [FLINK-21381] – Kubernetes HA documentation does not state required service account and role

Task

  • [FLINK-20529] – Publish Dockerfiles for release 1.12.0
  • [FLINK-20534] – Add Flink 1.12 MigrationVersion
  • [FLINK-20536] – Update migration tests in master to cover migration from release-1.12
  • [FLINK-20960] – Add warning in 1.12 release notes about potential corrupt data stream with unaligned checkpoint
  • [FLINK-21358] – Missing snapshot version compatibility for 1.12

Download