September 22, 2020

Apache Flink 1.11.2 releases: framework and distributed processing engine

5 min read

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.

Process Unbounded and Bounded Data

Any kind of data is produced as a stream of events. Credit card transactions, sensor measurements, machine logs, or user interactions on a website or mobile application, all of these data are generated as a stream.

Data can be processed as unbounded or bounded streams.

  1. Unbounded streams have a start but no defined end. They do not terminate and provide data as it is generated. Unbounded streams must be continuously processed, i.e., events must be promptly handled after they have been ingested. It is not possible to wait for all input data to arrive because the input is unbounded and will not be complete at any point in time. Processing unbounded data often requires that events are ingested in a specific order, such as the order in which events occurred, to be able to reason about result completeness.
  2. Bounded streams have a defined start and end. Bounded streams can be processed by ingesting all data before performing any computations. Ordered ingestion is not required to process bounded streams because a bounded data set can always be sorted. Processing of bounded streams is also known as batch processing.

Apache Flink excels at processing unbounded and bounded data sets. Precise control of time and state enable Flink’s runtime to run any kind of application on unbounded streams. Bounded streams are internally processed by algorithms and data structures that are specifically designed for fixed sized data sets, yielding excellent performance.

Changelog v1.11.2

Sub-task

  • [FLINK-16087] – Translate “Detecting Patterns” page of “Streaming Concepts” into Chinese
  • [FLINK-18264] – Translate the “External Resource Framework” page into Chinese
  • [FLINK-18628] – Invalid error message for overloaded methods with same parameter name
  • [FLINK-18801] – Add a “10 minutes to Table API” document under the “Python API” -> “User Guide” -> “Table API” section
  • [FLINK-18910] – Create the new document structure for Python documentation according to FLIP-133
  • [FLINK-18912] – Add a Table API tutorial link(linked to try-flink/python_table_api.md) under the “Python API” -> “GettingStart” -> “Tutorial” section
  • [FLINK-18913] – Add a “TableEnvironment” document under the “Python API” -> “User Guide” -> “Table API” section
  • [FLINK-18917] – Add a “Built-in Functions” link (linked to dev/table/functions/systemFunctions.md) under the “Python API” -> “User Guide” -> “Table API” section
  • [FLINK-19110] – Flatten current PyFlink documentation structure

Bug

  • [FLINK-14087] – throws java.lang.ArrayIndexOutOfBoundsException when emiting the data using RebalancePartitioner.
  • [FLINK-15467] – Should wait for the end of the source thread during the Task cancellation
  • [FLINK-16510] – Task manager safeguard shutdown may not be reliable
  • [FLINK-16827] – StreamExecTemporalSort should require a distribution trait in StreamExecTemporalSortRule
  • [FLINK-18081] – Fix broken links in “Kerberos Authentication Setup and Configuration” doc
  • [FLINK-18212] – Init lookup join failed when use udf on lookup table
  • [FLINK-18341] – Building Flink Walkthrough Table Java 0.1 COMPILATION ERROR
  • [FLINK-18421] – Elasticsearch (v6.3.1) sink end-to-end test instable
  • [FLINK-18468] – TaskExecutorITCase.testJobReExecutionAfterTaskExecutorTermination fails with DuplicateJobSubmissionException
  • [FLINK-18552] – Update migration tests in master to cover migration from release-1.11
  • [FLINK-18581] – Cannot find GC cleaner with java version previous jdk8u72(-b01)
  • [FLINK-18588] – hive ddl create table should support ‘if not exists’
  • [FLINK-18595] – Deadlock during job shutdown
  • [FLINK-18600] – Kerberized YARN per-job on Docker test failed to download JDK 8u251
  • [FLINK-18608] – CustomizedConvertRule#convertCast drops nullability
  • [FLINK-18612] – WordCount example failure when setting relative output path
  • [FLINK-18632] – RowData’s row kind do not assigned from input row data when sink code generate and physical type info is pojo type
  • [FLINK-18639] – Error messages from BashJavaUtils are eaten
  • [FLINK-18641] – “Failure to finalize checkpoint” error in MasterTriggerRestoreHook
  • [FLINK-18646] – Managed memory released check can block RPC thread
  • [FLINK-18650] – The description of dispatcher in Flink Architecture document is not accurate
  • [FLINK-18655] – Set failOnUnableToExtractRepoInfo to false for git-commit-id-plugin in module flink-runtime
  • [FLINK-18656] – Start Delay metric is always zero for unaligned checkpoints
  • [FLINK-18659] – FileNotFoundException when writing Hive orc tables
  • [FLINK-18663] – RestServerEndpoint may prevent server shutdown
  • [FLINK-18665] – Filesystem connector should use TableSchema exclude computed columns
  • [FLINK-18672] – Fix Scala code examples for UDF type inference annotations
  • [FLINK-18677] – ZooKeeperLeaderRetrievalService does not invalidate leader in case of SUSPENDED connection
  • [FLINK-18682] – Vector orc reader cannot read Hive 2.0.0 table
  • [FLINK-18697] – Adding flink-table-api-java-bridge_2.11 to a Flink job kills the IDE logging
  • [FLINK-18700] – Debezium-json format throws Exception when PG table’s IDENTITY config is not FULL
  • [FLINK-18705] – Debezium-JSON throws NPE when tombstone message is received
  • [FLINK-18708] – The links of the connector sql jar of Kafka 0.10 and 0.11 are extinct
  • [FLINK-18710] – ResourceProfileInfo is not serializable
  • [FLINK-18748] – Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints
  • [FLINK-18749] – Correct dependencies in Kubernetes pom
  • [FLINK-18750] – SqlValidatorException thrown when select from a view which contains a UDTF call
  • [FLINK-18769] – MiniBatch doesn’t work with FLIP-95 source
  • [FLINK-18821] – Netty client retry mechanism may cause PartitionRequestClientFactory#createPartitionRequestClient to wait infinitely
  • [FLINK-18832] – BoundedBlockingSubpartition does not work with StreamTask
  • [FLINK-18856] – CheckpointCoordinator ignores checkpointing.min-pause
  • [FLINK-18859] – ExecutionGraphNotEnoughResourceTest.testRestartWithSlotSharingAndNotEnoughResources failed with “Condition was not met in given timeout.”
  • [FLINK-18862] – Fix LISTAGG throws BinaryRawValueData cannot be cast to StringData exception in runtime
  • [FLINK-18867] – Generic table stored in Hive catalog is incompatible between 1.10 and 1.11
  • [FLINK-18900] – HiveCatalog should error out when listing partitions with an invalid spec
  • [FLINK-18902] – Cannot serve results of asynchronous REST operations in per-job mode
  • [FLINK-18941] – There are some typos in “Set up JobManager Memory”
  • [FLINK-18942] – HiveTableSink shouldn’t try to create BulkWriter factory when using MR writer
  • [FLINK-18956] – StreamTask.invoke should catch Throwable instead of Exception
  • [FLINK-18959] – Fail to archiveExecutionGraph because job is not finished when dispatcher close
  • [FLINK-18992] – Table API renameColumns method annotation error
  • [FLINK-18993] – Invoke sanityCheckTotalFlinkMemory method incorrectly in JobManagerFlinkMemoryUtils.java
  • [FLINK-18994] – There is one typo in “Set up TaskManager Memory”
  • [FLINK-19040] – SourceOperator is not closing SourceReader
  • [FLINK-19061] – HiveCatalog fails to get partition column stats if partition value contains special characters
  • [FLINK-19094] – Revise the description of watermark strategy in Flink Table document
  • [FLINK-19108] – Stop expanding the identifiers with scope aliased by the system with ‘EXPR$’ prefix
  • [FLINK-19109] – Split Reader eats chained periodic watermarks
  • [FLINK-19121] – Avoid accessing HDFS frequently in HiveBulkWriterFactory
  • [FLINK-19133] – User provided kafka partitioners are not initialized correctly
  • [FLINK-19148] – Table crashed in Flink Table API & SQL Docs
  • [FLINK-19166] – StreamingFileWriter should register Listener before the initialization of buckets

Improvement

  • [FLINK-16619] – Misleading SlotManagerImpl logging for slot reports of unknown task manager
  • [FLINK-17075] – Add task status reconciliation between TM and JM
  • [FLINK-17285] – Translate “Python Table API” page into Chinese
  • [FLINK-17503] – Make memory configuration logging more user-friendly
  • [FLINK-18598] – Add instructions for asynchronous execute in PyFlink doc
  • [FLINK-18618] – Docker e2e tests are failing on CI
  • [FLINK-18619] – Update training to use WatermarkStrategy
  • [FLINK-18635] – Typo in ‘concepts/timely stream processing’ part of the website
  • [FLINK-18643] – Migrate Jenkins jobs to ci-builds.apache.org
  • [FLINK-18644] – Remove obsolete doc for hive connector
  • [FLINK-18730] – Remove Beta tag from SQL Client docs
  • [FLINK-18772] – Hide submit job web ui elements when running in per-job/application mode
  • [FLINK-18793] – Fix Typo for api.common.eventtime.WatermarkStrategy Description
  • [FLINK-18797] – docs and examples use deprecated forms of keyBy
  • [FLINK-18816] – Correct API usage in Pyflink Dependency Management page
  • [FLINK-18831] – Improve the Python documentation about the operations in Table
  • [FLINK-18839] – Add documentation about how to use catalog in Python Table API
  • [FLINK-18847] – Add documentation about data types in Python Table API
  • [FLINK-18849] – Improve the code tabs of the Flink documents
  • [FLINK-18881] – Modify the Access Broken Link
  • [FLINK-19055] – MemoryManagerSharedResourcesTest contains three tests running extraordinary long
  • [FLINK-19105] – Table API Sample Code Error

Task

  • [FLINK-18666] – Update japicmp configuration for 1.11.1
  • [FLINK-18667] – Data Types documentation misunderstand users
  • [FLINK-18678] – Hive connector fails to create vector orc reader if user specifies incorrect hive version

Download