October 24, 2020

Apache Storm 2.2.0 released: free and open-source distributed realtime computation system

Apache Storm is a free and open-source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use!

Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable and fault-tolerant, guarantees that your data will be processed, and is easy to set up and operate.

Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation however needed.
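The idea of repartitioning a stream between stages can be illustrated with a small, library-free sketch. This is not Storm's API: the function names (`sentence_source`, `split_stage`, `partition_by_field`, `count_stage`, `run_topology`) and the sample sentences are hypothetical, and the whole pipeline runs in one process. It only shows the principle behind what Storm calls a fields grouping, assuming a stable hash decides which downstream task receives each tuple.

```python
from collections import Counter, defaultdict
from zlib import crc32


def sentence_source():
    """Stand-in for a spout: a finite sample of what would be an unbounded stream."""
    yield from ["the cow jumped", "the moon", "cow over the moon"]


def split_stage(sentences):
    """First processing stage: split each sentence into individual words."""
    for sentence in sentences:
        for word in sentence.split():
            yield word


def partition_by_field(words, num_tasks):
    """Repartition the stream: a stable hash of the word picks the task,
    so equal words always land on the same downstream task."""
    buckets = defaultdict(list)
    for word in words:
        task = crc32(word.encode("utf-8")) % num_tasks
        buckets[task].append(word)
    return buckets


def count_stage(buckets):
    """Second processing stage: each task counts only the words routed to it."""
    return {task: Counter(words) for task, words in buckets.items()}


def run_topology(num_tasks=3):
    """Wire the stages together and also merge the per-task counts."""
    words = split_stage(sentence_source())
    per_task = count_stage(partition_by_field(words, num_tasks))
    totals = Counter()
    for counts in per_task.values():
        totals.update(counts)
    return per_task, totals


if __name__ == "__main__":
    per_task, totals = run_topology()
    for task, counts in sorted(per_task.items()):
        print(task, dict(counts))
    print(dict(totals))
```

Because the routing key is the word itself, each task holds the complete count for the words it owns, which is why no cross-task coordination is needed in the counting stage. In a real topology the stages would run as separate distributed components rather than chained generators.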

Changelog v2.2

New Feature

  • [STORM-1293] – port backtype.storm.messaging.netty-integration-test to java
  • [STORM-1304] – port backtype.storm.submitter-test to java
  • [STORM-3259] – NUMA support for Storm
  • [STORM-3479] – HB timeout configurable on a topology level
  • [STORM-3480] – Implement One Executor Per Worker RAS Option
  • [STORM-3482] – Implement One Worker Per Component Option
  • [STORM-3492] – Adding configuration for blacklisting scheduler behavior
  • [STORM-3585] – Change ConstraintSolverStrategy to allow max co-Location Count for spreading components
  • [STORM-3627] – Allow use of shortNames for Metrics for worker in Metrics-V2
  • [STORM-3636] – Enable SSL credentials auto reload for storm UI, LogViewer and DRPC server

Improvement

  • [STORM-2749] – Remove state spout since it’s never supported by storm
  • [STORM-3066] – Storm Flux variable substitution
  • [STORM-3071] – change checkstyle plugin setting logViolationsToConsole to true
  • [STORM-3257] – ‘storm kill’ command line should be able to continue on error
  • [STORM-3434] – server: fix all checkstyle warnings
  • [STORM-3484] – Add Blacklisted Supervisors Info To UI
  • [STORM-3490] – Add checkstyle rule RedundantModifier
  • [STORM-3493] – Allow overriding python interpreter by environment variable
  • [STORM-3494] – Use UserGroupInformation to login to HDFS only once per process
  • [STORM-3507] – Need feedback from blacklisting to scheduling
  • [STORM-3509] – Improved RAS scheduling
  • [STORM-3529] – Catch and log RetriableException in KafkaOffsetMetric
  • [STORM-3530] – Improve Scheduling Failure Message
  • [STORM-3534] – Add generic resources to UI
  • [STORM-3536] – Add Generic-resources.md
  • [STORM-3538] – Add Meter for sendSupervisorAssignments exception
  • [STORM-3539] – Add metric for worker start time out
  • [STORM-3541] – allow reporting of v2 metrics api using metrics tick
  • [STORM-3543] – Avoid iterators for task hook info objects
  • [STORM-3545] – blob update spews errors until cleanup occurs after topology killed
  • [STORM-3548] – Remove iterator from Task.sendUnanchored
  • [STORM-3555] – Add meter for tracking errors killing workers
  • [STORM-3557] – allow health checks to pass on timeout
  • [STORM-3570] – add config name when validation fails with ClassNotFoundException
  • [STORM-3571] – Add topology info to slot warning messages
  • [STORM-3575] – Fix Scheduler Status on failure after multiple attempts
  • [STORM-3581] – Change log level to info to show the config classes being used for validation
  • [STORM-3584] – Support getting version info from a wildcard classpath entry
  • [STORM-3587] – Allow Scheduler futureTask to gracefully exit and register message on timeout
  • [STORM-3588] – RAS scheduler should not pre-empt and evict topologies due to generic resource
  • [STORM-3589] – Iterator in BaseResourceStrategy is potentially buggy
  • [STORM-3591] – Improve GRAS Strategy Log
  • [STORM-3594] – Add checkstyle rule WhitespaceAfter
  • [STORM-3596] – Feed send assignment status into blacklist scheduler
  • [STORM-3600] – ResourceAwareScheduler taking too long to schedule
  • [STORM-3604] – HealthChecker should print out error message when it fails
  • [STORM-3605] – add meter to track scheduling timeouts
  • [STORM-3614] – update SystemBolt metrics to use v2 API
  • [STORM-3616] – If running upload credentials and no autocreds are found, we should have an option to fail
  • [STORM-3618] – add meter for tracking internal scheduling errors
  • [STORM-3619] – Add null check for the topology name
  • [STORM-3625] – Storm CLI should validate topology name on client side
  • [STORM-3632] – Reduce SimpleSaslServerCallbackHandler supervisor logging
  • [STORM-3633] – Add message that supervisor is killing detached workers
  • [STORM-3634] – validate numa ports are contained in supervisor.slots.ports
  • [STORM-3640] – timed out health check processes should be killed
