Redpanda Highlights

What's Redpanda?

Redpanda, a Kafka® replacement for mission critical systems:

  • 10X Faster
  • Kafka® API Compatible - no code changes
  • Easy to use - No Zookeeper®, No JVM - built in C++. One binary.

Redpanda is a modern streaming data platform. It is essentially a drop-in replacement for Apache Kafka that offers greater speed with lower infrastructure cost and less operational complexity.

Let's break that down!

Streaming for all

Historically, most open-source streaming technologies have been built on the JVM. Hadoop, Spark, and Flink are a few examples: they are reliable and scalable, but they often require JVM expertise to maintain and operate. For example, you had to be a seasoned Java developer to patch such a system, and operators had to learn how to tune JVM parameters for optimal performance. This has kept non-JVM developers away from streaming systems for a long time. Redpanda is written in C++ and needs no JVM, so its users need no JVM knowledge or expertise.

Lower Operational Complexity

Redpanda implements the services offered by ZooKeeper, the schema registry, and the broker as a single Redpanda service. This lowers operational complexity because there are fewer services to manage. Managed services can take on that complexity for you, but they are costly, and not every organization is in a position to use them.

Redpanda also comes with out-of-the-box auto-tuning and automatic leader and partition balancing, which reduces management overhead even further. Optimal settings for the specific hardware, kernel, and Redpanda setup are generated automatically, so developers can worry less about optimization and focus on writing the applications they want.

Redpanda does away with ZooKeeper by using the open-source Raft consensus algorithm. Kafka is also working to remove its ZooKeeper dependency. See this.
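To give a feel for how a Raft-style quorum lets a set of nodes agree without an external coordinator, here is a minimal sketch of the majority rule a leader can use to decide which log entries are safely replicated. It is an illustration only, not Redpanda's implementation: the function names are made up for this example, and real Raft additionally requires that the entry belong to the leader's current term, which is left out here.

```python
from typing import Sequence


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a quorum."""
    return cluster_size // 2 + 1


def commit_index(match_index: Sequence[int]) -> int:
    """Highest log index that is stored on a majority of nodes.

    match_index holds, for every node including the leader, the last log
    index known to be replicated on that node. The current-term check from
    the Raft paper is deliberately omitted in this sketch.
    """
    ranked = sorted(match_index, reverse=True)
    # The value at position (quorum - 1) is present on at least `quorum` nodes.
    return ranked[majority(len(ranked)) - 1]


# Five-node cluster: leader at index 7, followers at 7, 5, 4 and 2.
print(commit_index([7, 7, 5, 4, 2]))  # -> 5
```

In this example index 5 is held by three of the five nodes, so everything up to it survives the loss of any two nodes.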

Kafka API compatible

Despite all its improvements under the hood, Redpanda maintains full compatibility with the Kafka API that people have come to love. This means that existing Kafka-based systems can be switched over to Redpanda with zero changes to application code, making for an easy and straightforward migration path.
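As a rough illustration of what such a migration could look like on the client side, the sketch below builds the same client configuration for both clusters and shows that only the bootstrap address differs. The host names, the `client_config` helper, and the chosen settings are assumptions for this example; the keys follow the common Kafka client property naming.

```python
# Hypothetical client settings; the host names below are placeholders.
BASE_CONFIG = {
    "client.id": "orders-service",
    "acks": "all",
    "compression.type": "lz4",
}


def client_config(bootstrap_servers: str) -> dict:
    """Return a Kafka client config that targets the given cluster."""
    return {**BASE_CONFIG, "bootstrap.servers": bootstrap_servers}


kafka_cfg = client_config("kafka-1.example.internal:9092")
redpanda_cfg = client_config("redpanda-1.example.internal:9092")

# Keys whose values differ between the two configurations.
changed = {key for key in kafka_cfg if kafka_cfg[key] != redpanda_cfg[key]}
print(changed)  # -> {'bootstrap.servers'}
```

The resulting dictionary could then be handed to whichever Kafka client library the application already uses.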

Thread-per-core Architecture

Written in C++, Redpanda’s thread-per-core architecture is optimized for modern hardware and is designed to fully exploit the resources it runs on. Because Redpanda can fully saturate the underlying device, it operates with much more stable tail latencies. For architects, this translates into predictable application performance and fewer unexpected latency spikes.

Owing to this architectural approach, Redpanda has demonstrated 10x or better performance over Kafka in benchmarks on the same hardware.

Read more about it here.
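The thread-per-core idea can be sketched in a few lines: each worker exclusively owns a slice of the partitions and receives work through its own queue, so the hot path needs no locks. This is a toy model under stated assumptions, not Redpanda's code: `Shard`, `shard_for`, and `NUM_CORES` are made up for the illustration, and Python threads only demonstrate the ownership model, not real core pinning.

```python
import queue
import threading

NUM_CORES = 4  # assumed core count for the illustration


class Shard:
    """One worker thread that exclusively owns the partitions mapped to it."""

    def __init__(self, shard_id: int) -> None:
        self.shard_id = shard_id
        self.inbox = queue.Queue()
        self.logs = {}  # partition -> list of records, written only by this shard
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self) -> None:
        while True:
            item = self.inbox.get()
            if item is None:  # shutdown signal
                break
            partition, record = item
            # No lock: while running, only this thread touches self.logs.
            self.logs.setdefault(partition, []).append(record)

    def stop(self) -> None:
        self.inbox.put(None)
        self.thread.join()


def shard_for(partition: int) -> int:
    """Route a partition to its owning shard (simple modulo placement)."""
    return partition % NUM_CORES


shards = [Shard(i) for i in range(NUM_CORES)]
for partition in range(8):
    shards[shard_for(partition)].inbox.put((partition, f"record-{partition}".encode()))
for shard in shards:
    shard.stop()

# Read the logs only after every worker has been joined.
owned = {s.shard_id: sorted(s.logs) for s in shards}
print(owned)  # -> {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
```

Because requests for a given partition always land on the same worker, the workers never contend for shared state, which is the property that keeps latency predictable in this kind of design.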

Inline Transformations

Traditionally, even simple stateless data transformations have required data to be consumed from Kafka and transformed by a separate process, such as Spark Streaming, Kafka Streams, or Apache Flink, before being pushed back into Kafka or into some other downstream system like Elasticsearch. The problem is that your data keeps ping-ponging around your network, even for simple things. Redpanda overcomes this by pushing computation to where the data is.

Redpanda is implementing this capability using WebAssembly (WASM), which would let developers write and edit code in their favorite programming language, compile it to WebAssembly on their own machine, and then ship it to Redpanda. As data comes in, it is transformed inline.

Read more about it here.
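The difference from the consume-transform-produce round trip can be shown with a toy in-memory broker that runs a registered function at the moment a record is written. Everything here is hypothetical: `ToyBroker`, `deploy`, and `mask_email` are names invented for the sketch and are not Redpanda's API, and a plain Python function stands in for the compiled WebAssembly module.

```python
from collections import defaultdict
from typing import Callable, Optional

Transform = Callable[[bytes], Optional[bytes]]


class ToyBroker:
    """In-memory stand-in for a broker; not Redpanda's API."""

    def __init__(self) -> None:
        self.topics = defaultdict(list)
        self.transforms = defaultdict(list)  # source topic -> [(target topic, fn)]

    def deploy(self, source: str, target: str, fn: Transform) -> None:
        """Register a stateless transform that runs where the data lands."""
        self.transforms[source].append((target, fn))

    def produce(self, topic: str, record: bytes) -> None:
        self.topics[topic].append(record)
        # Transform inline: no consumer pulls the record out and pushes it back.
        for target, fn in self.transforms[topic]:
            out = fn(record)
            if out is not None:  # a transform may drop a record by returning None
                self.topics[target].append(out)


def mask_email(record: bytes) -> bytes:
    """Hypothetical transform: hide the local part of an email address."""
    _, _, domain = record.partition(b"@")
    return b"***@" + domain


broker = ToyBroker()
broker.deploy("signups", "signups-masked", mask_email)
broker.produce("signups", b"alice@example.com")
print(broker.topics["signups-masked"])  # -> [b'***@example.com']
```

The record never leaves the broker process to be transformed, which is the network round trip the inline approach is meant to avoid.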

How does it compare to Pulsar?
Redpanda is essentially a rewrite of the Apache Kafka broker, protocol, and the complete producer/consumer model, and it aligns fully with the Kafka API. Consequently, it is not compatible with Pulsar's subscription scheme. However, Redpanda claims to remove Kafka's partition count limit, which constrains data modeling options, especially in multi-tenant environments. That is in line with, though not identical to, the idea and implementation of multi-tenancy in Pulsar.

Redpanda is also working on a tiered storage capability similar to Pulsar's, although not on top of BookKeeper.

Alex Gallego, founder of Redpanda, shared the following video when asked for a one-to-one comparison between Pulsar and Redpanda, although it touches on the topic only briefly.
https://youtu.be/Z3OUhGXTzGc?t=2076

P.S. The Redpanda team has offered a deeper one-to-one technical overview if we are interested.
