Data streams capture both spatial and temporal properties of real-world events. Such events can be generated by end users as they interact with social media, shop online, and search the web, or by machines, such as monitored computers, sensors producing measurement samples, and self-driving vehicles reporting their activity. Given all the potential sources of such event streams and the value in the data, it is highly desirable to offer efficient solutions to ingest, store, and process such streams. Technologies in this space find consumers in industries that range from social media to connected cars and the Internet of Things.

Pravega is a storage system designed and built from the ground up to store unbounded amounts of stream data permanently, and it is novel in a number of ways. The stream is a first-class abstraction in Pravega, and the API it exposes is inspired by existing messaging systems, which makes it familiar to developers. Pravega provides strong append semantics via transactional writes and by avoiding duplicates upon reconnection. As application workloads vary, Pravega accommodates the changes by scaling streams up and down. Stream processors read from and store data in Pravega through connectors. Connectors are under development, for example for Apache Flink and Apache Hadoop, and we expect many more to be implemented over time.
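
To make "avoiding duplicates upon reconnection" concrete, here is a minimal, purely illustrative sketch of one common way to get that property: the store remembers the last sequence number it accepted from each writer and ignores anything it has already seen. This is not the Pravega API. The `ToyStream` class, its `append` method, and the `writer_id` and `seq` parameters are hypothetical names invented for this example, and the technique shown is an assumption about how such semantics can be built, not a description of Pravega's internals.

```python
class ToyStream:
    """In-memory stand-in for a stream (hypothetical; NOT the Pravega API)."""

    def __init__(self):
        self.events = []
        # Highest sequence number accepted so far, per writer.
        self._last_seq = {}

    def append(self, writer_id, seq, payload):
        """Append payload unless this writer already delivered seq.

        Returns True if the event was stored, False if it was a duplicate.
        """
        if seq <= self._last_seq.get(writer_id, -1):
            return False  # retry of an event that already landed; ignore it
        self.events.append(payload)
        self._last_seq[writer_id] = seq
        return True


stream = ToyStream()
stream.append("writer-1", 0, "a")
stream.append("writer-1", 1, "b")

# The writer loses its connection, cannot tell whether "b" arrived,
# and resends it after reconnecting.
retried = stream.append("writer-1", 1, "b")

stream.append("writer-1", 2, "c")
```

In this toy model the resent event is dropped, so the stream holds each event exactly once even though the writer sent "b" twice.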

Pravega is an open-source project under active development. Building a strong and thriving community is a primary goal of the project.

Flavio Junqueira leads the Pravega team at Dell EMC. He holds a PhD in computer science from the University of California, San Diego and is interested in various aspects of distributed systems, including distributed algorithms, concurrency, and scalability. Previously, Flavio held a software engineer position with Confluent and research positions with Yahoo! Research and Microsoft Research. Flavio has contributed to several important open-source projects. Most of his current contributions are to the Pravega open-source project, and previously he started and contributed to Apache projects such as Apache ZooKeeper and Apache BookKeeper. Flavio coauthored the O’Reilly book ZooKeeper: Distributed Process Coordination.

Read more about Pravega at https://pravega.io
Join the {code} Community at https://thecodeteam.com/community
Reach out to Flavio at https://twitter.com/fpjunqueira

{code} Webinar - Pravega: Rethinking storage for streams

Source: Dell Blog