Experience in Google Season of Docs 2019 with Apache Airflow
It turns out it was the inaugural phase. I read the details, and the process felt a lot like Google Summer of Code (GSoC), except that this was about documentation. About Me: I have been writing tech articles on Medium as well as on my blog for the…
Here’s how Flink stores your State data
One of the significant features of Apache Flink is its ability to do stateful processing. The API to store and retrieve state data is simple, which makes it a joy to use. However, behind that API lies a system to manage your data while providing persistence guarantees…
Here’s How You Can Go Beyond HTTP 1.1
Most business applications in use today communicate with each other over HTTP 1.1 using REST. This has become an industry standard thanks to its sheer simplicity and the minimal effort required to integrate applications. However, once your application goes to the…
How to Run Apache Flink Effectively on YARN?
Apache Flink is a framework for writing distributed real-time data processing applications in Java, Scala, or Python. Uber, Netflix, Disney, and other major companies use Flink for a variety of purposes. Alibaba recently acquired data Artisans, the company founded by Flink’s original creators, in a multi-million dollar deal. Flink is inspired by Google’s Dataflow model. According…
Developers, the Solution to Bootstrap Your Flink Jobs Is Finally Here
Apache Flink is one of the most versatile open-source data streaming solutions available. It supports all the primary functions of a typical batch processing system, such as SQL, connectors to Hive, Group By, etc., while providing fault tolerance and exactly-once semantics. Hence, you can create a multitude…
Why Are Tesla and Google Designing Their Own Processors?
Recently, Elon Musk demonstrated Tesla’s Full Self-Driving Computer (FSD), which is designed to process 2100 frames per second. The computer is designed to process video and audio through a neural net and output the commands that the vehicle should abide by. The processor validates the instructions with…
What Makes Apache Flink Scale?
Apache Flink is a popular real-time data processing framework. It’s gaining more and more popularity thanks to its low-latency processing at extremely high throughputs in a fault-tolerant manner. I have been using Apache Flink in production for the last three years, and every time it has managed…
Realtime Data in Apache Druid — Choosing the Right Strategy
Storing data from real-time data streams has always been a challenge. The solution depends on your use cases. If you want to store data for daily or monthly analytics, you can use a distributed file system and run Hive or Presto on top of it. If you’re…
A Glimpse into the World of Embedded Database Feat. RocksDB
RocksDB is a database created by Facebook. It does not support SQL, doesn’t provide ACID guarantees, and can’t run in a distributed fashion. Still, it’s one of the most popular databases in the developer ecosystem. It is used in high-scale frameworks such as Apache Flink and CockroachDB…
A Better Guide to Building Apache Superset From Source
In this article, we’ll take a deep dive into how to build Apache Superset from source. The official documentation is too complicated for a new contributor, hence my attempt to simplify it. First, you’ll need the following installed on your system: Python 3.6 or 3.7, NodeJS, NPM…
Software Developer | Technical Writer | Lives in Bangalore, India
Data from Goodreads
Homo Deus: A History of Tomorrow
Yuval Noah Harari, 13% (1 year ago)
Thinking, Fast and Slow
Loonshots: How to Nurture the Crazy Ideas That Win Wars, Cure Diseases, and Transform Industries
Stress Test: Reflections on Financial Crises
Timothy F. Geithner