Apache Kudu 1.8.0 Released

Posted 26 Oct 2018 by Attila Bukor

The Apache Kudu team is happy to announce the release of Kudu 1.8.0!

The new release adds several new features and improvements, including the following:

Index Skip Scan Optimization in Kudu

Posted 26 Sep 2018 by Anupama Gupta

This summer I got the opportunity to intern with the Apache Kudu team at Cloudera. My project was to optimize the Kudu scan path by implementing a technique called index skip scan (a.k.a. scan-to-seek, see section 4.1 in [1]). I wanted to share my experience and the progress we’ve made so far on the approach.

Read full post...

Simplified Data Pipelines with Kudu

Posted 11 Sep 2018 by Mac Noland

I’ve been working with Hadoop for over seven years now and, fortunately or unfortunately, have run across a lot of structured data use cases. What we at phData have found is that end users are typically comfortable with tabular data and prefer to access their data in a structured manner using tables.

Read full post...

Getting Started with Kudu - an O'Reilly Title

Posted 06 Aug 2018 by Brock Noland

The following article by Brock Noland was reposted from the phData blog with their permission.

Five years ago, enabling Data Science and Advanced Analytics on the Hadoop platform was hard. Organizations required strong Software Engineering capabilities to successfully implement complex Lambda architectures or even simply implement continuous ingest. Updating or deleting data was simply a nightmare. Complying with the General Data Protection Regulation (GDPR) would have been an extreme challenge at that time.

Read full post...

Instrumentation in Apache Kudu

Posted 10 Jul 2018 by Todd Lipcon

Last week, the OpenTracing community invited me to their monthly Google Hangout meetup to give an informal talk on tracing and instrumentation in Apache Kudu.

While Kudu doesn’t currently support distributed tracing using OpenTracing, it does have quite a lot of other types of instrumentation, metrics, and diagnostics logging. The OpenTracing team was interested to hear about some of the approaches that Kudu has used, and so I gave a brief introduction to topics including:

Read full post...
