Impala in Action: Querying and mining big data

by Richard L Saltzer and Istvan Szegedi

DESCRIPTION

Hadoop queries in Pig or Hive can be too slow for real-time data analysis. Impala, an ultra-speedy query engine from Cloudera, supercharges Hadoop by avoiding the typical MapReduce overhead and parallelizing queries so that they run across multiple nodes. This is a big deal for big data, because with Impala, querying Hadoop takes seconds rather than minutes. Impala's dialect is close to standard SQL, and Impala seamlessly accesses HBase and HDFS (Hadoop Distributed File System), allowing considerable freedom in the choice of data formats.

Impala in Action is a hands-on guide to querying Hadoop using Impala. It starts by comparing Impala to traditional databases and database services on Hadoop. Then it explains Impala's SQL dialect and the basics of data access. Next, it tackles data visualization tasks and provides techniques for securing Impala with Apache Sentry. The book also shows how to embed Impala queries in a Java client and how to connect JDBC and ODBC clients. Advanced readers will appreciate the deep dive into Impala's architecture and the practical insights into the issues that complicated configurations and complex queries can cause.

RETAIL SELLING POINTS

Design an accessible, state-of-the-art analytical platform

Dramatically improve the way data is analyzed

Learn how to truly make data-driven decisions

AUDIENCE

No prior experience with Impala required. Knowledge of SQL and Hadoop basics is expected.

ABOUT THE TECHNOLOGY

A rapidly growing technology, Impala is an open source, scalable, distributed SQL query engine capable of analyzing terabytes to petabytes of data. Impala's core architecture enables it to scale linearly across hundreds to thousands of commodity machines.

ISBN10: 1617291986
ISBN13: 9781617291982
Publish Date: 28 June 2015
Publish Status: Out of Print
Out of Print: 5 March 2021
Publish Country: GB
Imprint: Pearson Education Limited

Format: Paperback
Pages: 250
Language: English