HPCC Systems is an open-source big data platform developed by LexisNexis Risk Solutions; its source code repository is available on GitHub. HPCC stands for High-Performance Computing Cluster. It uses a shared-nothing architecture that can process data in near real-time, with its own declarative programming language, ECL, and two cluster types: Thor for batch data refinement and Roxie for low-latency query delivery. It is an alternative to Hadoop and Spark rather than a layer on top of them, so it does not depend on MapReduce.
Why HPCC Systems?
- HPCC is built with a shared-nothing architecture: each node works independently and can devote all of its own resources to processing
- The architecture has no single point of failure
- It is released under the Apache 2.0 license, and a Spark connector is available for exchanging data with Spark clusters
- It is designed to run with minimal system administration, making big data practical for businesses with only a limited IT support team
- HPCC can be used on-premise or in a public cloud
- HPCC can handle both structured and unstructured data
- HPCC is very fast for big data processing jobs
- HPCC scales out by adding nodes to handle more data, typically with less expansion and reconfiguration effort than other big data systems
- HPCC is easy to install on common Linux distributions
- The performance of HPCC increases with the use of more nodes in a cluster
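Work on an HPCC cluster is expressed in ECL, the platform's declarative, data-centric language; the cluster handles distributing the computation across nodes. A minimal sketch of an ECL job is shown below (the record layout, inline dataset, and field names are illustrative, not from any real system):

```ecl
// Define the record layout for the data
PersonRec := RECORD
    STRING20  name;
    UNSIGNED1 age;
END;

// A small inline dataset; in practice this would be a
// distributed file on the cluster
people := DATASET([{'Ann', 34}, {'Bob', 17}, {'Cho', 52}], PersonRec);

// Declarative filter: records where age is at least 18
adults := people(age >= 18);

// Emit the result as a workunit output
OUTPUT(adults);
```

Submitted to a Thor cluster, a job like this is compiled and executed in parallel across the nodes; the programmer declares what the result should be rather than how to partition the work.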