
Apache Avro


Apache Avro is a data serialization framework created under the Apache Hadoop project. It is used to exchange data between programs, including programs written in different languages. Avro stores data in a compact binary format and describes that data with a schema, so another program that has the schema can read and interpret the data. This makes Avro a fast and space-efficient way to exchange information. Avro also supports schema evolution: a schema can change over time as fields are added, removed, or changed, and data written with one version of the schema can still be read with another.
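As a rough illustration of the compact binary format, the sketch below hand-encodes one record following the binary encoding rules in the Avro specification: zig-zag variable-length integers, length-prefixed UTF-8 strings, and record fields concatenated in schema order. The `User` schema and all helper names are hypothetical, and only a few primitive types are covered. Real applications would use an Avro library rather than this code.

```python
import json

# Hypothetical schema for illustration; Avro schemas are written in JSON.
USER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age",  "type": "int"}
  ]
}
""")


def encode_long(n: int) -> bytes:
    """Zig-zag encode an int/long (assumed to fit in 64 bits), then emit a base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes map to small unsigned values
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit 7 bits with the continuation bit set
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """A string is its UTF-8 byte length (as a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Only the primitive types used by the example schema are handled here.
ENCODERS = {"string": encode_string, "int": encode_long, "long": encode_long}


def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate field encodings in schema order; field names are not written."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]])
        for field in schema["fields"]
    )


user = {"name": "Ada", "age": 36}
avro_bytes = encode_record(USER_SCHEMA, user)
json_bytes = json.dumps(user).encode("utf-8")

print("binary encoding:", len(avro_bytes), "bytes")
print("JSON text:      ", len(json_bytes), "bytes")
```

The binary form is smaller mainly because field names and types live in the schema rather than being repeated in every record.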

Why Avro?

  • Schemas are defined in JSON, which makes the data definition easy for different programs to read
  • Data is stored in a binary format that takes up less space, which improves efficiency and lowers the cost of storage
  • Provides application programming interfaces (APIs) for various languages such as Java, Python, and Ruby
  • Because it supports schema evolution, old data can be read using a new schema and new data can be read using an old schema, which reduces the risk of losing or misinterpreting data as long as the schema changes are compatible
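The schema evolution point can be sketched in a few lines. The code below is a simplified illustration of the idea behind Avro's schema resolution, not the algorithm an Avro library actually runs: it works on already-decoded dictionaries, ignores type promotion and aliases, and uses hypothetical schemas (`USER_V1`, `USER_V2`) and a hypothetical helper (`resolve`). It assumes the usual rules that a reader fills in a field missing from the writer's schema with its default, and ignores a writer field that the reader's schema does not have.

```python
# Hypothetical schema versions: V2 adds an "email" field with a default.
USER_V1 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
    ],
}

USER_V2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "email", "type": "string", "default": ""},
    ],
}


def resolve(record: dict, writer_schema: dict, reader_schema: dict) -> dict:
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            resolved[name] = record[name]        # present in both schemas
        elif "default" in field:
            resolved[name] = field["default"]    # new field: use the reader's default
        else:
            raise ValueError(
                f"reader field {name!r} has no default and is missing from the writer schema"
            )
    # Writer fields that the reader does not declare are simply dropped.
    return resolved


old_record = {"name": "Ada", "age": 36}
new_record = {"name": "Ada", "age": 36, "email": "ada@example.com"}

# Old data read with the new schema: the missing field takes its default.
upgraded = resolve(old_record, USER_V1, USER_V2)

# New data read with the old schema: the extra field is ignored.
downgraded = resolve(new_record, USER_V2, USER_V1)

print(upgraded)
print(downgraded)
```

The `ValueError` branch shows why evolution is only safe for compatible changes: a field added without a default cannot be filled in when reading older data.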