Schema Evolution
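Avro supports schema evolution: readers and writers can use different versions of a schema, so you can add fields (with default values), remove fields, and widen certain types (for example, int to long) without rewriting existing data. This flexibility is advantageous in environments where data structures frequently change.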
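As a rough illustration of how this kind of evolution can work, the Python sketch below resolves a record written with an older schema against a newer reader schema. It is a simplified stand-in that operates on plain dicts, not the Avro library's API: the `resolve` function, the `PROMOTIONS` table, and the `USER_V1`/`USER_V2` schemas are hypothetical names introduced for this example, and only flat records, default values, and numeric widening are handled.

```python
# Simplified sketch of Avro-style schema resolution on plain dicts.
# Assumption: flat records only; real Avro also handles unions, enums,
# nested records, aliases, and string/bytes promotion.

# Numeric widenings a reader may apply to a writer's type.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}

def resolve(record, writer_schema, reader_schema):
    """Return `record` as seen through `reader_schema`."""
    writer_types = {f["name"]: f["type"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name, rtype = field["name"], field["type"]
        if name in writer_types:
            wtype = writer_types[name]
            if wtype != rtype and rtype not in PROMOTIONS.get(wtype, set()):
                raise TypeError(f"cannot read {wtype} as {rtype} for field {name!r}")
            value = record[name]
            if rtype in ("float", "double"):
                value = float(value)
            result[name] = value
        elif "default" in field:
            # Field added after the data was written: use the reader's default.
            result[name] = field["default"]
        else:
            raise ValueError(f"field {name!r} is missing and has no default")
    # Fields known only to the writer are silently dropped.
    return result

# Hypothetical schemas: V2 widens `id`, drops `legacy_flag`, adds `email`.
USER_V1 = {"type": "record", "name": "User", "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "legacy_flag", "type": "boolean"},
]}
USER_V2 = {"type": "record", "name": "User", "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string", "default": ""},
]}

old_record = {"id": 7, "name": "Ada", "legacy_flag": True}
upgraded = resolve(old_record, USER_V1, USER_V2)
print(upgraded)  # {'id': 7, 'name': 'Ada', 'email': ''}
```

A new field without a default cannot be filled in for old data, which is why the sketch raises an error in that case rather than guessing a value.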
Compact Binary Format
Avro uses a compact binary format for data serialization, leading to efficient storage and faster data transmission compared to text-based formats like JSON or XML.
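To give a feel for where the size savings come from, here is a minimal Python sketch of how the Avro binary encoding represents a long (zig-zag followed by a variable-length base-128 integer) and a string (length prefix plus UTF-8 bytes). The helper names `encode_long`, `encode_string`, and `encode_user` are assumptions made for this example, and the two-field record layout is hypothetical; field names are not written to the output because the schema already defines them.

```python
import json

def encode_long(n: int) -> bytes:
    """Encode a signed 64-bit integer: zig-zag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)          # small magnitudes map to small codes
    out = bytearray()
    while z & ~0x7F:                  # more than 7 bits left
        out.append((z & 0x7F) | 0x80) # low 7 bits plus continuation flag
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_string(s: str) -> bytes:
    """Encode a string as a long byte count followed by UTF-8 data."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data

def encode_user(user: dict) -> bytes:
    """Encode a hypothetical record {id: long, name: string} in field order."""
    return encode_long(user["id"]) + encode_string(user["name"])

user = {"id": 42, "name": "Ada"}
avro_bytes = encode_user(user)
json_bytes = json.dumps(user).encode("utf-8")
print(len(avro_bytes), "bytes in binary form vs", len(json_bytes), "bytes as JSON")
```

The binary form here is 5 bytes: one for the id, one for the string length, and three for the name. The JSON text is longer mainly because it repeats the field names and punctuation in every record.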
Language Agnostic
Avro is designed to be language agnostic, with support for multiple programming languages, including Java, Python, C++, and more. This makes it easier to integrate with various systems.
No Code Generation Required
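Unlike serialization frameworks such as Protocol Buffers and Thrift, which typically rely on code generated from the schema, Avro makes code generation optional: data can be read and written using only the schema at run time, which simplifies the development process.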
Self Describing
Each Avro data file stores the schema it was written with in its header, making the data self-describing. This helps maintain consistency between data producers and consumers.
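The following Python sketch shows, under stated assumptions, what "self-describing" can look like at the byte level. It writes and re-reads only the header of an Avro object container file as laid out in the Avro specification: the magic bytes `Obj\x01`, a metadata map holding `avro.schema` and `avro.codec`, and a 16-byte sync marker. The function names (`write_header`, `read_schema`, and the long helpers) are hypothetical, data blocks are omitted, and a real application would normally use an Avro library instead.

```python
import io
import json
import os

MAGIC = b"Obj\x01"

def write_long(buf, n: int) -> None:
    """Write a zig-zag, base-128 varint encoded long."""
    z = (n << 1) ^ (n >> 63)
    while z & ~0x7F:
        buf.write(bytes([(z & 0x7F) | 0x80]))
        z >>= 7
    buf.write(bytes([z]))

def read_long(buf) -> int:
    """Read a zig-zag, base-128 varint encoded long."""
    shift, z = 0, 0
    while True:
        b = buf.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)

def write_header(buf, schema: dict, codec: str = "null") -> bytes:
    """Write magic, metadata map, and sync marker; return the marker."""
    meta = {
        "avro.schema": json.dumps(schema).encode("utf-8"),
        "avro.codec": codec.encode("utf-8"),
    }
    buf.write(MAGIC)
    write_long(buf, len(meta))            # one map block with all entries
    for key, value in meta.items():
        raw_key = key.encode("utf-8")
        write_long(buf, len(raw_key))
        buf.write(raw_key)
        write_long(buf, len(value))
        buf.write(value)
    write_long(buf, 0)                    # zero count ends the map
    sync = os.urandom(16)
    buf.write(sync)
    return sync

def read_schema(buf) -> dict:
    """Recover the writer's schema from a container file header."""
    if buf.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(buf)
        if count == 0:
            break
        if count < 0:                     # negative count: byte size follows
            count = -count
            read_long(buf)
        for _ in range(count):
            key = buf.read(read_long(buf)).decode("utf-8")
            meta[key] = buf.read(read_long(buf))
    buf.read(16)                          # skip the sync marker
    return json.loads(meta["avro.schema"])

# Hypothetical schema used only for this demonstration.
schema = {"type": "record", "name": "User",
          "fields": [{"name": "id", "type": "long"}]}
stream = io.BytesIO()
write_header(stream, schema)
stream.seek(0)
recovered = read_schema(stream)
print(recovered == schema)  # True
```

Because the schema travels with the file, a consumer that has never seen the producer's code can still decode the records that follow the header.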
A schema.json converter for easier ingestion (likely supporting Avro and Protobuf). - Source: dev.to / about 2 months ago
Security Aware Data Metadata: Data schema formats such as Avro and JSON currently lack built-in support for data sensitivity or security-aware metadata. Additionally, common formats like Parquet and Iceberg, while efficient for storing large datasets, don’t natively include security-aware metadata. At Jarrid, we are exploring various metadata formats to incorporate data sensitivity and security-aware attributes... - Source: dev.to / 7 months ago
Apache AVRO [1] is one but it has been largely replaced by Parquet [2] which is a hybrid row/columnar format [1] https://avro.apache.org/. - Source: Hacker News / over 1 year ago
The most common format for describing schema in this scenario is Apache Avro. - Source: dev.to / over 1 year ago
Other serialization alternatives have a schema validation option: e.g., Avro, Kryo and Protocol Buffers. Interestingly enough, gRPC uses Protobuf to offer RPC across distributed components. - Source: dev.to / about 2 years ago
Apache Avro is a data serialization system, for more information visit Apache Avro. - Source: dev.to / over 2 years ago
Once things like JSON became more popular, Apache Avro appeared. You can define Avro files which can then be generated into Python, Java, C, Ruby, etc. classes. Source: over 2 years ago
Avro, a data serialization system based on JSON schemas. - Source: dev.to / over 2 years ago
Supporting multiple versions of an event schema is a solved problem. Apache Avro with a published schema hash in a message header is one solution. https://avro.apache.org/. - Source: Hacker News / over 2 years ago
If a binary format is OK, use Protocol Buffers or Avro. Note that in the case of binary formats, you need a schema to serialize/de-serialize your data. Therefore, you'd probably want a schema registry to store all past and present schemas for later usage. Source: almost 3 years ago
Do you have time to talk about our lord and saviour Apache Avro. Source: about 3 years ago
When serializing a value, we convert it to a different sequence of bytes. This sequence is often a human-readable string (all the bytes can be read and interpreted by humans as text), but not necessarily. The serialized format can be binary. Binary data (example: an image) is still bytes, but makes use of non-text characters, so it looks like gibberish in a text editor. Binary formats won't make sense unless... - Source: dev.to / over 3 years ago
Scott: It's like a very large row of Avro data that had everything you could possibly ever need. It was like 115 columns. Most things were null, and it became every data type you'd ever want. It's like, is it mobile? Look for mobile_. It's like, this is really crappy. I didn't know about, I guess, the hardships of data engineering at that point. Because this was the first time where I was like, okay, you're on the... - Source: dev.to / over 3 years ago
Hudi is designed around the notion of base file and delta log files that store updates/deltas to a given base file (called a file slice). Their formats are pluggable, with Parquet (columnar access) and HFile (indexed access) being the supported base file formats today. The delta logs encode data in Avro (row oriented) format for speedier logging (just like Kafka topics for e.g). Going forward, we plan to inline... - Source: dev.to / almost 4 years ago
This is an informative page about Apache Avro. You can review and discuss the product here. The primary details have not been verified within the last quarter, and they might be outdated. If you think we are missing something, please use the means on this page to comment or suggest changes. All reviews and comments are highly encouraged and appreciated as they help everyone in the community to make an informed choice. Please always be kind and objective when evaluating a product and sharing your opinion.