The pain of Avro schemas

Nick Lydon
Apr 8, 2022

Recently I’ve been working with Apache Avro and I must say, the development experience leaves a lot to be desired. It’s very easy to make a mistake whilst writing the JSON schema, and the avrogen tool will happily create the C# classes without giving any kind of warning. Producing a message will then work because the schema is valid for encoding, and the exceptions only surface on the consumer side.

Here are some of the issues I encountered:

  1. Setting logicalType at the same hierarchical level as the field. For example when creating a field in a record this is incorrect:
    {
      "name": "FieldName",
      "type": "int",
      "logicalType": "date"
    }
    The correct way is to nest the logical type information inside the type property:
    {
      "name": "FieldName",
      "type": {
        "type": "int",
        "logicalType": "date"
      }
    }
    This also applies to unions, e.g.
    {
      "name": "FieldName",
      "type": [
        "null",
        {
          "type": "int",
          "logicalType": "date"
        }
      ]
    }
  2. When specifying that a field is optional, two things must be done: the field must be declared as a union type with "null" as the first type, and a default value must also be specified. You may think that the default value is unnecessary because no exceptions are thrown when producing a message, but the issue manifests when consuming that message.
  3. For a default value of null, it must be the literal null and must not be the quoted string literal "null".
  4. This one is dotnet-specific and has to do with the types that the code generator produces. The decimal logical type is backed by an array of bytes and requires that the scale and precision are specified (similar to a relational database column). The generated code uses an AvroDecimal, which can be implicitly converted from a dotnet decimal. The issue is that the scale of the value needs to be exactly the same as the scale specified in the schema, otherwise an exception is thrown. This surprised me, as I would have expected it to truncate the digits until the value fit within the specified scale. To work around it I had to use an alternative constructor where I could turn the decimal into the significand and specify the scale explicitly:
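// Requires: using System.Numerics; using Avro;
private static AvroDecimal ToAvroDecimal(decimal value, int scale) {
    // Shift the value by 10^scale and truncate to get the unscaled integer, e.g. 1.5 with scale 2 becomes 150.
    var unscaled = new BigInteger(value * (decimal) BigInteger.Pow(new BigInteger(10), scale));
    return new AvroDecimal(unscaled, scale);
}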
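
Putting points 1 to 3 together, an optional date field could look something like the sketch below. The record name, namespace and field name are placeholders made up for illustration and do not come from a real schema.

```json
{
  "type": "record",
  "name": "ExampleRecord",
  "namespace": "example.avro",
  "fields": [
    {
      "name": "OptionalDate",
      "type": [
        "null",
        {
          "type": "int",
          "logicalType": "date"
        }
      ],
      "default": null
    }
  ]
}
```

Note that the logical type sits inside the nested type object, "null" comes first in the union, and the default is the bare literal null rather than a quoted string.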

Nick Lydon

British software developer working as a freelancer in Berlin. Mainly dotnet, but happy to try new things! https://github.com/NickLydon