Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avro 1.12.0 breaks jackson-dataformat-avro (also: requires JDK 11) #514

Open
rdifrango opened this issue Aug 30, 2024 · 30 comments
Open

Avro 1.12.0 breaks jackson-dataformat-avro (also: requires JDK 11) #514

rdifrango opened this issue Aug 30, 2024 · 30 comments

Comments

@rdifrango
Copy link

We were attempting to upgrade our project to Avro 1.12.0 with jackson-dataformat-avro version 2.17.2

The failure I get is as follows:

Exception in thread "main" java.lang.NoSuchMethodError: 'org.apache.avro.Schema$Parser org.apache.avro.Schema$Parser.setValidate(boolean)'
	at com.fasterxml.jackson.dataformat.avro.AvroMapper.schemaFrom(AvroMapper.java:240)
	at com.fmr.difrango.ztester.DemoJacksonAvroFailure.main(DemoJacksonAvroFailure.java:23)

To demonstrate this, I created the following example program that uses issue19.avsc as the schema to load.

package com.difrango.ztester;

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.datatype.jdk8.Jdk8Module;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

public class DemoJacksonAvroFailure {
    private static final Logger LOGGER = LoggerFactory.getLogger(DemoJacksonAvroFailure.class);

    public static void main(String[] args) {
        var expectedStream =
                DemoJacksonAvroFailure.class.getClassLoader().getResourceAsStream("issue19.avsc");

        var mapper = new AvroMapper();
        mapper.registerModule(new Jdk8Module());
        mapper.registerModule(new JavaTimeModule());

        try {
            var expected = mapper.schemaFrom(expectedStream);
        } catch (IOException e) {
            LOGGER.error("Error Handling the Schema", e);
        }
    }
}
@cowtowncoder
Copy link
Member

Yes, see #167 -- you cannot upgrade Jackson 2.17 and below to Avro library past 1.8.2.

But upcoming 2.18.0 should allow upgrade.

@rdifrango
Copy link
Author

Thanks @cowtowncoder. It's interesting that we've been successfully using AVRO 1.11.2 so it must be some edge cases that we aren't currently hitting.

@cowtowncoder
Copy link
Member

@rdifrango Ah. I forgot that the issue wrt upgrade were mostly in unit tests, not on binary compatibility between module and codec library.

But I hope 2.18.0-rc1 -- being released today! -- will work with 1.12.x and above.

@cowtowncoder
Copy link
Member

2.18.0-rc1 now released: @rdifrango not sure if it's easy to do, but it'd be great to see if upgrade would work with 2.18, for eventual 2.18.0 release

@rdifrango
Copy link
Author

rdifrango commented Sep 3, 2024

2.18.0-rc1 now released: @rdifrango not sure if it's easy to do, but it'd be great to see if upgrade would work with 2.18, for eventual 2.18.0 release

I just tested it and I am seeing the same issue.

I just looked at AvroMapper and noticed that here it still has new Schema.Parser().setValidate(true) and it's the .setValidate(true) that was removed in 1.12.0

@cowtowncoder
Copy link
Member

@rdifrango rats. Ok. That may also explain why upgrade was to 1.11.3 not 1.12.0.

@rdifrango
Copy link
Author

@rdifrango rats. Ok. That may also explain why upgrade was to 1.11.3 not 1.12.0.

Quite possibly...and I'll grant that the move to 1.12.0 is not backwards compatible.

@cowtowncoder
Copy link
Member

It seems odd for Avro library to nominally follow SemVer but... seemingly not?
(although as a library maintainer I understand how difficult backwards-compatibility is).

But would be good to know why the method was removed & if it had been at least deprecated earlier. Plus whether there might be replacement to use, from older versions too.

@rdifrango
Copy link
Author

It seems odd for Avro library to nominally follow SemVer but... seemingly not? (although as a library maintainer I understand how difficult backwards-compatibility is).

But would be good to know why the method was removed & if it had been at least deprecated earlier. Plus whether there might be replacement to use, from older versions too.

I'd agree @cowtowncoder especially since I'm having a conversation with on this PR with the commons-validator team and they are being very strict.

I did look over on the avro side, but couldn't find any specific mention of why this was changed and it was unclear to me if it's now simply the default behavior of the parser or not.

Should we ask over there as well?

@cowtowncoder
Copy link
Member

@rdifrango I guess damage is done, API compatibility broken b/w versions. So not sure how much value there would be.
(not sure how commons-validator is related -- is that a dependency they added/removed?)

@rdifrango
Copy link
Author

@rdifrango I guess damage is done, API compatibility broken b/w versions. So not sure how much value there would be. (not sure how commons-validator is related -- is that a dependency they added/removed?)

agreed the damage is done, the commons-validator was more an indicator that other apache projects seem to hold to strict guidelines around binary and API compatibility which clearly Avro has not.

@cowtowncoder cowtowncoder changed the title Avro 1.12.0 breaks jackson-dataformat-avro Avro 1.12.0 breaks jackson-dataformat-avro (also: requires JDK 11) Sep 5, 2024
@cowtowncoder
Copy link
Member

One other reason why we probably cannot upgrade: looks like Avro 1.12.0 is built with JDK 11, and Jackson 2.18 still has Java 8 as the baseline.

@rdifrango
Copy link
Author

rdifrango commented Sep 5, 2024

One other reason why we probably cannot upgrade: looks like Avro 1.12.0 is built with JDK 11, and Jackson 2.18 still has Java 8 as the baseline.

That definitely makes this tricky, any thoughts on a path forward?

My concern is that if some sort of CVE is discovered in the AVRO project there currently isn't a path forward.

@cowtowncoder
Copy link
Member

@rdifrango No idea short-term. Things are tricky as it's multi-module Maven project so would need to at least build all with JDK 11; but then try to target JDK 8 (if even possible) for non-Avro backends.

But on very short term I think baseline code needs to remain 1.11.x.

Ideally we would let users override with newer version tho.

And I think it'd actually be possible to use matrix build for CI testing to try version override, if we get module itself run against newer library version -- but that's not solved yet; how to avoid/change parser.setValidate() call.

@cowtowncoder
Copy link
Member

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

@cowtowncoder
Copy link
Member

Ok, so Avro lib backwards-compatibility is really... not. There isn't any. 1.11.3 -> 1.11.4 breaks tests, as per #539. Patches can be as breaking as minor version or so.

@rdifrango
Copy link
Author

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

I'll try to take this on but it won't be until mid-January.

@cowtowncoder
Copy link
Member

I think upgrading master/3.0 first to 1.12.x makes most sense.

Patch 1.11.4 is something users can opt-in regardless of what Jackson 2.x module defaults to (that is, for versions that default to 1.11.3, 2.18.x and 2.19)

@rdifrango
Copy link
Author

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

I'll try to take this on but it won't be until mid-January.

ok, I started looking at this and it seems to break a few of the tests. The two groups in particular Jackson to Apache with Jackson schema and Apache to Apache with Jackson schema.

In the one test ListWithPrimitiveWrapperTest.testListWithShorts it's complaining about java.lang.ClassCastException: class java.lang.Short cannot be cast to class java.lang.Integer (java.lang.Short and java.lang.Integer are in module java.base of loader 'bootstrap')

I'm just trying to understand these tests before moving forward.

@rdifrango
Copy link
Author

and related, in ListWithPrimitiveWrapperTest.testListWithDoubles I'm getting the following in those same 2 tests:

org.junit.ComparisonFailure:
Expected :[1.0, 0.0, -1.0, 4.9E-324, 1.7976931348623157E308]
Actual :[1.0, 0.0, -1.0, 0.0, Infinity]

@cowtowncoder
Copy link
Member

@rdifrango Latter def sounds like Float used for Double (or vice versa), triggering over-/underflow.

@rdifrango
Copy link
Author

@rdifrango Latter def sounds like Float used for Double (or vice versa), triggering over-/underflow.

I'm seeing on the reading side that it's hitting the Double section of org.apache.avro.generic.GenericDatumReader, aka this:

case DOUBLE:
      return in.readDouble();

so I'm now looking at ObjectWriter to see how the bytes look and compare it to the other ones that do pass when using doubles.

@rdifrango
Copy link
Author

To your point, with Apache to Apache with Jackson schema or Jackson to Apache with Jackson schema the following do work fine:

  • Float
  • Integers
  • Strings
  • Longs

The one that fails with different values is:

  • Double

The ones not working at all are:

  • Shorts java.lang.ClassCastException: class java.lang.Short cannot be cast to class java.lang.Integer (java.lang.Short and java.lang.Integer are in module java.base of loader 'bootstrap')

  • Characters java.lang.ClassCastException: class java.lang.Character cannot be cast to class java.lang.Integer (java.lang.Character and java.lang.Integer are in module java.base of loader 'bootstrap')

  • Bytes - java.lang.ClassCastException: class java.lang.Byte cannot be cast to class java.lang.Integer (java.lang.Byte and java.lang.Integer are in module java.base of loader 'bootstrap')

@cowtowncoder
Copy link
Member

Sounds like some level of casting would be needed, somewhere. But not quite sure how/why breakage; I guess Avro Codec started using more accurate Java types, maybe, for its binding (or more generic?).

I know above is not super helpful, just thinking out aloud I guess.

@rdifrango
Copy link
Author

Sounds like some level of casting would be needed, somewhere. But not quite sure how/why breakage; I guess Avro Codec started using more accurate Java types, maybe, for its binding (or more generic?).

I know above is not super helpful, just thinking out aloud I guess.

Well, for the double case, I at least tracked it down to here and it's close to what you thought, they're converting the Double to a floatValue and when it's called it returns either 0 or Infinity which feels like a bug in Avro to me:

https://github.com/apache/avro/blob/main/lang/java/avro/src/main/java/org/apache/avro/generic/PrimitivesArrays.java#L549-L554

@cowtowncoder
Copy link
Member

cowtowncoder commented Jan 9, 2025

@rdifrango Thank you for tracking that down, yes, symptoms fit the problem. float range not big enough to represent MIN/MAX double, obviously, so gets "rounded"/truncated to 0 / Inf.

Could you file a bug (and maybe even PR) against Avro-lib, that definitely looks like a straight-forward bug, possibly copy-paste error?

This would solve just one problem case, I suppose, but ... one step at a time.

@rdifrango
Copy link
Author

https://issues.apache.org/jira/browse/AVRO-4110

@cowtowncoder
Copy link
Member

Thank you @rdifrango !

@rdifrango
Copy link
Author

Thank you @rdifrango !

No worries and I'm working on the PR to fix that.

I also think the following change is what caused this and is the one causing the failures with Short, Character, and Byte test as well:

apache/avro#2389, specifically this comment.

@rdifrango
Copy link
Author

@cowtowncoder, the first PR is in:

apache/avro#3292

Now to prove my theory on the PR that may have caused the other failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants