Typed Parquet Annotations (#604)
* Collecting parquet annotations and storing them in the map

Implemented SchemaUtil deepCopy with parquet annotations enrichment

Fix errors

Added more tests

Improve error message (#611)

Make implicit field from EnumType low priority (#613)

Fix build

* Improve parquet annotation support

* Added more checks into AvroSchemaComparer

* address review comments

* address review comments 2

* Addressed PR comments

Co-authored-by: Michel Davit <[email protected]>
shnapz and RustedBones authored Oct 14, 2022
1 parent 0ba1a30 commit da3d3a1
Showing 12 changed files with 938 additions and 435 deletions.
4 changes: 0 additions & 4 deletions avro/src/main/scala/magnolify/avro/AvroType.scala
@@ -32,10 +32,6 @@ import scala.reflect.ClassTag
import scala.jdk.CollectionConverters._
import scala.collection.compat._

class doc(doc: String) extends StaticAnnotation with Serializable {
override def toString: String = doc
}

sealed trait AvroType[T] extends Converter[T, GenericRecord, GenericRecord] {
val schema: Schema
def apply(r: GenericRecord): T = from(r)
21 changes: 21 additions & 0 deletions avro/src/main/scala/magnolify/avro/package.scala
@@ -0,0 +1,21 @@
/*
* Copyright 2022 Spotify AB
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package magnolify

package object avro {
type doc = shared.doc
}
6 changes: 6 additions & 0 deletions build.sbt
@@ -189,6 +189,12 @@ val commonSettings = Seq(
name = "Shameera Rathnayaka Yodage",
email = "[email protected]",
url = url("https://twitter.com/syodage")
),
Developer(
id = "shnapz",
name = "Andrew Kabas",
email = "[email protected]",
url = url("https://github.com/shnapz")
)
)
)
11 changes: 11 additions & 0 deletions docs/parquet.md
@@ -48,3 +48,14 @@ implicit val pfDecimalBinary = ParquetField.decimalBinary(20, 0)
Among the date/time types, `DATE` maps to `java.time.LocalDate`. The other types, `TIME` and `TIMESTAMP`, map to `OffsetTime`/`LocalTime` and `Instant`/`LocalDateTime` with `isAdjustedToUTC` set to `true`/`false`. They can be in nano, micro, or milliseconds precision with `import magnolify.parquet.logical.{nanos,micros,millis}._`.

Note that Parquet's official Avro support maps `REPEATED` fields to an `array` field inside a nested group. Use `import magnolify.parquet.ParquetArray.AvroCompat._` to ensure compatibility with Avro.

The top-level class and all fields (including the fields of nested classes) can be annotated with the `@doc` annotation. Note that class-level annotations on nested classes are ignored.

```scala
@doc("This is ignored")
case class NestedClass(@doc("nested field annotation") i: Int)

@doc("Top level annotation")
case class TopLevelType(@doc("field annotation") pd: NestedClass, @doc("field annotation 2") i: Integers)
```
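To see which annotations are kept, one option is to derive the type and print its schemas. The following is a minimal sketch rather than the library's documented usage. It assumes that `ParquetType[T]` exposes `schema` and `avroSchema` members, and it imports `doc` from `magnolify.shared`, the location the `avro` package alias in this commit points to. The `DocExample` object is hypothetical, and `TopLevelType` is reduced to one field so the snippet does not depend on `Integers`.

```scala
import magnolify.parquet._
import magnolify.shared.doc

// Class-level annotation on a nested class: expected to be ignored.
@doc("This is ignored")
case class NestedClass(@doc("nested field annotation") i: Int)

// Class-level annotation on the top-level class, plus a field annotation.
@doc("Top level annotation")
case class TopLevelType(@doc("field annotation") pd: NestedClass)

// Hypothetical entry point, for illustration only.
object DocExample extends App {
  // Derive the Parquet type for the annotated case class.
  val parquetType = ParquetType[TopLevelType]

  // Assumption: `schema` is the Parquet message type and `avroSchema` is the
  // Avro view of it, where the `@doc` values would be expected to show up.
  println(parquetType.schema)
  println(parquetType.avroSchema.toString(true))
}
```

Comparing the printed Avro schema with the annotations above is a quick way to check that the top-level and field-level docs are present and that "This is ignored" is absent.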