Thursday, January 1, 2015

[cascading-user] Printing some avro fields using typesafe api

I'm trying to read an Avro file using the Typesafe API and print some of its fields.

The following simple code fails with the error:

Caused by: java.lang.AssertionError: assertion failed: Arity of (class com.twitter.scalding.LowPriorityTupleSetters$$anon$1) is 1, which doesn't match: + ('app_id', 'host', 'ip', 'path')

Any idea?


class ReadAvroTest(args: Args) extends Job(args) {

  /**
   * Read data from UnpackedAvro
   */
  case class DLR(appId: BytesWritable, host: BytesWritable, ip: BytesWritable, path: BytesWritable,
    query: BytesWritable, params: BytesWritable, requestHeaders: BytesWritable)

  val input = UnpackedAvroSource(args("input"))
    .read
    .toTypedPipe[DLR](
      'app_id, 'host, 'ip, 'path, 'query, 'params, 'request_headers)
}

class TestJob(args: Args) extends ReadAvroTest(args) {

  val groups = input
    .toPipe('app_id, 'host, 'ip, 'path)
    .write(Tsv(args("output")))
}



Anyone?


We don't use Avro at Twitter, so I'm not that familiar.
The error you are getting is that the case class you made cannot be automatically packed.
If you try it with a tuple (a Tuple7 here, since there are seven fields) it may work. That said, what is your Avro schema? I doubt the BytesWritable types are correct either.
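A minimal, untested sketch of the tuple suggestion, assuming the scalding-avro module is on the classpath and reusing the field names from the original post. The job name ReadAvroTupleJob and the alias B are hypothetical, and the BytesWritable element types are carried over from the question only as a placeholder, since they may not match the actual Avro schema. The idea is that a plain tuple gets a setter whose arity equals its number of elements, whereas the case class falls back to the single-field setter (arity 1) named in the assertion.

```scala
import com.twitter.scalding._
import com.twitter.scalding.TDsl._
import com.twitter.scalding.avro.UnpackedAvroSource
import org.apache.hadoop.io.BytesWritable

// Hypothetical job name; field names are taken from the original post.
class ReadAvroTupleJob(args: Args) extends Job(args) {

  // Placeholder element type from the question; adjust it to the real Avro schema.
  type B = BytesWritable

  // Read the seven fields into a plain 7-tuple instead of the DLR case class.
  val input: TypedPipe[(B, B, B, B, B, B, B)] = UnpackedAvroSource(args("input"))
    .read
    .toTypedPipe[(B, B, B, B, B, B, B)](
      'app_id, 'host, 'ip, 'path, 'query, 'params, 'request_headers)

  // Keep only the four fields to be written, as a 4-tuple, so the setter's
  // arity (4) matches the four field names passed to toPipe.
  input
    .map { case (appId, host, ip, path, _, _, _) => (appId, host, ip, path) }
    .toPipe('app_id, 'host, 'ip, 'path)
    .write(Tsv(args("output")))
}
```

If you would rather keep the DLR case class on the read side, the same map step (projecting each record to a 4-tuple before toPipe) should be enough to satisfy the arity check on the write side, though whether the read itself converts correctly depends on the schema.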

