How to Create a Custom Writable for Hadoop

If you have gone through other Hadoop MapReduce examples, you will have noticed the use of “Writable” data types such as LongWritable, IntWritable, Text, etc. All keys and values used in Hadoop MapReduce must implement the Writable interface.

Although we can do a lot with the primitive Writables already available with Hadoop, there are times when we want to transmit multiple pieces of data and/or data types from Mapper to Reducer. Sometimes it is possible to convert all of this data into strings and concatenate them into a single key or value. However, this gets messy very quickly and is not recommended.

Implementing Writable requires implementing two methods, readFields(DataInput in) and write(DataOutput out). Writables that are used as keys in MapReduce jobs must also implement Comparable (or, more simply, WritableComparable, which extends both Writable and Comparable). Overriding the toString() method is not necessary, but can be very helpful when storing your output data as text in HDFS.
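To illustrate the key case, here is a rough sketch of what a WritableComparable might look like. UserIdWritable is a hypothetical name used only for this example; the hashCode() and equals() overrides are included because the default HashPartitioner relies on hashCode() to route keys to reducers.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

// Hypothetical key type: a user id that the framework can sort during the shuffle.
public class UserIdWritable implements WritableComparable<UserIdWritable> {

    private long userId;

    // Hadoop needs a no-argument constructor to create instances via reflection.
    public UserIdWritable() { }

    public UserIdWritable(long userId) {
        this.userId = userId;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(userId);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        userId = in.readLong();
    }

    // compareTo() is what allows this type to be used as a key:
    // MapReduce sorts keys with it during the shuffle phase.
    @Override
    public int compareTo(UserIdWritable other) {
        return Long.compare(userId, other.userId);
    }

    // Used by the default HashPartitioner to assign keys to reducers.
    @Override
    public int hashCode() {
        return Long.hashCode(userId);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof UserIdWritable && ((UserIdWritable) o).userId == userId;
    }

    // Not required, but helpful for text output in HDFS.
    @Override
    public String toString() {
        return Long.toString(userId);
    }
}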

Below is an example of a custom Writable that is used to store both gender and login information. An example of using this might be to calculate login statistics based on gender. Notice that this custom class, GenderLoginWritable, does not implement Comparable or WritableComparable, so it can only be used as a value in the MapReduce framework, not as a key.
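The sketch below shows what such a class might look like; the specific fields (gender, loginCount, loginDuration) are illustrative assumptions chosen to match the description above.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Writable;

// Stores a gender flag together with login information for one record.
// Implements only Writable, so it can be used as a value but not as a key.
public class GenderLoginWritable implements Writable {

    private String gender;       // e.g. "M" or "F"
    private long loginCount;     // number of logins observed
    private long loginDuration;  // total session time, e.g. in seconds

    // Hadoop needs a no-argument constructor to create instances via reflection
    // before calling readFields().
    public GenderLoginWritable() { }

    public GenderLoginWritable(String gender, long loginCount, long loginDuration) {
        this.gender = gender;
        this.loginCount = loginCount;
        this.loginDuration = loginDuration;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialize the fields in a fixed order.
        out.writeUTF(gender);
        out.writeLong(loginCount);
        out.writeLong(loginDuration);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Deserialize the fields in exactly the same order they were written.
        gender = in.readUTF();
        loginCount = in.readLong();
        loginDuration = in.readLong();
    }

    public String getGender() { return gender; }
    public long getLoginCount() { return loginCount; }
    public long getLoginDuration() { return loginDuration; }

    // Not required, but makes output readable when written as text to HDFS.
    @Override
    public String toString() {
        return gender + "\t" + loginCount + "\t" + loginDuration;
    }
}

A Reducer could then sum the loginCount and loginDuration fields of all GenderLoginWritable values for each key to produce per-gender login statistics.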

Documentation for the Writable interface can be found here: Hadoop Writable Interface.
