Hadoop Research: Using SequenceFileAsBinaryInputFormat & SequenceFileAsBinaryOutputFormat (Hadoop v0.21.0)

1. First, make sure your input files are binary sequence files.

For more detail, please see

http://www.hadoop.tw/2008/12/hadoop-uncompressed-sequencefi.html

If you don't know how to convert a text file to a binary sequence file, please see

http://kuanyuhadoop.blogspot.com/2011/05/coverting-text-file-to-binary-sequence.html
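As a quick sanity check for step 1, the sketch below tests whether a local file starts with the three magic bytes SEQ that open a Hadoop sequence file header, followed by a version byte. The class and method names are made up for illustration, and it assumes the file has already been copied from HDFS to the local file system. It only inspects the header, so it does not tell you which key and value classes the file stores.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Hadoop.
final class SequenceFileMagicCheck {

    // True if the header starts with the bytes 'S', 'E', 'Q' and has room
    // for the version byte that follows them.
    public static boolean hasSequenceFileMagic(byte[] header) {
        return header != null
                && header.length >= 4
                && header[0] == 'S'
                && header[1] == 'E'
                && header[2] == 'Q';
    }

    // Reads the first four bytes of a local file and checks them.
    public static boolean isSequenceFile(Path path) throws IOException {
        byte[] header = new byte[4];
        try (InputStream in = Files.newInputStream(path)) {
            int off = 0;
            while (off < header.length) {
                int n = in.read(header, off, header.length - off);
                if (n < 0) {
                    return false; // file is shorter than a sequence file header
                }
                off += n;
            }
        }
        return hasSequenceFileMagic(header);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SequenceFileMagicCheck <local-file>");
            return;
        }
        System.out.println(isSequenceFile(Paths.get(args[0])));
    }
}
```

A plain text file fails this check, which is usually the first thing to rule out when the job cannot read its input.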

2. In the main class, you have to set the following:

job.setOutputKeyClass(BytesWritable.class);
job.setOutputValueClass(BytesWritable.class);
job.setInputFormatClass(SequenceFileAsBinaryInputFormat.class);
job.setOutputFormatClass(SequenceFileAsBinaryOutputFormat.class);

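3. In the Map class, declare BytesWritable for all four type parameters:

public class Map extends Mapper<BytesWritable, BytesWritable, BytesWritable, BytesWritable>
{
    @Override
    public void map(BytesWritable key, BytesWritable value, Context context)
            throws IOException, InterruptedException
    {
        ...
    }
}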

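The Reduce class follows the same pattern as the Map class: extend Reducer with BytesWritable for all four type parameters, and declare reduce(BytesWritable key, Iterable<BytesWritable> values, Context context).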
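Inside map or reduce, the key and value arrive as the raw serialized bytes of the original records, so you have to decode them yourself. The post does not say which types the input file stores. As one assumed case, if the original keys were IntWritable, each key is four big-endian bytes. The helper below (a hypothetical name, plain Java with no Hadoop dependency) is a minimal sketch of decoding that layout.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of Hadoop.
final class RawIntKey {

    // Decodes the first `length` bytes as one big-endian 32-bit int,
    // the layout produced by DataOutput.writeInt (assumed IntWritable key).
    public static int decodeInt(byte[] raw, int length) {
        if (raw == null || length != 4 || raw.length < 4) {
            throw new IllegalArgumentException("expected exactly 4 valid bytes");
        }
        // ByteBuffer is big-endian by default, matching writeInt.
        return ByteBuffer.wrap(raw, 0, length).getInt();
    }

    // Encodes an int the same way, for building test input.
    public static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        byte[] padded = {0, 0, 1, 2, 9, 9}; // 4 valid bytes plus 2 stale bytes
        System.out.println(decodeInt(padded, 4));        // prints 258
        System.out.println(decodeInt(encodeInt(-7), 4)); // prints -7
    }
}
```

In a mapper this would be called as decodeInt(key.getBytes(), key.getLength()). Passing the length matters because BytesWritable.getBytes() can return a backing array that is longer than the valid data.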
