Sunday, May 8, 2011

Using SequenceFileAsBinaryInputFormat & SequenceFileAsBinaryOutputFormat (Hadoop v0.21.0)

1. First, make sure your input files are binary sequence files.
For more detail, please see
http://www.hadoop.tw/2008/12/hadoop-uncompressed-sequencefi.html
If you don't know how to convert a text file to a binary sequence file, please see
http://kuanyuhadoop.blogspot.com/2011/05/coverting-text-file-to-binary-sequence.html

2. In the main class, set the following on the job:
job.setOutputKeyClass(BytesWritable.class);
job.setOutputValueClass(BytesWritable.class);

job.setInputFormatClass(SequenceFileAsBinaryInputFormat.class);
job.setOutputFormatClass(SequenceFileAsBinaryOutputFormat.class);

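3. In the Map class, declare BytesWritable as the key and value types. Without the type parameters on Mapper, the map method below would not override Mapper.map and would never be called.
import java.io.IOException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.mapreduce.Mapper;

public class Map extends Mapper<BytesWritable, BytesWritable, BytesWritable, BytesWritable>
{
    // Key and value arrive as the raw serialized bytes of each record.
    @Override
    public void map(BytesWritable key, BytesWritable value, Context context)
            throws IOException, InterruptedException
    {
        ...
    }
}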

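The Reduce class follows the same pattern as the Map class: it extends Reducer with the same BytesWritable type parameters, and its reduce method receives a BytesWritable key and an Iterable<BytesWritable> of values.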
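
Inside map and reduce, each BytesWritable holds the record's serialized bytes rather than a decoded object, so you usually have to decode them yourself. The sketch below is one way to do that in plain Java, under the assumption that the sequence file was originally written with Text keys (a variable-length integer giving the byte count, followed by the UTF-8 bytes). The class name RawTextBytes and its methods are hypothetical helpers for illustration, not part of Hadoop. If your file uses a different key or value class, the byte layout will differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Hadoop class): decodes the raw bytes of a
// record that was serialized as Text, i.e. a variable-length integer
// holding the byte count, followed by that many UTF-8 bytes.
final class RawTextBytes {

    // Total bytes used by the variable-length integer, judged from its first byte.
    static int vintSize(byte first) {
        if (first >= -112) {
            return 1;                 // small values are stored in a single byte
        }
        if (first < -120) {
            return -119 - first;      // marker for a negative multi-byte value
        }
        return -111 - first;          // marker for a positive multi-byte value
    }

    // Reads the variable-length integer that starts at buf[0].
    static long readVLong(byte[] buf) {
        byte first = buf[0];
        int size = vintSize(first);
        if (size == 1) {
            return first;
        }
        long v = 0;
        for (int i = 1; i < size; i++) {
            v = (v << 8) | (buf[i] & 0xff);   // big-endian payload bytes
        }
        return (first < -120) ? ~v : v;
    }

    // Decodes a Text-serialized record. validLength is the number of
    // meaningful bytes in raw; the backing array may be longer than that.
    static String decodeText(byte[] raw, int validLength) {
        if (validLength < 1) {
            throw new IllegalArgumentException("empty record");
        }
        int header = vintSize(raw[0]);
        if (header > validLength) {
            throw new IllegalArgumentException("truncated length prefix");
        }
        long n = readVLong(raw);
        if (n < 0 || header + n > validLength) {
            throw new IllegalArgumentException("length prefix does not fit the record");
        }
        return new String(raw, header, (int) n, StandardCharsets.UTF_8);
    }
}
```

In a mapper this would be called as RawTextBytes.decodeText(key.getBytes(), key.getLength()). Passing getLength() matters because the array returned by getBytes() can be larger than the valid data.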
