First, create a new class; the name is up to you.
(Without this new class, the Reducer throws a NoSuchMethodException.)
Suppose the class is named DoubleTwoDArrayWritable.
Create a file called DoubleTwoDArrayWritable.java
with the following contents:
import org.apache.hadoop.io.*;

public class DoubleTwoDArrayWritable extends TwoDArrayWritable
{
    // No-argument constructor that fixes the element type to DoubleWritable
    public DoubleTwoDArrayWritable()
    {
        super(DoubleWritable.class);
    }
}
This gives us a new data type.
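The NoSuchMethodException mentioned above most likely comes from Hadoop creating value objects reflectively through a no-argument constructor, which TwoDArrayWritable itself does not declare. The sketch below reproduces that situation in plain Java without Hadoop. ArrayHolder and DoubleArrayHolder are hypothetical stand-ins for TwoDArrayWritable and DoubleTwoDArrayWritable, and canInstantiate is an illustrative helper, not a Hadoop API.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for TwoDArrayWritable: its only constructor takes the element class.
class ArrayHolder {
    private final Class<?> valueClass;

    ArrayHolder(Class<?> valueClass) {
        this.valueClass = valueClass;
    }

    Class<?> getValueClass() {
        return valueClass;
    }
}

// Hypothetical stand-in for DoubleTwoDArrayWritable: adds the no-argument constructor.
class DoubleArrayHolder extends ArrayHolder {
    DoubleArrayHolder() {
        super(Double.class);
    }
}

class NoArgConstructorDemo {
    // Returns true when the class can be created reflectively with no arguments.
    static boolean canInstantiate(Class<?> type) {
        try {
            Constructor<?> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (NoSuchMethodException e) {
            // The class declares no zero-argument constructor.
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayHolder: " + canInstantiate(ArrayHolder.class));
        System.out.println("DoubleArrayHolder: " + canInstantiate(DoubleArrayHolder.class));
    }
}
```

The subclass works only because its constructor hard-codes the element class, which is the same trick DoubleTwoDArrayWritable uses with super(DoubleWritable.class).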
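Next, set job.setMapOutputValueClass(DoubleTwoDArrayWritable.class);
(To use it as a key you would also have to implement a comparison method, since keys must be WritableComparable; that does not seem to be supported at the moment.)
In the Mapper, the values have to go through a Writable[][] before they can be set: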
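// Fields of the Mapper (i, j, tmpR, W and r_int are declared elsewhere)
private Text one = new Text("1");
private DoubleTwoDArrayWritable WTW = new DoubleTwoDArrayWritable();
private DoubleWritable[][] Result = null;

// Inside map(): allocate the r_int x r_int buffer once and reuse it
if (Result == null)
{
    Result = new DoubleWritable[r_int][];
    for (i = 0; i < r_int; i++)
    {
        Result[i] = new DoubleWritable[r_int];
        for (j = 0; j < r_int; j++)
            Result[i][j] = new DoubleWritable();
    }
}
// The matrix is symmetric, so compute the upper triangle and mirror it
for (i = 0; i < r_int; i++)
    for (j = i; j < r_int; j++)
    {
        tmpR = W[i] * W[j];
        Result[i][j].set(tmpR);
        Result[j][i].set(tmpR);
    }
WTW.set(Result);
context.write(one, WTW);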
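To make the roles of W and tmpR concrete, here is a minimal plain-Java sketch of the same computation without Hadoop. The class name OuterProductDemo, the method outerProduct and the sample vector are assumptions made for illustration; only the loop structure is taken from the Mapper code above.

```java
import java.util.Arrays;

class OuterProductDemo {
    // Fills an n x n matrix with w[i] * w[j], computing each product once and mirroring it.
    static double[][] outerProduct(double[] w) {
        int n = w.length;
        double[][] result = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double tmpR = w[i] * w[j]; // temporary product, as in the Mapper
                result[i][j] = tmpR;
                result[j][i] = tmpR;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] w = {1.0, 2.0, 3.0}; // hypothetical sample vector
        for (double[] row : outerProduct(w)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Starting the inner loop at j = i roughly halves the number of multiplications, because entry (i, j) always equals entry (j, i).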
In the Reducer, a Writable[][] is likewise needed to receive the values:
private Writable[][] getArray = null;

for (DoubleTwoDArrayWritable value : values)
{
    getArray = value.get();
    // Cast each element back to DoubleWritable and accumulate it into C
    for (i = 0; i < r_int; i++)
        for (j = 0; j < r_int; j++)
            C[i][j] = C[i][j] + ((DoubleWritable) getArray[i][j]).get();
}
Could you please update the post with an example? What are your tmpR and W?
Hi, thanks for your interest in this article. Basically, tmpR is just a temporary double value, and W is a vector of r_int double values.
You can use any value you want.
This article just shows a way to pass a 2D double matrix from the Mapper to the Reducer without converting it to a string. We want to avoid the string conversion because it can introduce precision errors and uses more network bandwidth.
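The precision and bandwidth point can be checked with a small plain-Java sketch. It assumes the binary form is a raw 8-byte double written with DataOutputStream.writeDouble, and it assumes a text encoding with six decimals for comparison. Note that Java's Double.toString does round-trip exactly, so the precision loss appears only when the text is formatted with a limited number of digits. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class DoubleEncodingDemo {
    // Encodes the value as a raw 8-byte IEEE 754 double.
    static byte[] toBinary(double value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeDouble(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes a value previously produced by toBinary.
    static double fromBinary(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readDouble();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical text encoding that keeps only six decimals.
    static String toFixedText(double value) {
        return String.format(Locale.ROOT, "%.6f", value);
    }

    public static void main(String[] args) {
        double value = 1.0 / 3.0;
        int fullTextBytes = Double.toString(value).getBytes(StandardCharsets.UTF_8).length;
        System.out.println("binary bytes:    " + toBinary(value).length);
        System.out.println("full text bytes: " + fullTextBytes);
        System.out.println("binary exact:    " + (fromBinary(toBinary(value)) == value));
        System.out.println("6-decimal exact: " + (Double.parseDouble(toFixedText(value)) == value));
    }
}
```

For a value like 1/3, the full-precision text needs more than 8 bytes, while the shorter fixed-decimal text no longer parses back to the same double.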