Excerpted from the elephant book (Hadoop: The Definitive Guide).
A single Job can read from multiple input sources, whether or not they share a format, and assign each source its own Mapper:
MultipleInputs.addInputPath(conf, ncdcInputPath,
    TextInputFormat.class, MaxTemperatureMapper.class);
MultipleInputs.addInputPath(conf, metOfficeInputPath,
    TextInputFormat.class, MetOfficeMaxTemperatureMapper.class);
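The dispatch idea can be tried out without a cluster. What follows is a minimal plain-Java sketch, not Hadoop code: each input path is paired with its own parsing function, and every parser emits the same "station TAB temperature" record so that one downstream reducer could consume both sources. The class name, the two line formats, and the paths are all hypothetical and come neither from the book nor from Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hadoop-free illustration only; every name and record format here is hypothetical.
class MultipleInputsSketch {
    // Assumed format A: "stationId,temperature"
    static String mapCsv(String line) {
        String[] f = line.split(",");
        return f[0].trim() + "\t" + Integer.parseInt(f[1].trim());
    }

    // Assumed format B: "temperature stationId"
    static String mapSpaceSeparated(String line) {
        String[] f = line.trim().split("\\s+");
        return f[1] + "\t" + Integer.parseInt(f[0]);
    }

    // Runs each source's lines through the mapper registered for that source.
    static List<String> run(Map<String, List<String>> linesByPath,
                            Map<String, Function<String, String>> mapperByPath) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : linesByPath.entrySet()) {
            Function<String, String> mapper = mapperByPath.get(e.getKey());
            for (String line : e.getValue()) {
                out.add(mapper.apply(line));
            }
        }
        return out;
    }

    // Two sources in different formats, each bound to its own mapper.
    static List<String> demo() {
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("/in/a", Arrays.asList("S1,25", "S2,31"));
        input.put("/in/b", Arrays.asList("18 S3"));

        Map<String, Function<String, String>> mappers = new LinkedHashMap<>();
        mappers.put("/in/a", MultipleInputsSketch::mapCsv);
        mappers.put("/in/b", MultipleInputsSketch::mapSpaceSeparated);
        return run(input, mappers);
    }
}
```

Both mappers normalize to the same output shape, which is the point of the pattern: differences between the sources are absorbed in the map phase, and everything after it stays format-agnostic.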
MultipleOutputFormat lets you control, by a rule you define, the names of the files that the reduce output is split into, for example:
...
static class StationNameMultipleTextOutputFormat
    extends MultipleTextOutputFormat<NullWritable, Text> {
  private NcdcRecordParser parser = new NcdcRecordParser();

  protected String generateFileNameForKeyValue(NullWritable key, Text value, String name) {
    parser.parse(value);
    return parser.getStationId();
  }
}
...
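The effect of overriding generateFileNameForKeyValue can be pictured as partitioning records by a name computed from each record. Here is a rough plain-Java sketch of that idea, with no Hadoop dependency. It assumes a hypothetical value layout of "stationId TAB payload" in place of the book's NcdcRecordParser, and it falls back to the default name (which in Hadoop would be the usual leaf name such as part-00000) when no station ID can be extracted. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hadoop-free illustration only; names and the record layout are hypothetical.
class OutputNamingSketch {
    // Stand-in for generateFileNameForKeyValue: derives the output file name
    // from the value, assuming the layout "stationId<TAB>payload".
    static String fileNameFor(String value, String defaultName) {
        int tab = value.indexOf('\t');
        return tab > 0 ? value.substring(0, tab) : defaultName;
    }

    // Groups values by the file they would be written to.
    static Map<String, List<String>> partition(List<String> values) {
        Map<String, List<String>> files = new TreeMap<>();
        for (String v : values) {
            String name = fileNameFor(v, "part-00000");
            files.computeIfAbsent(name, k -> new ArrayList<>()).add(v);
        }
        return files;
    }
}
```

Calling partition on records for stations S1 and S2 yields one bucket per station, mirroring how the output format above produces one file per station ID.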
There is also the MultipleOutputs class, which is not covered here.