
Connecting Eclipse to a Remote Hadoop Cluster for Development and Testing

Posted: 2015-03-27   Author: duguyiren3476   Source: repost

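Our Hadoop environment runs on a staging system, and connecting to HDFS remotely and submitting MapReduce jobs from a local machine took some trial and error. The notes below summarize what worked: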

  • Environment: Hadoop 2.5.2, JDK 1.7, Eclipse Luna, hadoop-eclipse plugin 2.6.
    The WordCount code is as follows:
package test;

import java.io.File;
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
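    // Point the client at the remote NameNode.
    conf.set("fs.defaultFS", "hdfs://10.128.7.140:9000");
    // Intended to submit the job as the cluster user "hadoop".
    conf.set("hadoop.job.user","hadoop");
// conf.set("mapred.job.tracker", "10.128.7.140:9001");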
    Path in = new Path("hdfs://10.128.7.140:9000/test/test.txt");
    Path out = new Path("hdfs://10.128.7.140:9000/usr/output");
    out.getFileSystem(conf).delete(out, true);

    Job job = Job.getInstance(conf, "word count");

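    // Package the compiled classes under bin/ into a temporary jar and attach it to the job.
    // EJob is a helper class that is not included in this post.
    File jarFile = EJob.createTempJar("bin");
    EJob.addClasspath("/home/hadoop/hadoop-1.2.1/conf");
    ClassLoader classLoader = EJob.getClassLoader();
    Thread.currentThread().setContextClassLoader(classLoader);
    ((JobConf) job.getConfiguration()).setJar(jarFile.toString());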

    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, in);
    FileOutputFormat.setOutputPath(job,out);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
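
To know what output to expect before involving the cluster, the same tokenize-and-sum logic can be sketched in plain Java. This is only an illustration: LocalWordCount is a hypothetical class, not part of the job above. It reuses StringTokenizer as the mapper does, and a TreeMap stands in for the shuffle and reduce phases.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local stand-in for the WordCount job; it does not use Hadoop.
final class LocalWordCount {

    // Splits each line on whitespace (like TokenizerMapper) and sums the
    // occurrences of each word (like IntSumReducer).
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                String word = itr.nextToken();
                Integer old = counts.get(word);
                counts.put(word, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {eclipse=1, hadoop=1, hello=2}
        System.out.println(count(Arrays.asList("hello hadoop", "hello eclipse")));
    }
}
```

Running a few lines of the real input through this sketch gives a reference result to compare with the part files the job writes under /usr/output.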
  • Exception 1: winutils.exe cannot be found
2015-03-27 18:01:42,982 ERROR [main] util.Shell (Shell.java:getWinUtilsPath(373)) - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:355)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:370)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:363)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:78)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:257)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:234)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:749)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:734)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:607)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2748)
    at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:2740)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2606)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at test.WordCount.main(WordCount.java:73)
2015-03-27 18:01:43,812 INFO  [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1019)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2015-03-27 18:01:43,812 INFO  [main]

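Fix: download winutils.exe.
Download location: https://github.com/srccodes/hadoop-common-2.2.0-bin
It is best to download every file in that repository's bin directory and use them to replace the bin directory of the local Hadoop installation.
Then set the HADOOP_HOME environment variable, or set the property in code in the Java main method: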

System.setProperty("hadoop.home.dir", "D://hadoop");
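
The "null\bin\winutils.exe" in the error message suggests that Hadoop looks for bin\winutils.exe under the configured home directory, and that no home directory was configured. A small guard can verify the file before setting the property. The sketch below is an assumption-laden illustration: HadoopHomeCheck and its methods are hypothetical names, and D:/hadoop is simply the example location used in this post.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop: validates a local Hadoop directory
// before it is used as hadoop.home.dir.
final class HadoopHomeCheck {

    // Returns the location where winutils.exe is expected under the given home directory.
    static File winutilsPath(String hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe");
    }

    // Sets hadoop.home.dir only if winutils.exe exists there; returns whether it was set.
    static boolean configure(String hadoopHome) {
        File winutils = winutilsPath(hadoopHome);
        if (!winutils.isFile()) {
            System.err.println("winutils.exe not found at " + winutils.getAbsolutePath());
            return false;
        }
        System.setProperty("hadoop.home.dir", hadoopHome);
        return true;
    }

    public static void main(String[] args) {
        // Assumed default location; pass another directory as the first argument.
        String home = args.length > 0 ? args[0] : "D:/hadoop";
        System.out.println(configure(home)
                ? "hadoop.home.dir set to " + home
                : "hadoop.home.dir not set");
    }
}
```

Since the stack trace shows the lookup happening in the static initializer of org.apache.hadoop.util.Shell, a call such as HadoopHomeCheck.configure("D:/hadoop") would need to run before any Hadoop class is touched, for example as the first statement of main.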
  • Exception 2: once the problem above is resolved, the following error appears:
2015-03-27 18:07:38,072 INFO  [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(441)) - Cleaning up the staging area file:/tmp/hadoop-Administrator/mapred/staging/Administrator1538933894/.staging/job_local1538933894_0001
Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:570)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:977)
    at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:173)
    at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:160)
    at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:94)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:285)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:344)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:150)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:131)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:115)
    at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:131)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
    at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at test.WordCount.main(WordCount.java:92)

Fix: copy hadoop.dll from the bin directory downloaded above into C:\Windows\System32. This resolves the problem.
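
An UnsatisfiedLinkError on a native method usually means the JVM never loaded the native library that implements it. The JVM resolves System.loadLibrary calls against the java.library.path system property, which on Windows typically includes C:\Windows\System32; that would explain why copying hadoop.dll there helps. The sketch below lists where hadoop.dll is visible to the JVM. NativeLibCheck is a hypothetical helper, not a Hadoop class.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: finds copies of a file on a search path.
final class NativeLibCheck {

    // Returns every existing file named fileName in the directories of searchPath,
    // where searchPath uses the platform path separator (";" on Windows).
    static List<File> findOnPath(String searchPath, String fileName) {
        List<File> hits = new ArrayList<File>();
        if (searchPath == null) {
            return hits;
        }
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<File> hits = findOnPath(System.getProperty("java.library.path"), "hadoop.dll");
        if (hits.isEmpty()) {
            System.out.println("hadoop.dll is not on java.library.path");
        }
        for (File hit : hits) {
            System.out.println("found " + hit.getAbsolutePath());
        }
    }
}
```

If more than one copy is reported, copies built for different Hadoop versions may be worth ruling out as well, since a mismatched DLL could plausibly produce the same error.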

  • The eclipse_hadoop2.6 plugin can be downloaded from the attachment to this post. It works with Eclipse Luna (4.4) and Hadoop 2.5 and 2.6.
