Today I did a fresh install of Ubuntu 17.10, and it feels fantastic. For how to set up the dual-boot, see my earlier article; it covers the details. So I'm setting up hadoop from scratch again, hoping it works this time. The desktop feels much closer to macOS now, and the GUI is a lot friendlier. That's the trend anyway: with memory and compute where they are today, Linux would never spread widely if the user experience stayed as rough as it used to be. So now it looks better and works better.
1. Setting up a user
Enough preamble; straight to business. First, let's create a dedicated hadoop user (and group) for hadoop to run under:
ubuntu@ubuntu:~$ sudo addgroup hadoop
ubuntu@ubuntu:~$ sudo adduser --ingroup hadoop hadoop
Then grant it full sudo rights. Of course, don't do this in real production.
ubuntu@ubuntu:~$ sudo vim /etc/sudoers
The change is the hadoop line, added right below root's entry; that one line is the whole edit. A word of advice: use nano. vim, for whatever reason, kept opening the file read-only (the file is mode 0440; the proper route is sudo visudo, which handles that for you).
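Since the screenshot didn't survive, here's a sketch of the edit; the hadoop line simply mirrors root's, and the exact spacing is illustrative:
# In /etc/sudoers, below the root entry:
root    ALL=(ALL:ALL) ALL
hadoop  ALL=(ALL:ALL) ALL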
2. Configuring the Java environment
Next up is the Java environment, and this one is dead simple! Download the JDK from the official site, then unpack it into a folder of your choosing; that folder becomes your JAVA_HOME. Here's mine:
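A sketch of the unpack step, assuming the stock Oracle tarball name for JDK 9.0.4 (adjust to whatever file you actually downloaded):
hustwolf@ubuntu:~/Downloads$ tar -xvzf jdk-9.0.4_linux-x64_bin.tar.gz
# this yields ~/Downloads/jdk-9.0.4/, the JAVA_HOME used below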
Then edit /etc/profile and add:
export JAVA_HOME=/home/hustwolf/Downloads/jdk-9.0.4/
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Oddly, I still couldn't run java as the hadoop user, so I edited /home/hadoop/.profile, added those same four lines, ran source /etc/profile, and java worked. The first sanity check after any install is java -version.
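What a successful check looks like; the build strings below are illustrative for JDK 9.0.4, not copied from my terminal:
hadoop@ubuntu:~$ java -version
java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)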
3. Configuring passwordless SSH login
Next, configure ssh:
hadoop@hustwolf-Inspiron-5447:~$ ssh-keygen -t rsa -P ""
# if it prompts you for anything, just hit Enter
hadoop@hustwolf-Inspiron-5447:~$ cd /home/hadoop/.ssh/
hadoop@hustwolf-Inspiron-5447:~/.ssh$ cat id_rsa.pub >>authorized_keys
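Two things that commonly trip this up, in case ssh localhost below gets connection refused (a sketch, assuming stock desktop Ubuntu 17.10, which doesn't ship an SSH server):
hadoop@hustwolf-Inspiron-5447:~$ sudo apt install openssh-server
# sshd can also silently ignore a key file with loose permissions:
hadoop@hustwolf-Inspiron-5447:~/.ssh$ chmod 600 authorized_keys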
Then test whether it worked:
hadoop@hustwolf-Inspiron-5447:~/.ssh$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:ryMzL3S70JO+KrbTDABZWONf/dCp4g.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 17.10 (GNU/Linux 4.13.0-25-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
0 packages can be updated.
0 updates are security updates.
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
hadoop@hustwolf-Inspiron-5447:~$ exit
logout
Connection to localhost closed.
If your session looks like the above, you're good.
4. Setting up hadoop
Head over to my earlier hadoop article to find the latest hadoop package, download it, then unpack:
tar -xvzf hadoop.tar.gz
That leaves you with a new folder; move it into the hadoop user's home directory, so you end up with /home/hadoop/hadoop.
(Sometimes the hadoop folder ends up unreadable or unwritable for your user; the blunt fix is to open it to everyone: sudo chmod 777 hadoop.)
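A tidier sketch of the move, assuming the tarball unpacked to a folder named hadoop-2.9.0 (chown avoids needing 777 at all):
sudo mv hadoop-2.9.0 /home/hadoop/hadoop
sudo chown -R hadoop:hadoop /home/hadoop/hadoop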
Also, it's worth double-checking which build of hadoop you've got.
The tail end of the output says 64-bit LSB, so mine is a 64-bit build, matching my machine.
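The screenshot is gone, but the usual way to check is to run file against one of the bundled native libraries; a sketch (the trailing details will vary):
hadoop@hustwolf-Inspiron-5447:~/hadoop$ file lib/native/libhadoop.so.1.0.0
lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared object, x86-64, ...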
5. Hadoop configuration files
Next comes the file configuration. Edit the hadoop-env.sh file (it lives under the hadoop/etc/hadoop path).
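The original screenshot is missing; the standard edit here is to point hadoop-env.sh's JAVA_HOME at your JDK explicitly rather than leaving the default ${JAVA_HOME} reference (path taken from the profile above):
# in ~/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/hustwolf/Downloads/jdk-9.0.4/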
Save the configuration and make it take effect:
hadoop@hustwolf-Inspiron-5447:~/hadoop$ source etc/hadoop/hadoop-env.sh
Then go back to /etc/profile, add HADOOP_INSTALL, and extend PATH. The result is below (nano's display cut the original PATH line off at the right edge; its tail should read $HADOOP_INSTALL/sbin):
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo nano /etc/profile
export JAVA_HOME=/home/hustwolf/Downloads/jdk-9.0.4/
export JRE_HOME=$JAVA_HOME/jre
export HADOOP_INSTALL=/home/hadoop/hadoop
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
export GST_ID3_TAG_ENCODING=GBK:UTF-8:GB18030
export GST_ID3V2_TAG_ENCODING=GBK:UTF-8:GB18030
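Reload the profile so the new PATH is live in the current shell (same trick as with Java earlier):
hadoop@hustwolf-Inspiron-5447:~/hadoop$ source /etc/profile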
At this point the basic hadoop environment is in place. Look at this!!
hadoop@hustwolf-Inspiron-5447:~/hadoop$ hadoop version
Hadoop 2.9.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
Compiled by arsuresh on 2017-11-13T23:15Z
Compiled with protoc 2.5.0
From source with checksum 0a76a9a32a5257331741f8d5932f183
This command was run using /home/hadoop/hadoop/share/hadoop/common/hadoop-common-2.9.0.jar
hadoop@hustwolf-Inspiron-5447:~/hadoop$
Of course, none of that is the main event. That's next!!
6. The main event
Now run the bundled wordcount example to get a feel for the MapReduce flow.
Create an input folder under the hadoop directory, then an output one too:
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo mkdir input
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo mkdir output
Then copy something in, anything. Let me try:
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo cp README.txt input/
Run the wordcount program and save the result to output (mind the input path and the jar path).
Quick spoiler from someone who has been through this: go back and delete that output folder first, because it isn't needed!!!! Watch what happens otherwise:
hadoop@hustwolf-Inspiron-5447:~/hadoop$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount input/README.txt output
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/home/hadoop/hadoop/share/hadoop/common/lib/hadoop-auth-2.9.0.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2018-01-19 22:44:02,302 INFO [main] Configuration.deprecation (Configuration.java:logDeprecation(1297)) - session.id is deprecated. Instead, use dfs.metrics.session-id
2018-01-19 22:44:02,306 INFO [main] jvm.JvmMetrics (JvmMetrics.java:init(79)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory file:/home/hadoop/hadoop/output already exists
I can barely read most of that, but the last line is plain enough: the output directory already exists. So: delete it, decisively.
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo rmdir output
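One caveat: rmdir only removes empty directories, which is fine here since nothing was ever written to output. If a partial run has left files inside, reach for the recursive form instead (a sketch):
hadoop@hustwolf-Inspiron-5447:~/hadoop$ sudo rm -r output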
Then run this:
hadoop@hustwolf-Inspiron-5447:~/hadoop$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount input/README.txt output
The result follows!!! There's output!!! That's right, you're not seeing things: standalone mode!! Success!!
~~~ (character limit; I'm not pasting the full job log!)
If you couldn't be bothered with all of that, just look at the result. Note my working directory and the command:
hadoop@hustwolf-Inspiron-5447:~/hadoop/output$ cat part-r-00000
(BIS), 1
(ECCN) 1
(TSU) 1
(see 1
5D002.C.1, 1
~~~~~~
What a beautiful thing. Standalone mode: success!!!! On the first try!!! So was my earlier failure really just the cloud server being too weak?
I'm out, I'm out!!! Happy.
Postscript: I'll admit I was following the book here. But no matter; let me point you to the tutorial I referenced:
They wrote even more than I did, but!! I have my advantages!! Mine is newer!! And I can answer questions: if anything's unclear, leave a comment or message me. I'm online all the time!