Working with Hadoop in PHP : Using Hadoop And PHP
2017.08.03 22:59
Using Hadoop And PHP
Getting Started
First things first. If you haven’t used Hadoop before, you’ll need to download a Hadoop release and make sure Java and PHP are installed. To download Hadoop, head over to:
http://hadoop.apache.org/common/releases.html
Click on download a release and choose a mirror. I suggest choosing the most recent stable release. Once you’ve downloaded Hadoop, extract the archive.
user@computer:$ tar xpf hadoop-0.20.2.tar.gz
I like to create a symlink to the hadoop-<release> directory to make things easier to manage.
user@computer:$ ln -s hadoop-0.20.2 hadoop
Now you should have everything you need to start creating a Hadoop PHP job.
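Before moving on, it can help to confirm that Java and PHP are actually on your PATH. The snippet below is only a minimal sketch: `check_tool` is a hypothetical helper name, not something that ships with Hadoop, and it reports presence only, not versions.

```shell
# check_tool is a hypothetical helper: it reports whether a command
# can be found on the current PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The two prerequisites named above.
check_tool java
check_tool php
```

If either line reports "missing", install that tool before continuing.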
Creating The Job
For this example I’m going to create a simple Map/Reduce job for Hadoop. Let’s start by understanding what we want to happen.
- We want to read from an input system – this is our mapper
- We want to do something with what we’ve mapped – this is our reducer
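The two roles above can be sketched as a plain shell pipeline. This is an illustration under assumptions, not the job built in this article: the `mapper` and `reducer` functions are hypothetical stand-ins for the PHP scripts, the word-count logic is just a familiar example, and the `sort` in the middle stands in for the sorting Hadoop performs between the map and reduce stages.

```shell
# Hypothetical stand-in for a mapper: reads lines from STDIN and emits
# one "word<TAB>1" record per word on STDOUT.
mapper() {
    tr -s ' ' '\n' | while IFS= read -r word; do
        if [ -n "$word" ]; then
            printf '%s\t1\n' "$word"
        fi
    done
}

# Hypothetical stand-in for a reducer: reads "key<TAB>count" records
# from STDIN and prints the total per key, sorted by key.
reducer() {
    awk -F '\t' '{ count[$1] += $2 } END { for (k in count) print k "\t" count[k] }' | sort
}

# map -> sort -> reduce, all connected through STDIN/STDOUT.
printf 'hello world\nhello hadoop\n' | mapper | sort | reducer
```

Each stage only reads lines and writes lines, which is why the mapper and reducer can be written in any language that can do the same, PHP included.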
At the root of your development directory, let’s create another directory called script. This is where we’ll store our PHP mapper and reducer files.
user@computer:$ ls
.
..
hadoop-0.20.2
hadoop-0.20.2.tar.gz
hadoop
user@computer:$ mkdir script
Now let’s begin creating our mapper script in PHP. Go ahead and create a PHP file called mapper.php under the script directory.
user@computer:$ touch script/mapper.php
Now let’s look at the basic structure of a PHP mapper.
#!/usr/bin/php
<?php
//this can be anything from reading input from files, to retrieving database content, soap calls, etc.
//for this example I'm going to create a simple php associative array.
$a = array(
    'first_name' => 'Hello',
    'last_name' => 'World'
);
//it's important to note that anything you send to STDOUT will be written to the output specified by the mapper.
//it's also important to note, do not forget to end all output to STDOUT with a PHP_EOL, this will save you a lot of pain.
echo serialize($a), PHP_EOL;
?>
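The `#!/usr/bin/php` line only takes effect when the file is run directly as a program, which requires the executable bit. Here is a hedged sketch of setting that bit and smoke-testing the mapper on its own. It works in a scratch directory created with `mktemp` (an assumption made so the example does not touch your real script directory) and skips the run if PHP is not installed.

```shell
# Work in a scratch directory so nothing in the real project is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/script"

# Write a copy of the mapper shown above (comments omitted).
cat > "$workdir/script/mapper.php" <<'EOF'
#!/usr/bin/php
<?php
$a = array(
    'first_name' => 'Hello',
    'last_name' => 'World'
);
echo serialize($a), PHP_EOL;
EOF

# Mark the script executable so the shebang line can be honoured.
chmod +x "$workdir/script/mapper.php"

# Run it through the PHP CLI if PHP is available on this machine.
if command -v php >/dev/null 2>&1; then
    php "$workdir/script/mapper.php"
else
    echo "php not found; skipping the run"
fi
```

If PHP is installed, the run should print the serialized array on a single line: `a:2:{s:10:"first_name";s:5:"Hello";s:9:"last_name";s:5:"World";}`.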
This example is extremely simple: it creates an associative array, serializes it, and writes it to STDOUT. Now on to the reducer. Create a PHP file in the script directory called reducer.php.
user@computer:$ touch script/reducer.php
Now let’s take a look at the layout of a reducer.
#!/usr/bin/php
<?php
//Remember when I said anything put out through STDOUT in our mapper would go to the reducer.
//Well, now we read from the STDIN to get the result of our mapper.
//iterate all lines of output from our mapper
while (($line = fgets(STDIN)) !== false) {
    //remove leading and trailing whitespace, just in case
    //assumption: the listing is cut off at this point; the trim() call and the closing brace are filled in so the loop is valid PHP
    $line = trim($line);
}