How to run a MapReduce job from Eclipse


Creating a MapReduce job in Eclipse and then running it from the command line can be cumbersome when the code has errors. In Hadoop, you also have to open the log files to see the error messages. Instead, we can debug the code directly in Eclipse before creating a jar file and moving it to Hadoop.
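To see why debugging in Eclipse first is useful, here is a minimal, plain-Java sketch of the map/reduce logic of a word-count job. It has no Hadoop dependencies, so you can set breakpoints and step through it in Eclipse before packaging the real job into a jar (the class and method names here are illustrative, not Hadoop APIs):

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountSketch {

    // "Map" phase: split each line into words and emit (word, 1) pairs,
    // with the "reduce" phase folded in by summing counts per word.
    static Map<String, Integer> mapAndReduce(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                mapAndReduce(new String[] { "hello hadoop", "hello eclipse" });
        System.out.println(counts.get("hello")); // prints 2
    }
}
```

Once this kind of logic behaves correctly under the Eclipse debugger, moving it into real Mapper and Reducer classes is mostly mechanical.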

Below are the steps to run a MapReduce job from Eclipse:

Step 1: Create an input folder in your Eclipse project structure and place your input file in that folder.


Here users.csv is the input file.
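The exact columns of users.csv are not shown in the post, but a mapper over CSV input typically splits each line on commas and keys on one of the fields. A minimal, plain-Java sketch of that per-line parsing, assuming a hypothetical id,name,country layout with no quoted fields:

```java
public class CsvLineParser {

    // Split one CSV line into fields; assumes a simple comma-separated
    // layout with no quoted or escaped fields (hypothetical users.csv format).
    static String[] parse(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String[] fields = parse("1,alice,US");
        System.out.println(fields[1]); // prints alice
    }
}
```

In a real mapper this parsing would run once per input line, which is why it is worth stepping through it in Eclipse on a few sample lines first.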

Step 2: We need to add some jar files to be able to run the program from Eclipse. They can be found in your Hadoop directory, which in my case was ${HADOOP_HOME}/share/hadoop/common (if you cannot find them there, you can download the files from the internet). Add the jars to Eclipse by right-clicking on JRE System Library –> Build Path –> Configure Build Path, then adding them as external jars:


Step 3: Configure the arguments for your project in Eclipse to run the job, as below:

Right click on the .java file –> Run As –> Run Configurations, and set the program arguments there.
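The two values set under Program arguments in Run Configurations arrive in the driver's main method as args[0] and args[1]; a typical Hadoop driver passes them on as the input and output paths. A minimal sketch of that argument handling, in plain Java and without the Hadoop Job setup calls:

```java
public class JobArgs {

    // Return the (input, output) pair a driver would read from the
    // "Program arguments" box in the Eclipse Run Configurations dialog.
    static String[] resolvePaths(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException("usage: <input dir> <output dir>");
        }
        return new String[] { args[0], args[1] }; // input path, output path
    }

    public static void main(String[] args) {
        String[] paths = resolvePaths(new String[] { "input", "output" });
        System.out.println(paths[0] + " -> " + paths[1]); // prints input -> output
    }
}
```

In the real driver these two strings would be handed to the job's input and output path settings, which is why the folder created in Step 1 must match args[0].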

Step 4: Run the program; the output directory will be created inside the Eclipse project itself.

That’s how we run the job from Eclipse.

Kindly let me know how you liked the post, good or bad, so that I can improve where needed or get motivated to create more pages like this.