Nutch and Lucene in Eclipse or Netbeans

This entry will help you program against the Nutch API in NetBeans (I think it will also work with Eclipse).

First of all, you should download and install Nutch; there are plenty of tutorials for that. Before going on to the next step, you should have something like this:

[Image: Searching with Nutch]

Now you want to create your own class in NetBeans. Create a new project in NetBeans and copy this:

[sourcecode language="java"]
package ull;

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.nutch.searcher.*;
import org.apache.nutch.util.NutchConfiguration;

public class Buscador {

    public static void main(String[] args) {
        Configuration conf = NutchConfiguration.create();
        // Crawl directory to search
        Path searchdir = new Path("/home/ivan/Documentos/proyecto/nutch1/crawl");
        try {
            // Tell Nutch where its plugins live
            conf.set("plugin.folders", "/home/ivan/Descargas/nutch-0.9/build/plugins");
            NutchBean bean = new NutchBean(conf, searchdir);
            Query query = Query.parse("enTodos", conf);
            // Ask for at most 10 hits
            Hits hits = bean.search(query, 10);
            System.out.println("Total hits: " + hits.getTotal());
            int length = (int) Math.min(hits.getTotal(), 10);
            Hit[] show = hits.getHits(0, length);
            HitDetails[] details = bean.getDetails(show);
            Summary[] summaries = bean.getSummary(details, query);

            // Loop over the hits actually fetched, so the arrays are never overrun
            for (int i = 0; i < length; i++) {
                System.out.println(" " + i + " " + details[i] + "\n" + summaries[i]);
            }
        } catch (IOException ex) {
            Logger.getLogger(Buscador.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
[/sourcecode]

Now you have to add Nutch.jar to the project, and then all the jars under the lib folder. To do that, right-click on Library and choose "Add external jar/folder".

The line conf.set("plugin.folders", "/home/ivan/Descargas/nutch-0.9/build/plugins"); tells Nutch which folder contains the plugins. I know you are supposed to set this in nutch-site.xml, but that didn't work for me. Setting it in code avoids the following errors:

java.lang.RuntimeException: org.apache.nutch.searcher.QueryFilter not found

and

java.lang.IllegalArgumentException: plugin.folders is not defined
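Since both errors go away once plugin.folders points at the right place, it can help to check the path before handing it to Nutch. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes each plugin sits in its own sub-folder with a plugin.xml descriptor, as in the build/plugins folder used above. It uses only the standard library, so it runs without the Nutch jars.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class PluginFolderCheck {

    /**
     * Hypothetical helper: true if dir is a directory holding at least one
     * sub-folder with a plugin.xml file (assumed plugin layout).
     */
    public static boolean looksLikePluginFolder(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (Stream<Path> children = Files.list(dir)) {
            return children.anyMatch(
                    child -> Files.isRegularFile(child.resolve("plugin.xml")));
        }
    }

    public static void main(String[] args) throws IOException {
        // Default to the path used in this post; pass another one as an argument
        Path dir = Paths.get(args.length > 0
                ? args[0]
                : "/home/ivan/Descargas/nutch-0.9/build/plugins");
        if (looksLikePluginFolder(dir)) {
            System.out.println("Plugins found, use this for plugin.folders: " + dir);
        } else {
            System.err.println("No plugins found under " + dir);
        }
    }
}
```

If the check fails, the value given to conf.set("plugin.folders", ...) is probably the first thing to look at.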

That's all!

If you want to debug the whole Nutch project, you can open it in NetBeans by installing the free-form project plugin.


18 comments on “Nutch and Lucene in Eclipse or Netbeans”

  1. Add “/home/ivan/Descargas/nutch-0.9/” to the classpath in the example above, and Nutch will read the conf/nutch-default.xml file, where this property is defined.

  2. I agree with you.
    However, I was creating a web service that executes Nutch using Runtime.exec(“/…/nutch-0.9/bin/nutch”). exec doesn’t include the classpath, which is why Nutch couldn’t find the nutch-default.xml file.

  3. I leave a comment each time I appreciate a article on a site or I have
    something to add to the conversation. It is caused by
    the fire communicated in the article I browsed.
    And on this article Nutch and Lucene in Eclipse or Netbeans Ivans blog.
    I was actually excited enough to post a comment 🙂 I do have
    a few questions for you if you tend not to mind. Is it
    just me or do some of these responses look as if they are left by brain dead folks?
    😛 And, if you are posting at additional online social sites, I would like to keep up with you.
    Would you make a list the complete urls of your community pages like your Facebook
    page, twitter feed, or linkedin profile?

  4. The other day, while I was at work, my cousin stole my apple ipad and tested to see if it
    can survive a forty foot drop, just so she can be a youtube sensation.
    My apple ipad is now broken and she has 83 views. I know this is totally off topic but I had to share it with someone!

  5. Sorry for the delay! I have been finishing some projects and ran out of free time.

    I stopped researching with this kind of tool a long time ago… Right now, I am developing tools for general-purpose GPU computing.

    My Facebook is just for personal stuff, so if you want to keep in touch with me: http://uk.linkedin.com/in/ivan85

    Thank you for your comment “good skin” 😉


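The first two comments both come down to whether the Nutch configuration files are visible on the classpath of the process that runs the search. A minimal sketch for checking that from your own code is shown here. It assumes the files are looked up by name through the class loader, and the two resource names are taken from the comments and the post; the class and method names are hypothetical.

```java
public class ClasspathCheck {

    /** True if the named resource can be found through the given class loader. */
    public static boolean isVisible(ClassLoader loader, String resourceName) {
        return loader.getResource(resourceName) != null;
    }

    public static void main(String[] args) {
        // Use the loader that loaded this class, i.e. the application classpath
        ClassLoader loader = ClasspathCheck.class.getClassLoader();
        String[] names = {"nutch-default.xml", "nutch-site.xml"};
        for (String name : names) {
            System.out.println(name + " visible on classpath: "
                    + isVisible(loader, name));
        }
    }
}
```

Running this inside the same project (or the same web service) as the search code shows quickly whether a classpath entry is missing, which is the situation described in comment 2.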