search engines, irc bots and python

It sounds like an intriguing combination, doesn’t it?

A while ago pwmn’s intranet web search service went live [the announcement was made here]. The application providing the service is yacy [1], which might be a bit immature, but was chosen for its future scalability (a wifi link with awmn [2] is about to come up, and new nodes between the Peloponnese and central Greece are emerging [3][4][5][6][7]). So a distributed application seemed like a great idea.

So far the whole service is based on the stock yacy distribution, with some configuration files modified (defaults/yacy.init, defaults/yacy.network.group) and some more added (defaults/yacy.network.pwmn.unit). The whole idea is to run our own counterpart of yacy’s freeworld network (here named PWMN) over the wireless links, following the principle of locality [8].

The service got a good response and some people started using it in various ways. It was time to bring it closer to the masses by making it accessible through our number one instant messaging protocol, which is none other than irc [9]. The task was to provide itmy’s [10] python irc bot [11][12] with some “API” for communicating with the yacy search engine. Since the bot was written in python, the easiest way to bind the two was to write the “glue” between the bot and the search engine in python as well. Here I have to say that even though I’m a newbie python programmer, I keep using it over other languages that I like more. I guess the main reason is that its learning curve is GoDLiKe!
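To give an idea of what such “glue” amounts to, here is a minimal sketch, not the actual bot code. It assumes the bot hands every channel message to a handler and sends back whatever lines the handler returns. The `!search` trigger, the `handle_message` name and the `(title, url)` result format are all made up for illustration.

```python
def handle_message(message, search, max_results=3):
    """Return the reply lines for one channel message.

    `search` is any callable that takes the query string and returns a
    list of (title, url) tuples; the real bot would plug its yacy
    lookup in here. Everything in this function is hypothetical.
    """
    trigger = "!search"  # hypothetical trigger word
    parts = message.strip().split(None, 1)

    # Not a search request: stay silent.
    if not parts or parts[0] != trigger:
        return []

    # Trigger without any search terms: answer with a usage hint.
    if len(parts) < 2:
        return ["usage: %s <terms>" % trigger]

    query = parts[1]
    results = search(query)[:max_results]
    if not results:
        return ["no results for '%s'" % query]

    # One short line per hit, so the channel does not get flooded.
    return ["%d. %s - %s" % (i, title, url)
            for i, (title, url) in enumerate(results, 1)]
```

Passing the search function in as an argument keeps the irc side and the yacy side apart, so either one can be rewritten without touching the other.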

The following code is quite dumb. Since yacy 0.77stable the developers of yacy provide XML output, whereas the code currently parses the HTML output of the www server. So much of the following code needs rewriting [13]: to drop mechanize in favour of urllib2, and to tidy up the tag parsing by working on the XML file instead. Until then I consider the current version to be totally UGLY. Download the source code from here and enjoy.
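For the curious, the XML-based rewrite mentioned above could look roughly like the sketch below. This is not the downloadable code. It assumes the search output is an RSS-like document with `<item>` elements carrying `<title>` and `<link>` children, and that the query goes in a `query` URL parameter; both are assumptions to check against your yacy version. It also uses `urllib.request`, the Python 3 successor of urllib2. The `base_url` argument is a placeholder for wherever your peer serves its XML search results.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def parse_results(xml_text):
    """Turn an RSS-like result document into a list of (title, url) tuples.

    The <item>/<title>/<link> layout is an assumption, not a documented
    yacy format.
    """
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # skip entries without a usable address
            results.append((title, link))
    return results


def search(query, base_url):
    """Fetch and parse the results for `query`.

    `base_url` is the address of the XML search page on your peer;
    the `query` parameter name is an assumption.
    """
    url = base_url + "?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_results(response.read())
```

Keeping the parsing apart from the fetching means the parser can be tried on a saved result file without a running peer.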
