Posted by brant : 2009-03-30 at 6:44 pm

Wikipedia is a huge site, so one might assume it offers its articles as easy-to-use RSS feeds.  Well, it doesn't, and not only that, but getting at the content is overly complex.  All I want is to grab the "Summary" of Wiki entries.  I've spent hours on this one, and here are my findings.

Let's start with the Wikipedia API.  I thought it would be simple: something like http://en.wikipedia.org/w/api.php?query=Microsoft&format=xml would be the extent of my troubles.  Boy, was I wrong.  I'll give you a second to load their API and read some of the examples.  Seem overwhelming?  Did you really think this would be easy?  How silly.  When you finally do get entries back, they are in Wikipedia formatting (wiki markup, not HTML).  DOH!  Now we need another step.
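For anyone who wants to poke at it, here is a minimal sketch of what such a request might look like. It assumes the API's `action=query` with `prop=revisions` and `rvprop=content`, and it assumes the older JSON response shape, where the wiki markup sits under a `"*"` key. The function names are made up for illustration, and nothing here touches the network; it only builds the URL and picks the markup out of a response you have already downloaded.

```python
import json
from urllib.parse import urlencode

# Endpoint as written in this post.
API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_wikitext_query(title, fmt="json"):
    """Build a query URL asking for the latest revision's content of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": fmt,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_wikitext(response_text):
    """Return the raw wiki markup from a JSON response, or None if absent.

    Assumes the legacy response shape:
    {"query": {"pages": {"<id>": {"revisions": [{"*": "..."}]}}}}
    """
    data = json.loads(response_text)
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            return revisions[0].get("*")
    return None
```

Note that what comes back is still wiki markup, not readable HTML, so this only gets you as far as the "DOH!" moment.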

I searched the web for a site that lets you query an article and parses the formatting into readable HTML.  I did find just the site I needed to query the data, format it, and offer it in an easy-to-use RSS feed.  Turns out that site no longer exists.  Then I came across MediaWiki, and I could not make sense of how it would help me.  So I chose a different route...

Let's look for ways to parse the data from Wikipedia.  There are many options, so many in fact that you want to rip your own eyeballs out, because none of them is easy and straightforward with adequate documentation.  PHP users may come across Text_Wiki, which is available through PEAR.  Good luck using that with no documentation.
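To give a feel for what "parsing" means here, below is a rough sketch that strips only the two simplest kinds of wiki markup: bold/italic quote runs and `[[target|label]]` links. It is not any library's method, the function name is hypothetical, and it deliberately ignores templates (`{{...}}`), tables, references, and nested links, which is exactly where the real pain is.

```python
import re


def strip_simple_wiki_markup(wikitext):
    """Remove bold/italic quotes and unwrap simple [[links]].

    Handles only flat links: [[target]] becomes "target" and
    [[target|label]] becomes "label". Everything else is left untouched.
    """
    # Unwrap links, keeping the label if there is one, else the target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", wikitext)
    # Drop runs of 2 to 5 apostrophes used for italic and bold.
    text = re.sub(r"'{2,5}", "", text)
    return text


sample = "'''Microsoft''' is a [[Technology company|company]] based in [[Redmond]]."
print(strip_simple_wiki_markup(sample))
# prints: Microsoft is a company based in Redmond.
```

A real article's opening paragraph usually starts with an infobox template and is full of references, so a couple of regular expressions like these will not get you a clean summary on their own.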

I was really hoping to end this post with the method I used to grab Wikipedia entries.  Unfortunately, I cannot help you.  However, if one day I am inclined to give it another shot and do find an answer, I will be sure to make a blog post about it.

Until then, I ask you to chime in and post information on how to grab Wikipedia content in a readable format as an RSS feed.

Am I the idiot or what?




OLD Comments (1)


another idiot - 2009-07-01 at 06:24:57
yes u r an idiot.wasted my 5mins