ramblings on PHP, SQL, the web, politics, ultimate frisbee and what else is on in my life

Managing infolation in search results

While Yahoo and Google show off the size of their search indexes, I am waiting for some real innovation in the way search results are presented. My number one gripe with search engines is that I mostly get redundant content, along with pages that do not cover my search term themselves but merely contain a link whose text matches it.

When I search for something work related, I end up with 1000 sites that mirror the PHP manual or some blog post. That's useless. Worse, there is no way to effectively filter out such crap in the current search engine interfaces (at least none that I am aware of). What I would love to see is search engines working harder at identifying the original source of content. Then, instead of giving me the same content on tons of different sites, with the only variation being the URL and the fluff (and ads) around the content, they could give me the original source (and optionally a list of sites that have the same content).
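To make the idea a bit more concrete, here is a rough Python sketch of how results could be grouped by content and collapsed to one "original" per group. It is only an illustration, not how any real search engine works: the `Result` class, the `first_seen` field (pretending the crawler knows when it first saw a page), the 3-word shingles and the 0.8 threshold are all assumptions of mine, and it presumes the fluff around the content has already been stripped.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    text: str        # main content, assumed to be already stripped of fluff
    first_seen: int  # hypothetical: day number on which the crawler first saw the page


def shingles(text, k=3):
    """Return the set of k-word sequences ("shingles") in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_duplicates(results, threshold=0.8):
    """Group results with near-identical text; the oldest page leads each group."""
    groups = []  # pairs of (shingles of the first member, list of members)
    for r in results:
        s = shingles(r.text)
        for rep_shingles, members in groups:
            if jaccard(s, rep_shingles) >= threshold:
                members.append(r)
                break
        else:
            groups.append((s, [r]))
    # Guess that the page seen first is the original source.
    return [sorted(members, key=lambda m: m.first_seen) for _, members in groups]
```

The first entry of each group would be shown as the hit, and the rest could hide behind a "same content on N other sites" link. Comparing every result against every group is fine for one page of hits, but it would obviously not scale to a whole index.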

Speaking of fluff around the actual content, it would be nice if search engines could identify which parts of a site are just navigation and ads and which part contains the actual content. Frequently I end up with a hit on a page whose navigation merely links to an age-old page. I presume the age-old page itself is not presented because, well, it's old. But the newer page contains a link with the term, so it gets ranked up for that. Otherwise I cannot explain why I get a page with a link before the page that has the content and carries the link text in its title.
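As a toy illustration of what I mean, the following Python sketch scores a page by where the search term shows up: in the title, in the regular text, or only inside link text. The `TextSplitter` class, the `score` function and the 3/2/1/0 weights are made up for this example, and real navigation detection would need far more than looking at `<a>` tags.

```python
from html.parser import HTMLParser


class TextSplitter(HTMLParser):
    """Sort a page's text into title text, link text and everything else."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open <title> / <a> tags
        self.title, self.links, self.body = [], [], []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "a"):
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if not self.stack:
            self.body.append(data)
        elif self.stack[-1] == "title":
            self.title.append(data)
        else:
            self.links.append(data)


def score(html, term):
    """Arbitrary weights: title hit 3, text hit 2, link-only hit 1, no hit 0."""
    parser = TextSplitter()
    parser.feed(html)
    parser.close()
    term = term.lower()

    def has(parts):
        return term in " ".join(parts).lower()

    if has(parser.title):
        return 3
    if has(parser.body):
        return 2
    if has(parser.links):
        return 1
    return 0
```

With weights like these, the page that carries the term in its title would sort ahead of a page that only links to it, which is the order I would have expected in the first place.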

Another annoying thing is sites like Experts Exchange that just tease you with a question someone posed, while the answer requires a paid account. Of course these sites SEO like crazy, hoping that someone desperate enough will pay for their yearly subscription. But for the most part people are not interested in questions, they are interested in answers. I am not really sure how to deal with that. Maybe there should be a personal blacklist one can maintain to kick these sites out of search results, or at least grey them out or something.
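A personal blacklist like that would not need much. Here is a minimal Python sketch, assuming the result list is just a list of URLs and the blacklist is a set of domain names; the function names and the "hide" versus "grey" modes are my own invention, not a feature of any search engine I know of.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercased host name of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_blocked(url, blacklist):
    """True if the URL's host is a blacklisted domain or one of its subdomains."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in blacklist)


def apply_blacklist(urls, blacklist, mode="hide"):
    """Return (url, greyed) pairs.

    mode="hide" drops blacklisted results entirely; any other mode keeps
    them, but moves them to the end and flags them so they can be greyed out.
    """
    kept, blocked = [], []
    for url in urls:
        (blocked if is_blocked(url, blacklist) else kept).append(url)
    result = [(u, False) for u in kept]
    if mode != "hide":
        result += [(u, True) for u in blocked]
    return result
```

Something like `apply_blacklist(urls, {"paywalled-answers.example"}, mode="grey")` would then keep the teaser sites at the bottom of the list instead of removing them outright.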


Re: Managing infolation in search results

A tip for the next time you're looking for an answer on the Experts Exchange site: just scroll to the bottom of the page (past the ads and the gibberish). All the answers are there in clear text. No, I'm not kidding.
