Semantic search is the immediate future of search: it transforms content into self-describing data that software applications can read automatically, directly or indirectly, making the data smarter so our searches are more productive. This is the promise of Web 3.0…
How important is search today?
Search has become such an important part of the fabric of our daily lives that it is hard to remember that it is only 21 years old this year. Looking at its past, we can see where it’s going.
In his history of the Internet, Professor Richard T. Griffiths of Leiden University in the Netherlands reminds us that the World Wide Web as we know it was launched in 1991, and “by the end of 1992 there were only 50 web-sites in the World and a year later the number was still no more than 150.”
As of June 2011, Netcraft’s monthly server survey found 346,004,403 web sites in the world. From 1 site to over 346 million sites in just 20 years! And WorldWideWebSize.com estimates that the Indexed Web contains at least 16.99 billion pages as of Sunday, 12 June, 2011, all of which contain data that should, in theory, be discoverable.
Unfortunately, with this astonishing proliferation of sites, it has become more and more difficult to find what we are looking for. As the number of web sites and the amount of data they contain increase, the ease of finding the useful information we really need decreases.
What has been the evolution of search?
Griffiths reports, “In 1990, the first Internet search-engine for finding and retrieving computer files, Archie, was developed at McGill University, Montreal.”
The Future of Search says we’ve gone through two generations and have entered the third:
The first generation based the search on what was on the Web page. Important factors included keyword density and the title; meta tags played an important role, as did keywords in the domain name and in the URL. Search engines started looking like yellow pages.
The second generation based the page ranking on related links; but it looks like the days of huge link exchange programs are over.
The third generation is already underway, adding word stemming and a thesaurus on top of the term vector database to assist in keeping a search in context. The 3rd generation search engines will build personal profiles, based on past searching habits and the page vector (the keyword density per page).
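To make the "page vector" idea concrete, here is a minimal sketch in Python of keyword density per page, with word stemming and a thesaurus applied first. The stemmer is a deliberately naive suffix stripper and the thesaurus entries are made up for illustration; neither comes from the article being quoted, and a real engine would use far more robust versions of both.

```python
from collections import Counter

# Hypothetical thesaurus: maps a synonym to one canonical term.
THESAURUS = {"automobile": "car", "auto": "car"}


def stem(word):
    """Naive suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def page_vector(text):
    """Return the keyword density (share of all terms) for each term on a page."""
    terms = []
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()").lower()
        if not word:
            continue
        stemmed = stem(word)
        # Collapse synonyms onto one canonical term after stemming.
        terms.append(THESAURUS.get(stemmed, stemmed))
    counts = Counter(terms)
    total = len(terms)
    return {term: count / total for term, count in counts.items()}


vector = page_vector("Cars, automobiles and car searching")
print(vector)
```

In this sample, "Cars", "automobiles" and "car" all collapse onto the single term "car", which is the point of layering stemming and a thesaurus on top of plain keyword counting.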
When and why did our requirements around search change?
In a SearchEngineLand.com interview, John Battelle On The Future Of Search, search visionary John Battelle explained:
We had a very, very basic, well-understood use case [for user searching] for 10 years, which was Google or “like Google”—you put in a couple keywords and you get a response back. And that framework of searching and coming back with the best document to answer a query is morphing.
People are asking far more complicated questions now and they’re demanding far more nuanced answers, simply because they know they’re out there.
So where is search heading? There are 5 major trends:
1. Search should be organized by context.
Today, I would say that users want search to be organized around the types and context of search results desired: shopping, books, images, video, phone book style directory listings, friends’ and strangers’ opinions via social media, academic research, business research, entertainment, etc. We have started to see this in Google and Bing results, as well as the many specialized search engines like Expedia and Travelocity.
Battelle agrees, saying consumers are moving away from the generic search engine toward focused search applications built around a specific topic, like travel or consumer reviews. “Search as an application where your first search isn’t the search itself but rather the search for the right application is a very, very different use case,” he says.
2. Search should be optimized to the device.
The days of search on the desktop being the sole force are numbered.
Says Gord Hotchkiss in an article in his Just Behave blog on SearchEngineLand.com, Five Visionaries Sum Up The Future Of Search: Part II:
One of the biggest catalysts of change in search has been the adoption of different devices from which we launch our searches… The search we launch from our smartphone can look substantially different than a search launched from a tablet or a desktop… We have different intents, different expectations and different ways of interacting with the device. One size fits all search just doesn’t fit that well any more.
3. Tablets and netbooks are changing the future of computing, and search.
New search players are entering the marketplace to challenge Google, such as owners of key hardware devices and software applications like Apple and Facebook. Says John Battelle:
The iPad coming out has been an inflection point—an ah-ha moment where they realize that there is a new interface to computing coming. It’s very rare that you launch a new device that already has 140,000 applications built for it, and that’s a pretty big deal… So there’s a big search problem there and I think maybe Google would be wise to own that search problem.
I think it’s a phenomenally important piece of real estate, which is why Steve Jobs controls the whole end-to-end experience. Vertical integration is highly profitable. There’s just not going to be any crawling of the iTunes Store from a third-party developer native on iTunes. It’s going to be Steve Jobs who does search for iTunes. He may not do web search, but app search, that’s him. And now I think that there are opportunities to do that better and to do it across platforms, including netbooks and tablets.
4. The semantic web will make the data smarter so the applications don’t have to be.
Another critical trend for the near future of search will be further developing what is being called the semantic web or semantic search.
Nova Spivack, former CEO of the pioneering semantic web company Radar Networks, addressed the 2008 Next Web Conference about his vision of Web 3.0 and the semantic web. He said (paraphrasing):
First there was metadata and tagging; then natural language search to understand what a person really means with their query.
However, traditional keyword search is not keeping up with the amount of data. It’s too easy now to create and publish information, so the volume of data is exploding. We need a smarter way of managing data.
Today, we are evolving toward semantic search, which understands the meaning of data and the connections between data. Finally, true artificial intelligence will come.
Spivack says Web 3.0 is really the current third decade of the Internet, and states it is driving the third generation of search, semantic search. He says Web 3.0 “will be about transforming what is a file server today into more like a database” where semantic search allows you to create metadata that goes into the data and describes the meaning in a common way. The net result of this is to make your data smarter so your software applications don’t have to be.
Spivack describes the sweeping vision for Search 3.0: “The dream of the semantic web is that all human knowledge will be on the web in machine-understandable format and that all software will be able to use all this knowledge.”
One use for this, Spivack says, would be just-in-time data. Because semantic search data is self-describing, a software application could pull in data it has never seen before, without having seen a data schema, and present it to the user, as long as the data is appropriate to its type of application (such as pulling only health care data for a health care application).
By agreeing upon common metadata rules and naming conventions, authors of data could ensure their data will be available in the appropriate context for the appropriate users.
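As a rough illustration of what "self-describing" could mean in practice, here is a minimal Python sketch. Each record carries a type drawn from a shared vocabulary plus human-readable labels for its own fields, so an application can select and display records without a schema written in advance. The "@type" key, the "health:" and "travel:" prefixes, and the sample records are all assumptions made up for this example, not an actual semantic web standard.

```python
# Hypothetical records: each one describes itself with a type from a
# shared vocabulary and a label for every field it contains.
RECORDS = [
    {
        "@type": "health:Medication",
        "fields": [
            {"label": "Drug name", "value": "Aspirin"},
            {"label": "Dose (mg)", "value": 500},
        ],
    },
    {
        "@type": "travel:Flight",
        "fields": [
            {"label": "From", "value": "AMS"},
            {"label": "To", "value": "YUL"},
        ],
    },
]


def select_by_domain(records, domain):
    """Keep only records whose declared type belongs to the given domain."""
    prefix = domain + ":"
    return [r for r in records if r.get("@type", "").startswith(prefix)]


def render(record):
    """Display a record using only the labels the record itself carries."""
    lines = [record["@type"]]
    for field in record["fields"]:
        lines.append(f"{field['label']}: {field['value']}")
    return "\n".join(lines)


# A health care application pulls in only health care data.
for record in select_by_domain(RECORDS, "health"):
    print(render(record))
```

The application code never mentions "Drug name" or "Dose (mg)"; it relies entirely on the agreed conventions for how records describe themselves, which is why common metadata rules and naming conventions matter.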
5. Social search will evolve to a model of curating or cataloging of meta tags.
Even with common metadata rules, unscrupulous data creators could still deliberately mis-tag data so that it appears in the wrong context. Applications will have to be smart enough to separate authorized metadata from unauthorized.
This is why, for the foreseeable future, I see “crowdsourced tag policing” as a necessary component of search: users not only mark the content of a web page as inappropriate, but also note when the tags or categories for the data are themselves inappropriate.
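One way such tag policing might work is sketched below in Python: users flag a tag on a page, and once enough distinct users have flagged it, the tag is no longer trusted. The class name, the threshold value, and the one-vote-per-user rule are all assumptions for illustration; no existing search engine is being described here.

```python
from collections import defaultdict


class TagPolice:
    """Tracks user flags against (page, tag) pairs. Hypothetical design."""

    def __init__(self, threshold=3):
        # Number of distinct users needed before a tag is distrusted.
        self.threshold = threshold
        self._flags = defaultdict(set)  # (url, tag) -> set of user ids

    def flag(self, url, tag, user):
        """Record that a user considers this tag inappropriate for this page."""
        # A set means repeated flags from the same user count only once.
        self._flags[(url, tag)].add(user)

    def trusted_tags(self, url, tags):
        """Return the tags that have not reached the flag threshold."""
        return [
            tag
            for tag in tags
            if len(self._flags.get((url, tag), ())) < self.threshold
        ]


police = TagPolice(threshold=2)
page = "http://example.com/page"
police.flag(page, "health", "alice")
police.flag(page, "health", "bob")
print(police.trusted_tags(page, ["health", "shopping"]))
```

Counting distinct users rather than raw flags is a small defense against one person trying to bury a legitimate tag, though a real system would need much more than this to resist abuse.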
But user tagging won’t work for a lot of data, particularly technical, scientific and academic data.
Does this mean meta tagging will go back to the directory model of the early web, as in DMOZ and the Yahoo directory, where expert editors assigned web sites to the appropriate category?
It looks like expert meta tag curators such as librarians, archivists and data catalogers have a rosy future ahead until artificial intelligence evolves.
READER QUESTION: Where do you see the future of search? Add a comment to share your thoughts!