Abstract:
Humans have a remarkable capability (perception) to perform a wide variety of physical and mental tasks without any measurements or computations. Familiar examples of such tasks are playing golf, assessing wine, recognizing distorted speech, and summarizing a story. The question is whether a special type of information retrieval strategy can be designed that builds in perception. Commercial Web search engines manage information only in a crisp way. Their query languages do not allow the expression of preferences or vagueness. Even though techniques exist for locating exact matches, finding relevant partial matches can be a problem. It may also be difficult to specify query requests precisely and completely, resulting in a situation known as fuzzy querying. This is usually not a problem for small domains, but for large repositories such as the World Wide Web, request specification becomes a bottleneck. Thus, a flexible retrieval algorithm is required, one that allows imprecise or fuzzy query specification or search. In addition, these search engines have the following problems: (1) large answer sets; (2) low precision; (3) inability to preserve the hypertext structures of matching hyperdocuments; and (4) ineffectiveness for general-concept queries. The task is to use user-defined queries to retrieve useful information according to certain measures. To handle these problems, we propose the Perception Index (PI), which contains attributes associated with a focal keyword restricted by the fuzzy term(s) used in fuzzy queries on the Internet. If we integrate the Document Index (DI) used in commercial Web search engines with the proposed PI, we can handle both crisp terms (keyword-based) and fuzzy terms (perception-based). In this respect, the proposed approach is softer than the keyword-based approach. The PI brings querying somewhat closer to natural language. It is a further step toward a truly human-friendly, natural-language-based interface for the Internet.
It should greatly help users retrieve relevant information with relative ease. In other words,