Documents: Recent submissions

  • Greiff Warren R; Ponte Jay M (2000)
    This paper takes a fresh look at modeling approaches to information retrieval that have been the basis of much of the probabilistically motivated IR research over the last 20 years. We shall adopt a subjectivist Bayesian ...
  • Aridor Yariv; Carmel David; Maarek Yoelle S; Soffer Aya (2002)
    Mobile knowledge seekers often need access to information on the Web during a meeting or on the road, while away from their desktop. A common practice today is to use pervasive devices such as Personal Digital Assistants ...
  • Carpineto Claudio; Romano Giovanni; Giannini Vittorio (2002)
    In this article we consider methods for automatic query expansion from top retrieved documents (i.e., retrieval feedback) that make use of various functions for scoring expansion terms within Rocchio's classical reweighting ...
  • Chowdhury Abdur; Frieder Ophir; Grossman David; McCabe Mary Catherine (2002)
    We present a new algorithm for duplicate document detection that uses collection statistics. We compare our approach with the state-of-the-art approach using multiple collections. These collections include a 30 MB 18,577 ...
  • Buyukkokten Orkut; Kaljuvee Oliver; Garcia-Molina Hector; Paepcke Andreas; Winograd Terry (2002)
    We present a design and implementation for displaying and manipulating HTML pages on small handheld devices such as personal digital assistants (PDAs), or cellular phones. We introduce methods for summarizing parts of Web ...
  • Cooper Brian F; Garcia-Molina Hector (2002)
    Data archiving systems rely on replication to preserve information. This paper discusses how a network of autonomous archiving sites can trade data to achieve the most reliable replication. A series of binary trades among ...
  • Zhu Lei; Rao Aibing; Zhang Aidong (2002)
    The success of text-based retrieval motivates us to investigate analogous techniques which can support the querying and browsing of image data. However, images differ significantly from text both syntactically and semantically ...
  • Heinz Steffen; Zobel Justin; Williams Hugh E (2002)
    Many applications depend on efficient management of large sets of distinct strings in memory. For example, during index construction for text databases a record is held for each distinct word in the text, containing the ...
  • Finkelstein Lev; Gabrilovich Evgeniy; Matias Yossi; Rivlin Ehud; Solan Zach; Wolfman Gadi; Ruppin Eytan (2002)
    Keyword-based search engines are in widespread use today as a popular means for Web-based information retrieval. Although such systems seem deceptively simple, a considerable amount of skill is required in order to satisfy ...
  • Lempel Ronny; Soffer Aya (2002)
    We describe PicASHOW, a fully automated WWW image retrieval system that is based on several link-structure analyzing algorithms. Our basic premise is that a page p displays (or links to) an image when the author of p ...
  • Feng Ling; Chang Elizabeth (2002)
    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for describing and interchanging data among various systems and databases on the Internet. It offers the Document Type Definition (DTD) as a ...
  • Bharat Krishna; Mihaila George A (2002)
    In response to a query, a search engine returns a ranked list of documents. If the query is about a popular topic (i.e., it matches many documents), then the returned list is usually too long to view fully. Studies show ...
  • Cannane Adam; Williams Hugh E (2002)
    Compression of large collections can lead to improvements in retrieval times by offsetting the CPU decompression costs with the cost of seeking and retrieving data from disk. We propose a semistatic phrase-based approach ...
  • Wen Ji-Rong; Nie Jian-Yun (2002)
    Query clustering is a process used to discover frequently asked questions or most popular topics on a search engine. This process is crucial for search engines based on question-answering. Because of the short lengths of ...
  • Williams Hugh E; Zobel Justin; Bahle Dirk (2004)
    Search engines need to evaluate queries extremely fast, a challenging task given the quantities of data being indexed. A significant proportion of the queries posed to search engines involve phrases. In this article we ...
  • Pennock David M; Park S.-T (2004)
    A lexical signature (LS) consisting of several key words from a Web document is often sufficient information for finding the document later, even if its URL has changed. We conduct a large-scale empirical study of nine ...
  • Fox Steve; Karnawat Kuldeep; Dumais Susan; White Thomas (2005)
    Of growing interest in the area of improving the search experience is the collection of implicit user behavior measures (implicit measures) as indications of user interest and user satisfaction. Rather than having to submit ...
  • Brafman Ronen I; Shimony Solomon E (2004)
    We present a new approach for adaptive presentation of structured information, based on preference-based constrained optimization techniques rooted in qualitative decision-theory. In this approach, document presentation ...
  • King Irwin; Ng Cheuk Hang; Sia Ka Cheung (2004)
    With the recent advances of distributed computing, the limitation of information retrieval from a centralized image collection can be removed by allowing distributed image data sources to interact with each other for data ...
  • Sander Jörg; Ng Raymond T; Sleumer Monica C; Yuen Man Saint; Jones Steven J (2005)
    Serial Analysis of Gene Expression (SAGE) has proven to be an important alternative to microarray techniques for global profiling of mRNA populations. We have developed preprocessing methodologies to address problems in ...
