Advanced Metasearch Engine Technology

Title Advanced Metasearch Engine Technology PDF eBook
Author Weiyi Meng
Publisher Morgan & Claypool Publishers
Total Pages 130
Release 2011
Genre Computers
ISBN 1608451925

Among the search tools currently on the Web, search engines are the most well known thanks to the popularity of major search engines such as Google and Yahoo!. While extremely successful, these major search engines do have serious limitations. This book introduces large-scale metasearch engine technology, which has the potential to overcome the limitations of the major search engines. Essentially, a metasearch engine is a search system that supports unified access to multiple existing search engines by passing the queries it receives to its component search engines and aggregating the returned results into a single ranked list. A large-scale metasearch engine has thousands or more component search engines. While metasearch engines were initially motivated by their ability to combine the search coverage of multiple search engines, there are also other benefits such as the potential to obtain better and fresher results and to reach the Deep Web. The following major components of large-scale metasearch engines will be discussed in detail in this book: search engine selection, search engine incorporation, and result merging. Highly scalable and automated solutions for these components are emphasized. The authors make a strong case for the viability of the large-scale metasearch engine technology as a competitive technology for Web search. Table of Contents: Introduction / Metasearch Engine Architecture / Search Engine Selection / Search Engine Incorporation / Result Merging / Summary and Future Research
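
The query-forwarding and result-merging idea described above can be illustrated with a minimal sketch. It is not taken from the book: the component engines (`engine_a`, `engine_b`), their URLs, and the choice of reciprocal rank fusion as the merging rule are all assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical component engine: maps a query to a ranked list of URLs.
Engine = Callable[[str], List[str]]


def metasearch(query: str, engines: Dict[str, Engine], k: int = 60) -> List[str]:
    """Forward the query to every engine and merge the ranked lists.

    Merging uses reciprocal rank fusion (an assumed, generic rule):
    each result earns 1 / (k + rank) from every engine that returns it.
    """
    scores = defaultdict(float)
    for _name, search in engines.items():
        for rank, url in enumerate(search(query), start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by URL so output is deterministic.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Stand-in engines with fixed results (illustrative data only).
def engine_a(query: str) -> List[str]:
    return ["a.example/1", "b.example/2", "c.example/3"]


def engine_b(query: str) -> List[str]:
    return ["b.example/2", "d.example/4", "a.example/1"]


merged = metasearch("metasearch", {"A": engine_a, "B": engine_b})
print(merged)
```

Results returned by both engines accumulate score from each list, so they tend to rise above results that only one engine returns. A real large-scale system would also need the engine selection and engine incorporation steps the blurb mentions, which this sketch leaves out.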

Advanced Metasearch Engine Technology

Title Advanced Metasearch Engine Technology PDF eBook
Author Weiyi Meng
Publisher Springer Nature
Total Pages 117
Release 2022-05-31
Genre Computers
ISBN 3031018435

Among the search tools currently on the Web, search engines are the most well known thanks to the popularity of major search engines such as Google and Yahoo!. While extremely successful, these major search engines do have serious limitations. This book introduces large-scale metasearch engine technology, which has the potential to overcome the limitations of the major search engines. Essentially, a metasearch engine is a search system that supports unified access to multiple existing search engines by passing the queries it receives to its component search engines and aggregating the returned results into a single ranked list. A large-scale metasearch engine has thousands or more component search engines. While metasearch engines were initially motivated by their ability to combine the search coverage of multiple search engines, there are also other benefits such as the potential to obtain better and fresher results and to reach the Deep Web. The following major components of large-scale metasearch engines will be discussed in detail in this book: search engine selection, search engine incorporation, and result merging. Highly scalable and automated solutions for these components are emphasized. The authors make a strong case for the viability of the large-scale metasearch engine technology as a competitive technology for Web search. Table of Contents: Introduction / Metasearch Engine Architecture / Search Engine Selection / Search Engine Incorporation / Result Merging / Summary and Future Research

Deep Web Query Interface Understanding and Integration

Title Deep Web Query Interface Understanding and Integration PDF eBook
Author Eduard C. Dragut
Publisher Springer Nature
Total Pages 150
Release 2022-05-31
Genre Computers
ISBN 3031018893

There are millions of searchable data sources on the Web and to a large extent their contents can only be reached through their own query interfaces. There is an enormous interest in making the data in these sources easily accessible. There are primarily two general approaches to achieve this objective. The first is to surface the contents of these sources from the deep Web and add the contents to the index of regular search engines. The second is to integrate the searching capabilities of these sources and support integrated access to them. In this book, we introduce the state-of-the-art techniques for extracting, understanding, and integrating the query interfaces of deep Web data sources. These techniques are critical for producing an integrated query interface for each domain. The interface serves as the mediator for searching all data sources in the concerned domain. While query interface integration is only relevant for the deep Web integration approach, the extraction and understanding of query interfaces are critical for both deep Web exploration approaches. This book aims to provide in-depth and comprehensive coverage of the key technologies needed to create high quality integrated query interfaces automatically. The following technical issues are discussed in detail in this book: query interface modeling, query interface extraction, query interface clustering, query interface matching, query interface attribute integration, and query interface integration. Table of Contents: Introduction / Query Interface Representation and Extraction / Query Interface Clustering and Categorization / Query Interface Matching / Query Interface Attribute Integration / Query Interface Integration / Summary and Future Research

Probabilistic Ranking Techniques in Relational Databases

Title Probabilistic Ranking Techniques in Relational Databases PDF eBook
Author Ihab Ilyas
Publisher Springer Nature
Total Pages 71
Release 2022-05-31
Genre Computers
ISBN 303101846X

Ranking queries are widely used in data exploration, data analysis and decision making scenarios. While most of the currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in the traditional settings. This lecture describes new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on a marriage of traditional ranking semantics with possible worlds semantics under widely-adopted uncertainty models. In particular, we focus on discussing the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries. Under the tuple-level uncertainty model, we describe new processing techniques leveraging the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. Under the attribute-level uncertainty model, we describe new probabilistic ranking models and a set of query evaluation algorithms, including sampling-based techniques. We also discuss supporting rank join queries on uncertain data, and we show how to extend current rank join methods to handle uncertainty in scoring attributes. Table of Contents: Introduction / Uncertainty Models / Query Semantics / Methodologies / Uncertain Rank Join / Conclusion
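
As a rough illustration of combining ranking with possible worlds semantics under tuple-level uncertainty, the sketch below enumerates every possible world of a tiny relation and accumulates the probability that each tuple ranks first. The tuples, scores, and probabilities are made up, and the assumption that tuples exist independently is ours; the lecture itself covers several formulations and far more efficient processing techniques than brute-force enumeration.

```python
from itertools import product

# Hypothetical uncertain relation: (name, score, probability that the tuple exists).
# Tuples are assumed to exist independently of one another.
tuples = [("t1", 90, 0.5), ("t2", 80, 0.8), ("t3", 70, 1.0)]


def top1_probabilities(rows):
    """Return, for each tuple, the probability that it is the top-scored
    tuple, summed over all possible worlds."""
    probs = {name: 0.0 for name, _, _ in rows}
    # Each world is one choice of present/absent for every tuple.
    for world in product([False, True], repeat=len(rows)):
        world_prob = 1.0
        for present, (_, _, p) in zip(world, rows):
            world_prob *= p if present else 1.0 - p
        present_rows = [row for present, row in zip(world, rows) if present]
        if world_prob == 0.0 or not present_rows:
            continue  # impossible world, or a world with no tuples to rank
        best = max(present_rows, key=lambda row: row[1])
        probs[best[0]] += world_prob
    return probs


probs = top1_probabilities(tuples)
print(probs)
```

Under independence the same numbers follow in closed form: a tuple is first exactly when it exists and every higher-scored tuple is absent, so `t2` gets 0.8 × (1 − 0.5) = 0.4. Enumeration grows exponentially with the number of tuples, which is why dedicated processing techniques matter.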

Business Information Systems

Title Business Information Systems PDF eBook
Author Witold Abramowicz
Publisher Springer
Total Pages 352
Release 2015-06-15
Genre Computers
ISBN 331919027X

This book contains the refereed proceedings of the 18th International Conference on Business Information Systems, BIS 2015, held in Poznań, Poland, in June 2015. The BIS conference series follows trends in academic and business research; thus, the theme of the BIS 2015 conference was “Making Big Data Smarter.” Big data is now a fairly mature concept, recognized and widely used by professionals in both research and industry. Together, they work on developing more adequate and efficient tools for data processing and analysis, thus turning "big data" into "smart data." The 26 revised full papers were carefully reviewed and selected from 70 submissions. In addition, two invited papers are included in this book. They are grouped into sections on big and smart data, semantic technologies, content retrieval and filtering, business process management and mining, collaboration, enterprise architecture and business-IT alignment, specific BIS applications, and open data for BIS.

Instant Recovery with Write-Ahead Logging

Title Instant Recovery with Write-Ahead Logging PDF eBook
Author Goetz Graefe
Publisher Springer Nature
Total Pages 113
Release 2022-05-31
Genre Computers
ISBN 3031018575

Traditional theory and practice of write-ahead logging and of database recovery focus on three failure classes: transaction failures (typically due to deadlocks) resolved by transaction rollback; system failures (typically power or software faults) resolved by restart with log analysis, "redo," and "undo" phases; and media failures (typically hardware faults) resolved by restore operations that combine multiple types of backups and log replay. The recent addition of single-page failures and single-page recovery has opened new opportunities far beyond the original aim of immediate, lossless repair of single-page wear-out in novel or traditional storage hardware. In the contexts of system and media failures, efficient single-page recovery enables on-demand incremental "redo" and "undo" as part of system restart or media restore operations. This can give the illusion of practically instantaneous restart and restore: instant restart permits processing new queries and updates seconds after system reboot and instant restore permits resuming queries and updates on empty replacement media as if those were already fully recovered. In the context of node and network failures, instant restart and instant restore combine to enable practically instant failover from a failing database node to one holding merely an out-of-date backup and a log archive, yet without loss of data, updates, or transactional integrity. In addition to these instant recovery techniques, the discussion introduces self-repairing indexes and much faster offline restore operations, which impose no slowdown in backup operations and hardly any slowdown in log archiving operations. The new restore techniques also render differential and incremental backups obsolete, complete backup commands on a database server practically instantly, and even permit taking full up-to-date backups without imposing any load on the database server. 
Compared to the first version of this book, this second edition adds sections on applications of single-page repair, instant restart, single-pass restore, and instant restore. Moreover, it adds sections on instant failover among nodes in a cluster, applications of instant failover, recovery for file systems and data files, and the performance of instant restart and instant restore.

P2P Techniques for Decentralized Applications

Title P2P Techniques for Decentralized Applications PDF eBook
Author Esther Pacitti
Publisher Springer Nature
Total Pages 90
Release 2022-06-01
Genre Computers
ISBN 3031018885

As an alternative to traditional client-server systems, Peer-to-Peer (P2P) systems provide major advantages in terms of scalability, autonomy and dynamic behavior of peers, and decentralization of control. Thus, they are well suited for large-scale data sharing in distributed environments. Most of the existing P2P approaches for data sharing rely on either structured networks (e.g., DHTs) for efficient indexing, or unstructured networks for ease of deployment, or some combination. However, these approaches have some limitations, such as lack of freedom for data placement in DHTs, and high latency and high network traffic in unstructured networks. To address these limitations, gossip protocols, which are easy to deploy and scale well, can be exploited. In this book, we will give an overview of these different P2P techniques and architectures, discuss their trade-offs, and illustrate their use for decentralizing several large-scale data sharing applications. Table of Contents: P2P Overlays, Query Routing, and Gossiping / Content Distribution in P2P Systems / Recommendation Systems / Top-k Query Processing in P2P Systems
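
To give a feel for why gossip protocols are said to scale well, here is a small simulation of push-style gossip. It is a generic textbook-style sketch rather than a protocol from the book: the function name `push_gossip_rounds`, the uniform random peer choice, and the synchronous rounds are all assumptions.

```python
import random


def push_gossip_rounds(n_peers: int, seed: int = 0, max_rounds: int = 10_000) -> int:
    """Simulate push gossip and return the number of rounds until every
    peer holds the update. Peer 0 starts with the update; in each round
    every informed peer pushes it to one peer chosen uniformly at random
    (possibly itself or an already informed peer)."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    informed = {0}
    rounds = 0
    while len(informed) < n_peers:
        if rounds >= max_rounds:
            raise RuntimeError("gossip did not reach all peers")
        # One random target per currently informed peer.
        targets = {rng.randrange(n_peers) for _ in informed}
        informed |= targets
        rounds += 1
    return rounds


rounds_needed = push_gossip_rounds(64, seed=1)
print(rounds_needed)
```

The informed set can at most double per round, so at least log2(n) rounds are needed. In practice the count stays small because the number of informed peers grows quickly at first. No peer needs global knowledge of the network, which is the ease-of-deployment property the blurb alludes to.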