Synthesis Lectures on Information Concepts, Retrieval, and Services
3 total works
Digital libraries (DLs) have evolved since their launch in 1991 into an important type of information system, with widespread application. This volume advances that trend further by describing new research and development in the DL field that builds upon the 5S (Societies, Scenarios, Spaces, Structures, Streams) framework, which is discussed in three other DL volumes in this series. While the 5S framework may be used to describe many types of information systems, and is likely to have even broader utility and appeal, we focus here on digital libraries.
Drawing upon six (Akbar, Kozievitch, Leidig, Li, Murthy, Park) completed and two (Chen, Fouh) in-process dissertations, as well as the efforts of collaborating researchers, and scores of related publications, presentations, tutorials, and reports, this book demonstrates the applicability of 5S in five digital library application areas that also have importance in the context of the WWW, Web 2.0, and innovative information systems. By integrating surveys of the state-of-the-art, new research, connections with formalization, case studies, and exercises/projects, this book can serve as a textbook for those interested in computing, information, and/or library science.
Chapter 1 focuses on images, explaining how they connect with information retrieval, in the context of CBIR systems. Chapter 2 gives two case studies of DLs used in education, which is one of the most common applications of digital libraries. Chapter 3 covers social networks, which are at the heart of work on Web 2.0, explaining the construction and use of deduced graphs that can enhance retrieval and recommendation. Chapter 4 demonstrates the value of DLs in eScience, focusing, in particular, on cyber-infrastructure for simulation. Chapter 5 surveys geospatial information in DLs, with a case study on geocoding.
Given this rich content, we trust that anyone interested in digital libraries, or in related systems, will find this volume to be motivating, intellectually satisfying, and useful. We hope it will help move digital libraries forward into a science as well as a practice. We hope it will help build a community that will address the needs of the next generation of DLs.
Digital libraries (DLs) have introduced new technologies, as well as leveraging, enhancing, and integrating related technologies, since the early 1990s. These efforts have been enriched through a formal approach, e.g., the 5S (Societies, Scenarios, Spaces, Structures, Streams) framework, which is discussed in two earlier volumes in this series. This volume should help advance work not only in DLs, but also in the WWW and other information systems.
Drawing upon four (Kozievitch, Murthy, Park, Yang) completed and three (Elsherbiny, Farag, Srinivasan) in-process dissertations, as well as the efforts of collaborating researchers and scores of related publications, presentations, tutorials, and reports, this book should advance the DL field with regard to at least six key technologies. By integrating surveys of the state-of-the-art, new research, connections with formalization, case studies, and exercises/projects, this book can serve as a computing or information science textbook. It can support studies in cyber-security, document management, hypertext/hypermedia, IR, knowledge management, LIS, multimedia, and machine learning.
Chapter 1, with a case study on fingerprint collections, focuses on complex (composite, compound) objects, connecting DL and related work on buckets, DCC, and OAI-ORE. Chapter 2, discussing annotations, as in hypertext/hypermedia, emphasizes parts of documents, including images as well as text, managing superimposed information. The SuperIDR system, and prototype efforts with Flickr, should motivate further development and standardization related to annotation, which would benefit all DL and WWW users. Chapter 3, on ontologies, explains how they help with browsing, query expansion, focused crawling, and classification. This chapter connects DLs with the Semantic Web, and uses CTRnet as an example. Chapter 4, on (hierarchical) classification, leverages LIS theory, as well as machine learning, and is important for DLs as well as the WWW. Chapter 5, on extraction from text, covers document segmentation, as well as how to construct a database from heterogeneous collections of references (from ETDs); i.e., converting strings to canonical forms. Chapter 6 surveys the security approaches used in information systems, and explains how those approaches can apply to digital libraries which are not fully open.
Given this rich content, those interested in DLs will be able to find solutions to key problems, using the right technologies and methods. We hope this book will help show how formal approaches can enhance the development of suitable technologies and how they can be better integrated with DLs and other information systems.
Key Issues Regarding Digital Libraries
by Rao Shen, Marcos Andre Goncalves, and Edward A Fox
Published 1 February 2013
This is the second book based on the 5S (Societies, Scenarios, Spaces, Structures, Streams) approach to digital libraries (DLs). Leveraging the first volume, on Theoretical Foundations, we focus on the key issues of evaluation and integration. These cross-cutting issues serve as a bridge for those interested in DLs, connecting the introduction and formal discussion in the first book, with the coverage of key technologies in the third book, and of illustrative applications in the fourth book. These two topics have central importance in the DL field, allowing it to be treated scientifically as well as practically. In the scholarly world, we only really understand something if we know how to measure and evaluate it. In the Internet era of distributed information systems, we only can be practical at scale if we integrate across both systems and their associated content.
Evaluation of DLs must take place at multiple levels, so we can address the different entities and their associated measures. Thus, for digital objects, we assess accessibility, pertinence, preservability, relevance, significance, similarity, and timeliness. Other measures are specific to higher-level constructs like metadata, collections, catalogs, repositories, and services. We tie these together through a case study of the 5SQual tool, which we designed and implemented to perform an automatic quantitative evaluation of DLs. Thus, across the Information Life Cycle, we describe metrics and software useful to assess the quality of DLs, and demonstrate utility with regard to representative application areas: archaeology and education.
Though integration has been a challenge since the earliest work on DLs, we provide the first comprehensive 5S-based formal description of the DL integration problem, cast in the context of related work. Since archaeology is a fundamentally distributed enterprise, we describe ETANADL, for integrating Near Eastern Archeology sites and information. Thus, we show how 5S-based modeling can lead to integrated services and content.
While the first book adopts a minimalist and formal approach to DLs, and provides a systematic and functional method to design and implement DL exploring services, here we broaden to practical DLs with richer metamodels, demonstrating the power of 5S for integration and evaluation.