Predicting relevance based on assessor disagreement: analysis and practical applications for search evaluation

December 26th, 2015

Evaluation of search engines relies on assessments of search results for selected test queries, from which we would ideally like to draw conclusions in terms of relevance of the results for general (e.g., future, unknown) users. In practice, however, most evaluation scenarios only allow us to conclusively determine the relevance towards the particular assessor that provided the judgments. A factor that cannot be ignored when extending conclusions made from assessors towards users is the possible disagreement on relevance, assuming that a single gold truth label does not exist. This paper presents and analyzes the predicted relevance model (PRM), which allows predicting a particular result’s relevance for a random user, based on an observed assessment and knowledge of the average disagreement between assessors. With the PRM, existing evaluation metrics designed to measure binary assessor relevance can be transformed into more robust and effectively graded measures that evaluate relevance towards a random user. It also leads to a principled way of quantifying multiple graded or categorical relevance levels for use as gains in established graded relevance measures, such as normalized discounted cumulative gain, which nowadays often use heuristic and data-independent gain values. Given a set of test topics with graded relevance judgments, the PRM allows evaluating systems on different scenarios, such as their capability of retrieving top results, or how well they are able to filter out non-relevant ones. Its use in actual evaluation scenarios is illustrated on several information retrieval test collections.

Information Retrieval Journal pp 1-29
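
To make the idea in the abstract above concrete, the following is a minimal sketch, not the paper's actual estimator. It assumes a hypothetical set of double-judged documents (pairs of grades from two independent assessors) and estimates, for each grade given by the first assessor, the fraction of documents the second assessor considered relevant. Those fractions are then used as data-dependent gains in a standard nDCG computation. The function names, the relevance threshold, and the sample data are all illustrative assumptions.

```python
import math
from collections import defaultdict


def predicted_relevance(double_judgments, is_relevant):
    """Estimate P(another assessor finds the doc relevant | observed grade).

    double_judgments: iterable of (first_grade, second_grade) pairs
    (hypothetical input format). is_relevant: predicate on a grade.
    """
    counts = defaultdict(lambda: [0, 0])  # grade -> [relevant, total]
    for first_grade, second_grade in double_judgments:
        counts[first_grade][1] += 1
        if is_relevant(second_grade):
            counts[first_grade][0] += 1
    return {grade: rel / total for grade, (rel, total) in counts.items()}


def dcg(gains):
    """Discounted cumulative gain with the usual log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_grades, all_grades, gain_of):
    """nDCG where gain_of maps each observed grade to its gain value."""
    gains = [gain_of[g] for g in ranked_grades]
    ideal = sorted((gain_of[g] for g in all_grades), reverse=True)[: len(gains)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Illustrative data only: grades 0 (non-relevant) to 2 (highly relevant).
pairs = [(2, 2), (2, 1), (2, 0), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
gains = predicted_relevance(pairs, lambda grade: grade >= 1)
score = ndcg([1, 2, 0], [2, 1, 0], gains)
```

Under this sketch a grade-0 document still receives a small non-zero gain whenever assessors sometimes disagree about it, which is the kind of effect the abstract describes when it speaks of relevance towards a random user.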

Convenient Discovery of Archived Video Using Audiovisual Hyperlinking

December 26th, 2015

This paper gives an overview of ongoing work that aims to support end-users in conveniently exploring and exploiting large audiovisual archives by deploying multiple multimodal linking approaches. We present ongoing work on multimodal video hyperlinking, from a perspective of unconstrained link anchor identification and based on the identification of named entities, and recent attempts to implement and validate the concept of outside-in linking that relates current events to archive content. Although these concepts are not new, current work is yielding novel insights and more mature technology, and the development of benchmark evaluations and the emergence of dedicated workshops are opening many interesting research questions on various levels that require closer collaboration between research communities.

SLAM ’15 Proceedings of the Third Edition Workshop on Speech, Language & Audio in Multimedia

SAVA at MediaEval 2015: Search and Anchoring in Video Archives

December 26th, 2015

The Search and Anchoring in Video Archives (SAVA) task at MediaEval 2015 consists of two sub-tasks: (i) search for multimedia content within a video archive using multimodal queries referring to information contained in the audio and visual streams/content, and (ii) automatic selection of video segments within a list of videos that can be used as anchors for further hyperlinking within the archive. The task used a collection of roughly 2700 hours of BBC broadcast TV material for the former sub-task, and about 70 files taken from this collection for the latter sub-task. The search sub-task is based on an ad-hoc retrieval scenario, and is evaluated using a pooling procedure across participants' submissions, with crowdsourced relevance assessment using Amazon Mechanical Turk (MTurk). The evaluation used metrics that are variations of MAP adjusted for this task. For the anchor selection sub-task, overlapping regions of interest across participants' submissions were assessed by MTurk workers, and mean reciprocal rank (MRR), precision and recall were calculated for evaluation.

Working notes of the MediaEval 2015 Workshop, Wurzen, Germany
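
For readers unfamiliar with the measures named in the abstract above, the following sketch shows the standard textbook definitions of mean reciprocal rank, precision and recall. It is an illustration under simple assumptions (items identified by hashable IDs, binary relevance) and does not reproduce the task's own adjusted measures or its segment-overlap handling; all function names are hypothetical.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_list, relevant_set) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def precision_recall(ranked, relevant):
    """Set-based precision and recall of a retrieved list."""
    hits = sum(1 for item in ranked if item in relevant)
    precision = hits / len(ranked) if ranked else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative anchors only: segment IDs are made up.
runs = [(["s3", "s1", "s7"], {"s1"}), (["s2", "s9"], {"s2", "s5"})]
mrr = mean_reciprocal_rank(runs)
p, r = precision_recall(["s2", "s9"], {"s2", "s5"})
```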

Defining and Evaluating Video Hyperlinking for Navigating Multimedia Archives

December 26th, 2015

Multimedia hyperlinking is an emerging research topic in the context of digital libraries and (cultural heritage) archives. We have been studying the concept of video-to-video hyperlinking from a video search perspective in the context of the MediaEval evaluation benchmark for several years. Our task considers a use case of exploring large quantities of video content via an automatically created hyperlink structure at the media fragment level. In this paper we report on our findings, examine the features of the definition of video hyperlinking based on results, and discuss lessons learned with respect to evaluation of hyperlinking in real-life use scenarios.

WWW ’15 Companion Proceedings of the 24th International Conference on World Wide Web

User Perspectives on Semantic Linking in the Audio Domain

December 26th, 2015

Semantic linking has the potential to enrich the audiovisual experience for users of television or radio broadcast archives. Recently, automatic semantic linking has received increased attention, especially as second screen applications for television broadcasts are emerging. Semantic linking for radio broadcasts can enrich the radio listening experience in a similar manner in combination with second screen-like applications. While the development of such applications is gaining popularity, little is known about the information in a radio program that may be interesting for link creation from a user perspective. We conducted a user study on semantic linking for radio broadcasts in order to learn what information users regard as suitable anchors and what kind of information they like as targets. We found that users often regard topic and person as the best link anchors in the program. Additionally, we found that frequency and timing of information elements in a radio program do not dominate the users’ selection of anchors. Furthermore, we found that there is low agreement among users on regarding certain information elements as anchors. For practical reasons the study was conducted with 10 minutes of radio broadcast material of a particular program type, and with a total of 22 participants. The insights gained in the user study will help the understanding of user perspectives on semantic linking in the audio domain.

Signal-Image Technology and Internet-Based Systems (SITIS), 2014 Tenth International Conference on

The AXES research video search system

December 26th, 2015

We will demonstrate a multimedia content information retrieval engine developed for audiovisual digital libraries targeted at academic researchers and journalists. It is the second of three multimedia IR systems being developed by the AXES project. The system brings together traditional text IR and state-of-the-art content indexing and retrieval technologies to allow users to search and browse digital libraries in novel ways. Key features include: metadata and ASR search and filtering, on-the-fly visual concept classification (categories, faces, places, and logos), and similarity search (instances and faces).

IEEE ICASSP – International Conference on Acoustics, Speech and Signal Processing, 4-9 May 2014, Florence, Italy

Beyond metadata: searching your archive based on its audio-visual content

December 26th, 2015

The EU FP7 project AXES aims at better understanding the needs of archive users and supporting them with systems that reach beyond the state-of-the-art. Our system allows users to instantaneously retrieve content using metadata, spoken words, or a vocabulary of reliably detected visual concepts comprising places, objects and events. Additionally, users can query for new concepts, for which models are learned on-the-fly, using training images obtained from an internet search engine. Thanks to advanced analysis and indexing methods, relevant material can be retrieved within seconds. Our system supports different types of models for object categories (e.g. “bus” or “house”), specific objects (landmarks or logos), person categories (e.g. “people with moustaches”), or specific persons (e.g. “President Obama”). In addition to text queries, we support query-by-example, which retrieves content containing the same location, objects, or faces shown in provided images. Finally, our system provides alternatives to query-based retrieval by allowing users to browse archives using generated links. Here we evaluate the precision of the retrieved results based on textual queries describing visual content, with the queries extracted from user testing query logs.

IBC2014 Conference, 2014, page 1.3

Score Normalization Using Logistic Regression with Expected Parameters

December 26th, 2015

State-of-the-art score normalization methods use generative models that rely on sometimes unrealistic assumptions. We propose a novel parameter estimation method for score normalization based on logistic regression, using the expected parameters from past queries. Experiments on the Gov2 and CluewebA collections indicate that our method is consistently more precise in predicting the number of relevant documents in the top-n ranks compared to a state-of-the-art generative approach and another parameter estimate for logistic regression.

Advances in Information Retrieval
Volume 8416 of the series Lecture Notes in Computer Science pp 579-584
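
The abstract above can be illustrated with a rough sketch of logistic score normalization. The sketch maps a retrieval score s to a relevance probability sigmoid(a + b * s) and predicts the number of relevant documents in the top n as the sum of those probabilities. Reading "expected parameters" as a plain average of (a, b) pairs fitted on past queries is an assumption made here for illustration, not a description of the paper's estimator; the function names and the numbers are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def expected_parameters(past_params):
    """Average (intercept, slope) pairs from past queries (assumed reading)."""
    n = len(past_params)
    a = sum(p[0] for p in past_params) / n
    b = sum(p[1] for p in past_params) / n
    return a, b


def normalize(scores, a, b):
    """Map raw scores to estimated probabilities of relevance."""
    return [sigmoid(a + b * s) for s in scores]


def expected_relevant_in_top_n(scores, a, b, n):
    """Predicted number of relevant documents among the n highest scores."""
    top = sorted(scores, reverse=True)[:n]
    return sum(normalize(top, a, b))


# Illustrative values only: parameters and scores are made up.
a, b = expected_parameters([(-1.0, 2.0), (-3.0, 4.0)])
prediction = expected_relevant_in_top_n([1.2, 0.4, 0.9, 0.1], a, b, n=2)
```

Because the outputs are probabilities rather than raw scores, they can be compared and summed across queries, which is what makes a prediction such as the expected number of relevant documents in the top n possible.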

Mirex and Taily at TREC 2013

December 26th, 2015

We describe the participation of the Lowlands at the Web Track and the FedWeb Track of TREC 2013. For the Web Track we used the Mirex Map-Reduce library with out-of-the-box approaches, and for the FedWeb Track we adapted our shard selection method Taily for resource selection. Here, our results were above median and close to the maximum performance achieved.

Proceedings of the 22nd Text REtrieval Conference (TREC)

Average Precision: Good Guide or False Friend to Multimedia Search Effectiveness?

December 26th, 2015

Approaches to multimedia search often evolve from existing approaches with strong average precision. However, work on search evaluation shows that average precision does not always capture effectiveness in terms of satisfying user needs because it ignores the diversity of search results. This paper investigates whether search approaches with diverse results have been neglected within the multimedia retrieval research agenda due to the fact that they are overshadowed by search approaches with strong average precision. To this end, we compare 361 search approaches applied to the TrecVid benchmarks between 2005 and 2007. We motivate two criteria based on measure correlation and statistical equivalence to estimate whether search approaches with diverse results have been neglected. We show that the hypothesized effect indeed occurs in the collections examined. As a consequence, the research community would benefit from reconsidering existing approaches in the light of diversity.

MultiMedia Modeling
Volume 8326 of the series Lecture Notes in Computer Science pp 239-250
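
As background to the abstract above, the following sketch computes standard non-interpolated average precision for a single ranked list with binary relevance. It is a textbook definition given for illustration, not the paper's evaluation code; the function name and the shot IDs are hypothetical.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k at which a relevant item appears.

    Relevant items that are never retrieved contribute zero, so the sum is
    divided by the total number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


# Illustrative ranking only: shot IDs are made up.
ap = average_precision(["shot4", "shot8", "shot1"], {"shot4", "shot1"})
```

Every relevant item counts equally in this definition, whether or not it duplicates an item ranked above it, which is why the measure cannot reward diverse result lists.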