Designing multimodal video search by examples (MVSE) user interfaces: UX requirements elicitation and insights from semi-structured interviews

Kyle Boyd, Patrick McAllister, Maurice Mulvenna, Raymond Bond, Hui Wang, Ivor Spence, Guanfeng Wu, Abbas Haider

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Searching for content in large video archives is typically undertaken via keyword queries over predefined metadata such as titles and other tags. However, it is difficult to use keywords to search for specific moments in a video. Video search by examples is a desirable approach for this scenario, as it allows users to search for content using one or more examples without having to specify a keyword. However, video search by examples is notoriously challenging, and performance is poor. To improve search performance, multiple modalities – image, sound, voice and text – may be considered, so that multiple search cues can be used to identify more relevant content. This is multimodal video search by examples (MVSE), where users can search for content using multiple modalities. In this paper, typical end users – BBC archivists and programme support staff – are interviewed to identify how their search needs can be addressed by the technical capabilities of an MVSE tool. Such a search tool will be useful for organisations such as the BBC, which maintain large collections of video archives and want to provide a search tool for their own staff as well as for the public. It will also be useful for companies such as YouTube, which host videos from the public and want to enable video search by examples. The study's objectives were to inform the design and development of the UX workflows and to gain a broader understanding of the opportunities and issues that may arise from the proposed prototype tool. Thematic analysis highlighted four main themes: Opportunities, Time constraints, Activities, and Pain points. Further analysis highlighted key areas that should be considered for an MVSE-based system, such as scene recognition, face recognition, speed issues, and integration.

Original language: English
Title of host publication: ECCE'23: proceedings of the European Conference on Cognitive Ergonomics
Publisher: Association for Computing Machinery
Number of pages: 8
ISBN (Electronic): 9798400708756
DOIs
Publication status: Published - 21 Sept 2023
Event: European Conference on Cognitive Ergonomics 2023 - Swansea, United Kingdom
Duration: 19 Sept 2023 – 22 Sept 2023
https://digitaleconomy.wales/ecce2023/

Conference

Conference: European Conference on Cognitive Ergonomics 2023
Abbreviated title: ECCE 2023
Country/Territory: United Kingdom
City: Swansea
Period: 19/09/2023 – 22/09/2023
Internet address: https://digitaleconomy.wales/ecce2023/
