
Video-to-text information fusion evaluation for level 5 user refinement


Abstract:

Video-to-Text (V2T) fusion is an example of coordinating low-level information fusion (LLIF) with high-level information fusion (HLIF) through semantic descriptions of physical information. Fusing hard (e.g., video) and soft (i.e., text) data affords Level 5 User Refinement of object characterization, target tracking, and situation assessment. Building on our previous video-to-text (V2T) paper from Fusion 2014, we extend the method to evaluate eight tracking methods, comparing their extraction of semantic information including target number, category, attribute, and direction. Using the CMUSphinx speech-to-text system for semantic parsing of user call-outs, preliminary results show that the integration of video tracking and text analysis performs best with the compressive tracker (CT) and the Tracking-Learning-Detection (TLD) method. The feature analysis of CT and TLD demonstrates the ability to associate the text-based semantic descriptors from user call-outs with video exploitation. The results are presented in a visualization tool for rapid production, to aid user refinement (HLIF) and object assessment (LLIF) functions.
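The abstract describes a pipeline in which CMUSphinx transcribes spoken user call-outs, a semantic parser extracts target number, category, attribute, and direction, and these descriptors are associated with video tracker output (e.g., from CT or TLD). The paper does not give the grammar, vocabulary, or association logic, so the sketch below is only an illustration of that idea; every identifier and the toy keyword lists are assumptions, not the authors' implementation.

```python
"""Illustrative sketch only: the call-out vocabulary, parsing rules, and
track-association logic below are assumptions, not the paper's method."""

from dataclasses import dataclass
from typing import List, Optional

# Toy vocabulary for the four semantic descriptors named in the abstract.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4}
CATEGORIES = {"car", "cars", "truck", "trucks"}
ATTRIBUTES = {"red", "white", "blue", "black"}
DIRECTIONS = {"north", "south", "east", "west", "left", "right"}


@dataclass
class CallOut:
    """Semantic descriptors parsed from one transcribed user call-out."""
    number: int
    category: str
    attribute: Optional[str]
    direction: Optional[str]


def parse_callout(transcript: str) -> Optional[CallOut]:
    """Keyword-spot a speech-to-text transcript (e.g., CMUSphinx output)
    into target number, category, attribute, and direction."""
    tokens = transcript.lower().split()
    number = next((NUMBER_WORDS[t] for t in tokens if t in NUMBER_WORDS), 1)
    category = next((t.rstrip("s") for t in tokens if t in CATEGORIES), None)
    attribute = next((t for t in tokens if t in ATTRIBUTES), None)
    direction = next((t for t in tokens if t in DIRECTIONS), None)
    if category is None:
        return None
    return CallOut(number, category, attribute, direction)


def associate(callout: CallOut, tracks: List[dict]) -> List[dict]:
    """Associate a parsed call-out with tracker output by matching the
    category label and, when present, the attribute label."""
    return [
        t for t in tracks
        if t.get("category") == callout.category
        and (callout.attribute is None or t.get("attribute") == callout.attribute)
    ]


if __name__ == "__main__":
    co = parse_callout("two red cars heading east")
    tracks = [
        {"id": 1, "category": "car", "attribute": "red"},
        {"id": 2, "category": "truck", "attribute": "white"},
    ]
    print(co)
    print(associate(co, tracks))
```

In this toy run, the call-out "two red cars heading east" parses to (number=2, category=car, attribute=red, direction=east) and associates with track 1 only; a real system would also use spatial and temporal cues from the tracker rather than labels alone.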
Date of Conference: 06-09 July 2015
Date Added to IEEE Xplore: 17 September 2015
Conference Location: Washington, DC, USA
