Domain-restricted generation of semantic metadata from multimodal sources
Rationale

Most architectures for structuring and organizing multimedia content rely on the availability of suitable metadata, which must be at an increasingly high semantic level to be of real use. Generating semantic information from media content is a very challenging task and is not solvable in the near term for the general case. Two simplifying strategies can help improve results:
(1) Restricting the subject and content of the media to a well-defined domain, which can then be adequately modelled. The news domain is of particular significance, since it contains a number of thematic subdomains suitable for deep modelling.
(2) Using all related available multimodal sources (audio, video, text) and taking advantage of previously generated metadata to boost results.
This session will present papers describing approaches to the different phases of metadata generation (feature extraction, content classification, semantic metadata mining) that assume a priori information about the content domain and draw on different source modalities.

Session Outline

The session will contain a selection of 4-5 relevant papers covering different modalities and stages in the metadata production chain, preceded by a general framing of the process (and the place of each paper in it) presented by a session chair. The session is technically sponsored by the MESH project, an Integrated Project of the 6th Framework Programme dealing with semantic processing and syndication of multimedia news content.

Topics of Interest
  • Feature extraction for semantic concept detection from different modalities
  • High-level classification of media elements for restricted domains
  • Metadata processing and management for semantic enrichment
  • Speech recognition in pre-defined domains
  • Multimedia analysis using associated text sources
  • Fusion of features extracted from different modalities

Contact information for the Special Session chairs

Jose M. Martinez
Universidad Autonoma de Madrid
JoseM.Martinez@uam.es

Paulo Villegas
Telefonica I+D
paulo@tid.es

Accepted Papers
  • Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition
    M. Huijbregts, R. Ordelman, F. de Jong
  • Video Summarisation for Surveillance and News Domain
    U. Damnjanovic, T. Piatrik, D. Djordjevic, E. Izquierdo
  • Ontology-Driven Semantic Video Analysis Using Visual Information Objects
    G. Th. Papadopoulos, V. Mezaris, I. Kompatsiaris, M. G. Strintzis
  • A Region Thesaurus Approach for High-Level Concept Detection in the Natural Disaster Domain (short paper)
    E. Spyrou, Y. Avrithis
  • Automatic Recommendations for Machine-Assisted Multimedia Annotation: a Knowledge-Mining Approach (short paper)
    M. Díez, P. Villegas
  • On the selection of MPEG-7 Visual Descriptors and their Level of Detail for Nature Disaster Video Sequences Classification (short paper)
    J. Molina, E. Spyrou, N. Sofou, J. M. Martínez
  • A Model-based Iterative Method for Caption extraction in Compressed MPEG Video (short paper)
    D. Marquez, J. Bescos