MLMI'07 Program

THURSDAY 28 JUNE 2007
08:00-09:00 Registration
09:00-09:10 Opening remarks
PAPER SESSION 1: Features for Speech Recognition
09:10-09:40 Using Prosodic Features in Language Models for Meetings
Songfang Huang and Steve Renals
09:40-10:10 Posterior-Based Features and Distances in Template Matching for Speech Recognition
Guillermo Aradilla and Hervé Bourlard
10:10-10:40 A Study of Phoneme and Grapheme Based Context-dependent ASR Systems
John Dines and Mathew Magimai Doss
10:40-11:10 - Coffee break
INVITED TALK 1
11:10-12:00 How to follow a conversation without listening to the words
Nick Campbell (Media Information Science Laboratories, ATR, Japan)
12:00-13:30 - Lunch break
PAPER SESSION 2: Separation and Segmentation in Spoken Interaction
13:30-14:00 Automatic Labeling Inconsistencies Detection and Correction for Sentence Unit Segmentation in Conversational Speech
Sebastien Cuendet, Dilek Hakkani-Tur and Elizabeth Shriberg
14:00-14:30 Modeling Vocal Interaction for Segmentation in Meeting Recognition
Kornel Laskowski and Tanja Schultz
14:30-15:00 Binaural Speech Separation Using Recurrent Timing Neural Networks for Joint F0-Localisation Estimation
Stuart Wrigley and Guy Brown
15:00-15:30 - Coffee break
SPECIAL SESSION: PASCAL Speech Separation Challenge II
15:30-16:00 To Separate Speech! A System for Recognizing Simultaneous Speech
John McDonough, Kenichi Kumatani, Tobias Gehrig, Emilian Stoimenov, Uwe Mayer, Stefan Schacht, Matthias Woelfel and Dietrich Klakow
16:00-16:30 A Microphone Array Beamforming Approach to Blind Speech Separation
Iain McCowan, Ivan Himawan and Mike Lincoln
POSTER SESSION 1
16:30-18:00 List of posters below

FRIDAY 29 JUNE 2007
PAPER SESSION 3: Image and Video Processing of Human Interaction
09:00-09:30 Conditional Sequence Model for Context-based Recognition of Gaze Aversion
Louis-Philippe Morency and Trevor Darrell
09:30-10:00 Face Recognition in Smart Rooms
Hazim Kemal Ekenel, Mika Fischer and Rainer Stiefelhagen
10:00-10:30 Meeting State Recognition from Visual and Aural Labels
Jan Curin, Pascal Fleury, Jan Kleindienst and Robert Kessl
10:30-11:00 - Coffee break
INVITED TALK 2
11:00-11:50 Structure and images
Vaclav Hlavac (Center for Machine Perception, Czech Technical University, Prague)
11:50-13:30 - Lunch break
PAPER SESSION 4: Annotation and Structuring of Spoken Input
13:30-14:00 Term-Weighting for Summarization of Multi-Party Spoken Dialogues
Gabriel Murray and Steve Renals
14:00-14:30 Automatic Annotation of Dialogue Structure from Simple User Interaction
Matthew Purver, John Niekrasz and Patrick Ehlen
14:30-15:00 Computer Assisted Pattern Recognition
Enrique Vidal, Luis Rodriguez, Francisco Casacuberta and Ismael García-Varea
15:00-15:30 - Coffee break
PAPER SESSION 5: Meeting Browsers and their Evaluation
15:30-16:00 An Ego-centric and Tangible Approach to Meeting Indexing and Browsing
Denis Lalanne, Florian Evequoz, Maurizio Rigamonti, Bruno Dumas and Rolf Ingold
16:00-16:30 Towards an Objective Test for Meeting Browsers: the BET4TQB Pilot Experiment
Andrei Popescu-Belis, Philippe Baudrion, Mike Flynn and Pierre Wellner
POSTER SESSION 2
16:30-18:00 List of posters below

SATURDAY 30 JUNE 2007
AMIDA TRAINING DAY
09:30-17:30 See AMIDA Training Day webpage

POSTER SESSION 1: THURSDAY 28 JUNE 2007, 16:30-18:00
Studying Multimodal Fusion and Fission Mechanisms Through the Constitution of an Open Source Toolkit Allowing Rapid Creation of Multimodal Interfaces
Bruno Dumas, Denis Lalanne, Rolf Ingold
Transfer Learning for Meeting-domain Tandem ASR Features
Joe Frankel, Ozgur Cetin, Nelson Morgan
Neural Network Topologies and Bottleneck Features in Speech Recognition
Frantisek Grezl, Martin Karafiat, Jan Cernocky
Automatic Decision Detection of Meeting Speech
Peiyun Hsueh, Jonathan Kilgour, Jean Carletta, Steve Renals, Johanna Moore
Channel Compensation for Speaker Recognition
Valiantsina Hubeika, Lukas Burget, Pavel Matejka, Jan Cernocky
In-Context Phone Posteriors as Complementary Features for Tandem ASR
Hamed Ketabdar, Hervé Bourlard
Czech Text-to-Sign-Speech Synthesizer
Zdenek Krnoul, Jakub Kanis, Milos Zelezny, Ludek Muller
User-specific Training of a Music Search Engine
David Little, David Raffensperger, Bryan Pardo
Study on Correlation between ROUGE and Human Evaluation in Meeting Summarization
Feifan Liu, Yang Liu, Bin Li
Do Disfluencies Affect Meeting Summarization? A Pilot Study on the Impact of Disfluencies
Yang Liu, Feifan Liu, Bin Li, Shasha Xie
Frequency Domain Linear Prediction for QMF Subbands and Applications to Audio Coding
Petr Motlicek, Sriram Ganapathy, Hynek Hermansky
LP-TRAP based Speech Features for ASR
Petr Motlicek, Hynek Hermansky
Cross Entropy for Learning in Multimodal Streams
Athanasios Noulas, Ben Krose, Nikos Vlassis
Audiovisual Interaction in the Control of Human Overt Attention
Cliona Quigley, Selim Onat, Peter Koenig, Sue Harding, Martin Cooke
Combination of Word and Phoneme Approach for Spoken Term Detection
Igor Szöke
Search in Meetings Using Combination of LVCSR- and Phonetic-based Spoken Term Detection
Igor Szöke, Michal Fapšo
Spoken Term Detection System Based on a Combination of LVCSR and Phonetic Search
Igor Szöke, Michal Fapšo, Martin Karafiát, Lukáš Burget, František Grézl, Petr Schwarz, Ondřej Glembek, Pavel Matějka, Jiří Kopecký, Jan Černocký
Combined Visual Parameterization for Automatic Lip-Reading
Jana Trojanova, Petr Cisar, Milos Zelezny
Grapheme-based Spoken Term Detection in the Meetings Domain
Dong Wang, Joe Frankel, Simon King
The Listening Room - A Speech-based Interactive Art Installation
Alexa Wright, Alun Evans, Mike Lincoln

POSTER SESSION 2: FRIDAY 29 JUNE 2007, 16:30-18:00
Archivus: A User Performance Analysis with Speech, Keyboard and Mouse as Interaction Modalities
Marita Ailomaa, Agnes Lisowska
Adaptable User Modeling Component for Multimodal Interaction Markup Language
Masahiro Araki
Distributed Visual Sensor Network Fusion
Petr Chmelar, Jaroslav Zendulka
Multimodal Meeting Capture and Understanding with the CALO Meeting Assistant
Patrick Ehlen
Gaussian Process Latent Variable Models for Human Pose Estimation
Carl Henrik Ek, Philip Torr, Neil Lawrence
The Hub: Real Time Data Distribution and Storage
Mike Flynn, Maël Guillemot, Bastien Crettol
Hardware Acceleration of AdaBoost Classifier
Jiri Granat, Adam Herout, Michal Hradis, Pavel Zemčík
A Fully-automated Conference Recording and Webcasting System
Maël Guillemot, Alessandro Vinciarelli, Jean-Marc Odobez, Olivier Bornet, Olivier Masson
Automatic Camera-Selection for Meetings based on HMMs
Benedikt Hörnler, Marc Al-Hames, Gerhard Rigoll
Multi-Modal Interface for Information Access through Extraction and Visualization of Time-Series Information
Tsuneaki Kato, Mitsunori Matsushita
Indicative Abstractive Summaries of Meetings
Thomas Kleinbauer, Stephanie Becker, Tilman Becker
Analysis of the Multimodal Behaviour of Users in HCI: the Expert Viewpoint of Close Relations
Gilles Le Chenadec, Valérie Maffiolo, Noël Château
Joint Bi-Modal Face and Speaker Authentication Using Explicit Polynomial Expansion
Sébastien Marcel
System for Automatic Language Identification
Pavel Matejka
Demonstration: Archivus - a Multimodal Dialogue System for Meeting Browsing and Retrieval
Miroslav Melichar, Agnes Lisowska, Pavel Cenek, Marita Ailomaa, Martin Rajman
Evaluation and Comparison of Tracking Methods Using Meeting Omnidirectional Images
Igor Potucek, Vitezslav Beran, Stanislav Sumec, Pavel Zemčík
Object Category Recognition Using Probabilistic Fusion of Speech and Image Classifiers
Kate Saenko, Trevor Darrell
Evaluation of Automatic Video Editing
Stanislav Sumec, Igor Potucek
Computer Assisted Pattern Recognition: Demonstrations
Enrique Vidal, Luis Rodriguez, Francisco Casacuberta, Ismael García-Varea
Modeling Co-articulation by Visual Unit Selection in Czech Audio-Visual Speech Synthesis
Milos Zelezny, Zdenek Krnoul
Local Rank Differences - Novel Features for Image Processing
Pavel Zemčík, Michal Hradiš, Adam Herout