
GSIT Workshop June 12 2006

Workshop on Video/Image Processing

Conference Room (4N-251), GSIT, Monash University

Time: 10:00 AM, 12 June 2006

We invite all interested people

Speakers:

Professor David Suter   

Ashfaqur Rahman

Hang Zhuo

Salahuddin Al Azad

 Ee-Hu Lim

Mortuza Ali

Nghia Ho

 

The appearance of inexpensive and ever more powerful processors coupled with faster network access, the continuing expansion of the Internet, and a significant increase in both research and standardisation have all contributed to the infrastructure of modern video/image processing technology. This technology has supported, and continues to enable, a raft of multimedia applications as diverse as home and disaster zone monitoring, video-on-demand, videoconferencing, cellular videophones, remote sensing, telemedicine, interactive multimedia databases, multimedia videotex, computer games, multimedia annotation, communication aids for deaf people, security surveillance, and broadcasting and streaming. Over the last decade, a number of researchers have engaged themselves in this area. Monash University is one of the pioneers in this field.

This workshop will explore the innovations of current researchers of two Monash Research groups:

 

We hope that it will provide an excellent opportunity for members of both groups, as well as other researchers in this field, to interact and to discuss common issues and problems. We invite all interested people.

The topics cover:

Organizers

Manoranjan Paul and Manzur Murshed, GSIT, Monash University

Program

10:00 - 10:10 Introduction - Manoranjan Paul and A/Prof. Manzur Murshed

10:10 - 11:10 Professor David Suter

11:10 - 11:30 Ashfaqur Rahman

11:30 - 12:00 Hang Zhuo

12:00 - 12:20 Salahuddin Al Azad

12:20 - 12:50 Ee-Hu Lim

12:50 - 1:10 Mortuza Ali

1:10 - 1:40 Nghia Ho

Talks

Speaker: Professor David Suter

Title: An overview of the projects undertaken in the Institute for Vision Systems Engineering  

My group works on the extraction and exploitation of information contained in images. Much of the work is "low-level" primitive information extraction: e.g., which parts of the image correspond to the objects of potential interest, how they are moving through the image sequence (tracking), how to find the same object or person in different views, etc. Some of the work involves higher-level information (e.g., face recognition, gait and activity recognition, human pose and motion capture). Moreover, we "stray" into areas such as graphics (human motion and capture), image and film restoration, and biomedical image analysis; and we use information from sensors other than cameras (e.g., laser scanners). This talk will give an overview of the work currently carried out by the group.

David Suter holds a BSc in Applied Mathematics and Physics (Flinders), a Grad. Dip. Comp. (RMIT) and a PhD (La Trobe) in Computer Science with a thesis title containing the usual mix of important sounding words relating (both the title and the thesis) to computer vision. He has lectured at La Trobe (Lecturer 1988-1991) and at Monash (Senior Lecturer, Assoc. Prof., Professor 1992-2005, Prof. 2005-present) Universities.

His research activities focus on topics such as: motion estimation from images (including optic flow), structure from motion, image segmentation, biomedical image analysis, human motion capture and animation, visual tracking, activity detection and classification, face recognition, and the construction of building models from laser scan and image data. He is a member of the ARC Centre for Perceptive and Intelligent Machines in Complex Environments, and the Institute for Vision Systems Engineering (the latter he directs).

 Currently he is an associate editor for the International Journal of Computer Vision, and for the journal Machine Vision and Applications (having also just finished a stint as an associate editor for the International Journal of Image and Graphics).

Speaker: Ashfaqur Rahman

Title: Analyzing temporal textures using approximated motion measures

This talk presents a body of work on analyzing temporal textures using their distinctive dynamics, as captured by approximated motion measures. Temporal textures are motion patterns that exhibit spatiotemporal regularity. The motion of a flock of flying birds, water streams, fluttering leaves, and waving flags are some of the most common examples of this type of motion. Analyzing temporal textures has important implications for state-of-the-art computer vision research, including automated video surveillance and robot navigation. The dynamics of temporal textures are identified by their motion, and regularities in their movement patterns are captured by motion distribution statistics. We will present an overview of characterizing and synthesizing such dynamic textures using the underlying motion distribution statistics.

About the presenter: Ashfaqur Rahman completed his B.Sc. Engg. in the Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, in 2001. He was a lecturer in the same department and has been pursuing his PhD at GSIT, Monash University, since June 2003.

Speaker: Hang Zhuo

Title: Wireless Video Transmission over CDMA

Wireless video transmission over a CDMA network enables the transfer of high-quality live video via public wireless broadband, which is accessible anywhere with CDMA mobile phone network coverage. Real-time video can be received on an internet-connected PC, laptop or even mobile phone, at the user’s fingertips. This makes possible surveillance from moving vehicles and quick deployment in almost unrestricted locations. While many general-purpose solutions exist for sending video over CDMA, they do not take into consideration certain specific features of the network or the requirements of the related applications. We present here a new architecture for video transmission over CDMA with low hardware complexity, high reliability and rich functionality. The compact and flexible system has been under trial in countries including China, Australia, Singapore, Italy, Egypt and Brazil.

Title: Urban Modeling for Video Walkthrough and Surveillance

With growing demand in the areas of town planning, precise navigation and 3D video surveillance, the need for automatic and efficient urban modeling is more urgent than ever. The complexity of the application requirements and the technological challenges make urban modeling an intensive research area. Our research focuses on the reliable extraction of man-made structures from 3D terrestrial laser scanner point clouds and color images, which is the crucial step in urban modeling. The variability exhibited in 3D and 2D data, as well as the large amount of clutter that is always present, makes it a challenging problem. Our approach starts by extracting multiple coherent man-made structures from 3D point clouds and regions from the corresponding color images, and then labels them based on prior models of the expected objects. A local-structure-based minimum volume ellipsoid (MVE) estimator is used to compensate for the non-uniform and directional nature of point cloud data sampled in outdoor environments. For color images, a causal multi-scale random field (MSRF) is used as a prior model on the class labels to capture spatial dependencies of the labels as well as the image data.

 

Speaker: Salahuddin Al Azad

Title: New strategies for efficient video-on-demand systems

Video-on-demand (VOD) lets a viewer watch a selected video at any time, independent of the choices of other viewers, as if watching a rented cassette. Although VOD has been around for many years, commercially successful deployment is limited to some metropolitan areas, since customers are not willing to pay more for VOD than they currently pay for rental video cassettes. The I/O and network bandwidth requirement for VOD deployment is still very high because the client population is likely to be very large, with different clients issuing requests asynchronously, which is exacerbated by VCR function requests. Fortunately, the evolution of broadcast and multicast techniques in data networks has helped reduce the bandwidth requirement for VOD significantly by replacing true VOD systems with near VOD (NVOD) systems. NVOD systems are scalable; however, customers have to wait for a while before service starts. The major problem in an NVOD system is to improve the response time while keeping the bandwidth requirement within limits. The aim of our project is to improve current state-of-the-art VOD systems and so contribute to their easy deployment and commercial success. Our contributions include the development of a number of new NVOD schemes with much lower response times, depending on the resource constraints and the type of service customers expect.
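The trade-off between bandwidth and response time in an NVOD system can be illustrated with plain staggered broadcasting, a classic baseline scheme. This is only a sketch of that baseline, not one of the new schemes presented in the talk; the function name and the assumption that requests arrive uniformly in time are illustrative.

```python
def staggered_broadcast_wait(video_minutes: float, channels: int) -> tuple[float, float]:
    """Worst-case and mean start-up wait for staggered broadcasting.

    Assumption (illustrative): the same video is looped on `channels`
    channels, each offset by video_minutes / channels, and viewer
    requests arrive uniformly in time.
    """
    if channels < 1:
        raise ValueError("at least one channel is required")
    worst_case = video_minutes / channels  # gap between consecutive starts
    mean = worst_case / 2                  # uniform arrivals within that gap
    return worst_case, mean


# A 120-minute video on 6 channels: a new showing starts every 20 minutes.
print(staggered_broadcast_wait(120, 6))
```

Under these assumptions, halving the wait requires doubling the number of channels, which is why schemes with lower response time for the same bandwidth are of interest.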

Speaker: Ee-Hu Lim

Title: Generation of 3D Urban Model from LIDAR Data for Video Walkthrough and Surveillance

3D urban modelling is increasingly important, with applications such as regional planning and environmental simulations. Photogrammetry and LIDAR are methods for acquiring urban point cloud data. The research is focused on using LIDAR data to build a high-quality ground-based 3D urban model for video walkthrough and video surveillance. The equipment used in the research is a Riegl LMS-Z420i Terrestrial Laser Scanner with a calibrated Nikon D100 6-megapixel digital camera (14mm lens) mounted on top of the scanner.

The presentation will show the proposed framework, which includes a classification stage that processes the raw 3D point clouds and divides them into terrain, off-terrain and vegetation data. We aim to represent the off-terrain points, which mainly consist of building surface data, with solid geometry models. Due to the existence of a large number of structures in the building data, plane fitting is difficult to perform accurately. Therefore the original point clouds are divided into cubes and plane fitting is performed on each segment. The detected plane patches are grouped by analysing their co-normality and co-linearity. Hypotheses about geometric constraints between planes are implemented to improve the parameters of the final building models.
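The cube-wise plane fitting step can be sketched as follows. This is a minimal illustration assuming a total least-squares fit via SVD inside each cube; the abstract does not say which fitting method the framework uses, and the function names and the `cube_size` and `min_points` parameters are hypothetical.

```python
from collections import defaultdict

import numpy as np


def fit_plane(points: np.ndarray):
    """Total least-squares plane through an (n, 3) array of points.

    Returns (unit normal, centroid, RMS point-to-plane distance).
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, s, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    rms = s[-1] / np.sqrt(len(points))
    return normal, centroid, rms


def planes_per_cube(points: np.ndarray, cube_size: float, min_points: int = 3):
    """Bin points into axis-aligned cubes and fit one plane per cube."""
    cells = defaultdict(list)
    keys = np.floor(points / cube_size).astype(int).tolist()
    for key, point in zip(map(tuple, keys), points):
        cells[key].append(point)
    # Cubes with fewer than min_points points cannot define a plane.
    return {key: fit_plane(np.array(pts))
            for key, pts in cells.items() if len(pts) >= min_points}
```

Patches from neighbouring cubes could then be grouped by comparing their normals (for example, by the absolute dot product) and their offsets, in the spirit of the co-normality test described above.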

A further issue discussed in the presentation is image occlusion removal using a minimal number of images taken at different times from the same location. The algorithm adopted is based on the concept that when an occlusion occurs there is generally a discontinuity around the boundary of the occlusion. The technique is capable of resolving occlusions where the background is seen only once in the set of occluded images.

Ee Hui completed a BE (Hons) in Electrical and Computer Systems Engineering at Monash University in 2004 and has been working towards a PhD degree at Monash University since March 2005. Working under the supervision of Prof David Suter and Prof Ray Jarvis, she is a member of Monash University’s Intelligent Robotics Research Centre (IRRC) and the Institute for Vision Systems Engineering.

Speaker: Mortuza Ali

Title: Lossless video coding using distributed source coding techniques

State-of-the-art video coding standards, e.g. H.263/4 and MPEG-1/2/4, have been developed primarily for applications where some loss of information without much degradation of the visual quality is acceptable. Many applications, however, such as archiving master copies of digital movies and capturing medical videos, indicate the growing demand for, and importance of, efficient lossless video compression algorithms. The lossless video compression schemes proposed in the literature aim to combine techniques from existing lossless image compression and lossy block-based motion-compensated video coding standards. In contrast to the conventional approach, we have adopted a completely different paradigm, namely distributed source coding, for lossless video compression. Distributed source coding is radically different from conventional coding, and depends heavily on the concepts of source coding, channel coding, and estimation theory. This talk will present how distributed source coding techniques can be used in efficient lossless video coding.
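The link between distributed source coding and channel coding can be seen in a well-known textbook toy example, sketched here under stated assumptions; it is not the coding scheme of the talk. Assume the source X is a 7-bit word and the decoder holds side information Y that differs from X in at most one bit. The encoder, which never sees Y, sends only the 3-bit Hamming(7,4) syndrome of X, and the decoder still recovers X exactly.

```python
def syndrome(bits):
    """3-bit Hamming(7,4) syndrome: XOR of the 1-based positions of set bits."""
    s = 0
    for position, bit in enumerate(bits, start=1):
        if bit:
            s ^= position
    return s


def encode(x):
    """Encoder: compresses 7 bits to a 3-bit syndrome without access to y."""
    return syndrome(x)


def decode(s, y):
    """Decoder: recovers x from its syndrome and side information y.

    Assumes x and y differ in at most one bit position.
    """
    diff = s ^ syndrome(y)      # equals the syndrome of (x XOR y)
    x_hat = list(y)
    if diff:                    # non-zero: diff is the 1-based differing position
        x_hat[diff - 1] ^= 1
    return x_hat


x = [1, 0, 1, 1, 0, 0, 1]
y = [1, 0, 1, 0, 0, 0, 1]       # side information: x with bit 4 flipped
assert decode(encode(x), y) == x
```

In a video setting, Y would play the role of a prediction of the current frame available only at the decoder; practical schemes use much longer and stronger channel codes than this toy code.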

Mortuza Ali received his BSc in Computer Science & Engineering from Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh in 2001. He joined the Gippsland School of Computing and Information Technology (GSCIT), Monash University, as a postgraduate research student in June 2004. Before coming here he was a lecturer in the Department of Computer Science & Engineering, BUET. His research interests are in the fields of Distributed Source Coding, Distributed Video Coding, and Lossless Video Coding.

Speaker: Nghia Ho

Title: Localisation for mobile robots using vision and laser range data

Mobile robot navigation consists of moving a robotic vehicle through an obstacle field on an efficient collision-free path to a nominated goal. Knowing where the robot is (localisation) in the context of its working environment is critical to this task. In some cases, the initially unknown environment is, itself, mapped as the robot localises within it.

However, in many realistic situations a map of the environment is available, yet navigation still requires the location of the robot to be determined so that a devised path through the environment can be followed. This talk reports on the task of sensor-driven, natural-landmark-based localisation in an environment modelled using 3D range and image data extracted from a Riegl LMS-Z420i Terrestrial Laser Scanner.