CogViS stands for Cognitive Vision Systems.
Please visit the project's main web-site at http://cogvis.nada.kth.se/ for additional information.
Three mailing lists local to KOGS/CSL have been created; these are:
For internal use, there's also a repository of many mails sent to the cogvis list, and in particular of all the meeting minutes.
As part of CogViS, an annotated bibliography will be created. It will use BibTeX as its back-end, and the CSL part will be searchable here. If you are working from inside CSL, you can also edit it online by clicking here.
As part of CogViS, we have set up an environment which allows us to capture images from 3 synchronised video cameras at a resolution of 1024x779 and a frame rate of up to 38 frames per second. Up to 32 minutes of consecutive frames can be grabbed using this setup.
We are using two 1-chip RGB (Bayer-filter) cameras, DFD-5013-HS, and one monochrome camera, DMD-5013-HS. The cameras use a 10-bit LVDS signal for video output, which allows the capture of non-interlaced images exceeding the maximum size of the PAL or NTSC standards most analogue cameras adhere to.
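As an aside on what "1-chip RGB (Bayer filter)" implies: each sensor pixel records only one colour channel, so the raw images have to be demosaiced before use. Below is a deliberately simple sketch, assuming (our assumption; the actual filter layout of the DFD-5013-HS is not stated here) an RGGB pattern. It trades resolution for simplicity by collapsing each 2x2 cell into one RGB pixel; this is not the project's actual conversion code.

```cpp
#include <cstdint>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// Simplest possible demosaicing: each 2x2 RGGB cell becomes one RGB
// pixel, averaging the two green samples. Output is w/2 x h/2.
std::vector<RGB> demosaicRGGB(const std::vector<std::uint8_t>& raw,
                              int w, int h) {
    std::vector<RGB> out((w / 2) * (h / 2));
    for (int y = 0; y < h - 1; y += 2)
        for (int x = 0; x < w - 1; x += 2) {
            std::uint8_t r  = raw[y * w + x];           // R sample
            std::uint8_t g1 = raw[y * w + x + 1];       // G sample
            std::uint8_t g2 = raw[(y + 1) * w + x];     // G sample
            std::uint8_t b  = raw[(y + 1) * w + x + 1]; // B sample
            out[(y / 2) * (w / 2) + x / 2] =
                {r, std::uint8_t((g1 + g2) / 2), b};
        }
    return out;
}
```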
Each camera is connected to a Matrox Meteor II/Dig digital framegrabber in a dedicated PC running Windows NT 4.0. The PCs use a Gigabyte GA 7VTXE+ Socket A mainboard with an AMD Athlon XP 1800+ processor (at 1533 MHz). Each PC has two IBM IC35L060 61.4 GB hard disks, each as master on its own UDMA100 IDE channel (although a Teac CD-540E CD-ROM drive is also connected to one of them as a slave). We used c't's HDD benchmark H2Benchw to time disk writes; these tests show that only the first 50% of each disk is fast enough for our purpose, i.e. a sustained write rate of more than 15 MB/s.
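Those figures are roughly consistent with each other. A back-of-envelope check, under our own assumptions (not stated above) that frames are stored as raw 8-bit data and written alternately to a PC's two disks:

```cpp
#include <cstdio>

int main() {
    // Assumptions (ours): 8 bits/pixel raw frames, alternating writes
    // to the two disks, only the fast half of each disk usable.
    const double bytesPerFrame = 1024.0 * 779.0;          // ~0.8 MB per frame
    const double streamMBs = bytesPerFrame * 38.0 / 1e6;  // ~30.3 MB/s in total
    const double perDiskMBs = streamMBs / 2.0;            // ~15.2 MB/s per disk
    const double usableGB = 2.0 * (61.4 / 2.0);           // fast halves of both disks
    const double minutes = usableGB * 1e3 / streamMBs / 60.0;

    std::printf("%.1f MB/s total, %.1f MB/s per disk, %.1f GB -> %.1f min\n",
                streamMBs, perDiskMBs, usableGB, minutes);
}
```

This gives just over 15 MB/s per disk and about 34 minutes of recording, close to the quoted 15 MB/s and 32 minutes.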
For synchronised capture, the output of a function generator (a 5 V peak-to-peak square pulse) is connected to the TTL trigger input of all three framegrabbers.
As the Matrox Meteor II/Dig ships with its own library, but with hardly any precompiled (or even prewritten) software, we had to write our own routines for sequence capture. Basically, these use two separate threads: one grabbing images into a ring buffer, the other writing them to disk.
As there seems to be a bug in the official synchronisation mechanism (i.e. the end of a grab is signalled long before the actual image has been written to memory), we simply make sure that grabbing precedes writing by at least three images. We use a ring buffer of 128 images, but so far haven't encountered any need for a buffer of more than 12 images in practice. The actual source code can be obtained from Sven Utcke (anybody working at KOGS/CSL can also look here for more information).
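For illustration, here is a minimal sketch of that scheme; grab_into() and write_out() are hypothetical placeholders, not the Matrox MIL calls used in the actual code.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

constexpr int RING = 128;  // ring-buffer size used in the real setup
constexpr int LAG  = 3;    // grabbing must lead writing by >= 3 frames

struct Frame { std::vector<unsigned char> px = std::vector<unsigned char>(1024 * 779); };

static Frame ring[RING];
static std::mutex m;
static std::condition_variable cv;
static long grabbed = 0;
static bool done = false;

void grab_into(Frame&) { /* triggered grab + DMA transfer would go here */ }
void write_out(const Frame&) { /* sequential disk write would go here */ }

void grabber(long total) {
    for (long i = 0; i < total; ++i) {
        grab_into(ring[i % RING]);
        { std::lock_guard<std::mutex> lk(m); grabbed = i + 1; }
        cv.notify_one();
    }
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_one();
}

void writer() {
    for (long next = 0;; ++next) {
        {
            std::unique_lock<std::mutex> lk(m);
            // Stay at least LAG frames behind the grabber; this works
            // around the premature end-of-grab signal mentioned above.
            cv.wait(lk, [&] { return grabbed - next >= LAG || done; });
            if (done && next == grabbed) break;
        }
        // NOTE: this sketch omits back-pressure; real code must also
        // ensure the grabber never laps the writer by RING frames.
        write_out(ring[next % RING]);
    }
}

int main() {
    std::thread g(grabber, 1000), w(writer);
    g.join();
    w.join();
}
```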
This section gives an overview of the different approaches taken within our group.
As mentioned above, we use up to 3 synchronised cameras to take image sequences of our sample scene. These need to be calibrated, so that both the lens distortion and the cameras' pose relative to the scene are known.
Yildirim Karal, a student at Hamburg University, is working on both aspects for his 3rd-year project ("Studienarbeit"). As of Nov. 2002 he is developing the algorithms for the lens calibration. So far he can take an image of chequered paper at an arbitrary angle (Figure a), find the pixels belonging to lines by adaptive thresholding (Figure b), and find approximate lines through these points by Hough transform (Figures c and d). These will be used to determine an approximation of the angle of projection and, from there, both the exact angle of projection and the lens distortion simultaneously. A minimal sketch of the Hough step follows the figure.
[Figure: a) input image of chequered paper; b) line pixels found by adaptive thresholding; c), d) approximate lines found by the Hough transform]
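For illustration, a minimal sketch of the Hough step: every line pixel votes for all lines rho = x*cos(theta) + y*sin(theta) passing through it, and peaks in the accumulator are candidate grid lines. The accumulator resolution and vote threshold are our own illustrative choices, not taken from the project code.

```cpp
#include <cmath>
#include <utility>
#include <vector>

struct Line { double rho, theta; int votes; };

// pts: (x, y) coordinates of line pixels; w, h: image size.
std::vector<Line> houghLines(const std::vector<std::pair<int, int>>& pts,
                             int w, int h, int minVotes) {
    const double PI = 3.14159265358979323846;
    const int nTheta = 180;        // 1-degree steps
    const int nRho = 2 * (w + h);  // rho in [-(w+h), w+h)
    std::vector<int> acc(nTheta * nRho, 0);
    for (const auto& p : pts)
        for (int t = 0; t < nTheta; ++t) {
            double th = t * PI / nTheta;
            int r = (int)std::lround(p.first * std::cos(th) +
                                     p.second * std::sin(th)) + (w + h);
            ++acc[t * nRho + r];   // one vote per (rho, theta) cell
        }
    std::vector<Line> lines;       // a real version would also apply
    for (int t = 0; t < nTheta; ++t)  // non-maximum suppression here
        for (int r = 0; r < nRho; ++r)
            if (acc[t * nRho + r] >= minVotes)
                lines.push_back({double(r - (w + h)), t * PI / nTheta,
                                 acc[t * nRho + r]});
    return lines;
}
```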
This is another 3rd-year project, trying to develop simple blob tracking, not unlike the Leeds tracker, but using a different colour space. So far, only a static background is used in the background subtraction, but this already gives rather promising results (if you ignore, for a moment, the guy who was standing in the top left corner of the background image :-). A minimal sketch of the differencing step follows the images.
[Figure: example frames from a test sequence, shown as raw Bayer images (top row), RGB images (middle row), and difference images (bottom row)]
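For illustration, a minimal sketch of the differencing step against a static background. The tracker's actual colour space is not reproduced here; plain per-channel RGB differences are shown purely as a stand-in, and the threshold is our own illustrative parameter.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

struct RGB { std::uint8_t r, g, b; };

// Returns a binary foreground mask: 1 where the frame differs from the
// static background by more than `thresh` (summed absolute channel
// differences), 0 elsewhere. Blobs would then be extracted by grouping
// connected foreground pixels.
std::vector<std::uint8_t> foregroundMask(const std::vector<RGB>& frame,
                                         const std::vector<RGB>& background,
                                         int thresh) {
    std::vector<std::uint8_t> mask(frame.size(), 0);
    for (std::size_t i = 0; i < frame.size(); ++i) {
        int d = std::abs(frame[i].r - background[i].r)
              + std::abs(frame[i].g - background[i].g)
              + std::abs(frame[i].b - background[i].b);
        mask[i] = d > thresh ? 1 : 0;
    }
    return mask;
}
```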
Typical high-level concepts which must be recognized in high-level vision are composed of multiple objects subject to temporal and spatial constraints. Our guiding example is "setting the table", which is modelled as a concept composed of loosely coordinated individual placement actions. The idea is to learn such a concept from observations. We investigate both supervised and unsupervised learning techniques.