Abstract: Many learning problems in computer vision can be posed as structured prediction problems, where the input and output instances are structured objects such as trees, graphs, or strings rather than single labels {+1, -1} or scalars. Kernel methods such as Structured Support Vector Machines, Twin Gaussian Processes (TGP), Structured Gaussian Processes, and vector-valued Reproducing Kernel Hilbert Spaces (RKHS) offer powerful ways to perform learning and inference over these domains. Positive definite kernel functions allow us to quantitatively capture similarity between a pair of instances over these arbitrary domains. A poor choice of the kernel function, which determines the RKHS feature space, often results in poor performance. Automatic kernel selection methods have been developed, but they have focused only on kernels over the input domain (i.e. 'one-way'). In this work, we propose a novel and efficient algorithm for learning kernel functions simultaneously on both the input and output domains. We introduce the idea of learning polynomial kernel transformations and call this method Simultaneous Twin Kernel Learning (STKL). STKL can learn arbitrary, but continuous, kernel functions and includes 'one-way' kernel learning as a special case. We formulate this problem for learning the covariance kernels of Twin Gaussian Processes. Our experimental evaluation with learned kernels on synthetic and several real-world datasets demonstrates consistent improvement in the performance of TGPs.
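The central idea, a learned polynomial transformation of base kernels on both the input and the output domain, can be sketched as follows. The snippet is a minimal NumPy illustration, not the STKL algorithm itself: the RBF base kernel and the coefficient vectors `alpha_x` and `alpha_y` (and their values) are assumptions made for the example; in STKL such coefficients would be learned jointly rather than fixed.

```python
# A minimal sketch (not the authors' STKL implementation) of the core idea:
# a polynomial transformation of base kernels on BOTH the input and the
# output domain. The coefficient vectors `alpha_x` and `alpha_y` stand in
# for the learnable parameters; here they are fixed for illustration.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Base RBF kernel between the row vectors of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def polynomial_transform(K, alpha):
    """k'(x, x') = sum_d alpha_d * k(x, x')**d, with alpha_d >= 0."""
    return sum(a * K**(d + 1) for d, a in enumerate(alpha))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))    # inputs, flattened to vectors for the sketch
Y = rng.normal(size=(50, 3))    # outputs, likewise

alpha_x = np.array([0.5, 0.3, 0.2])   # hypothetical learned coefficients
alpha_y = np.array([0.7, 0.3])

K_x = polynomial_transform(rbf_kernel(X, X), alpha_x)   # input-domain kernel
K_y = polynomial_transform(rbf_kernel(Y, Y), alpha_y)   # output-domain kernel
```

Because non-negative combinations of element-wise powers of a positive definite kernel remain positive definite, the transformed matrices stay valid covariance kernels for a TGP.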
Abstract: To help gauge the health of coral reef ecosystems, we developed a prototype of an underwater camera module to automatically census reef fish populations. Recognition challenges include pose and lighting variations, complicated backgrounds, within-species color variations, and within-family similarities among species. An open frame holds two cameras, LED lights, and two ‘background’ panels in an L-shaped configuration. High-resolution cameras send sequences of 300 synchronized image pairs at 10 fps to an on-shore PC. Approximately 200 sequences containing fish were recorded at the New York Aquarium’s Glover’s Reef exhibit. These contained eight ‘common’ species with 85–672 images each, and eight ‘rare’ species with 5–27 images each that were grouped into an ‘unknown/rare’ category for classification. Image pre-processing included background modeling and subtraction, and tracking of fish across frames for depth estimation, pose correction, scaling, and disambiguation of overlapping fish. Shape features were obtained from principal component analysis (PCA) of perimeter points, color features from opponent color histograms, and ‘banding’ features from the discrete cosine transform (DCT) of vertical projections. Images were classified to species using feedforward neural networks arranged in a three-level hierarchy in which errors remaining after each level are targeted by networks in the level below. Networks were trained and tested on independent image sets. Overall accuracy of species-specific identifications typically exceeded 96% across multiple training runs. A seaworthy version of our system will allow for population censuses with high temporal resolution, and therefore improved statistical power to detect trends. A network of such devices could provide an ‘early warning system’ for coral ecosystem collapse.
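As an illustration of two of the hand-crafted feature types mentioned above, the sketch below computes opponent color histograms and DCT-based ‘banding’ features for a fish crop. It is a hedged approximation under stated assumptions (the opponent-channel definitions, bin counts, number of DCT coefficients kept, and the reading of ‘vertical projection’ as column means are illustrative choices), not the system's actual code.

```python
# A minimal sketch (assumptions, not the deployed pipeline) of two of the
# described feature types: opponent-color histograms and 'banding' features
# from the DCT of the image's vertical projection. Function names, bin
# counts, and the number of DCT coefficients kept are illustrative choices.
import numpy as np
from scipy.fft import dct

def opponent_color_histogram(rgb, bins=16):
    """Histogram the standard opponent-color channels of an RGB image."""
    r, g, b = (rgb[..., i].astype(float) for i in range(3))
    o1 = (r - g) / np.sqrt(2)            # red-green opponency
    o2 = (r + g - 2 * b) / np.sqrt(6)    # yellow-blue opponency
    o3 = (r + g + b) / np.sqrt(3)        # intensity
    hists = [np.histogram(ch, bins=bins)[0] for ch in (o1, o2, o3)]
    return np.concatenate(hists).astype(float)

def banding_features(gray, n_coeffs=10):
    """DCT of a 1-D vertical projection (here: column means) captures striping."""
    projection = gray.astype(float).mean(axis=0)
    return dct(projection, norm='ortho')[:n_coeffs]

# Toy usage on a random 'fish crop'; a real crop would come from the
# background-subtracted, pose-corrected detections described above.
rgb = np.random.default_rng(1).integers(0, 256, size=(64, 128, 3))
gray = rgb.mean(axis=2)
features = np.concatenate([opponent_color_histogram(rgb), banding_features(gray)])
```

A feature vector of this kind would then be passed to the hierarchical feedforward networks for species classification.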