Date:
January 27, 2006
Time:
2:15 p.m.
Location:
Computer Science Bldg 2311
Speaker:
Margo Seltzer, Harvard University
Title:
Provenance-Aware Storage Systems
Abstract:
As our scientific, business, and archival functions become increasingly
dependent upon online data and processing, data provenance (or lineage)
becomes increasingly important. Scientific disciplines rely on data
provenance to validate their experimental results; archivists rely on
provenance to prove or preserve an object's integrity; and businesses rely
on provenance to adhere to government regulations. In this talk I will
present an introduction to the problem of data provenance and how we are
tackling the challenges by constructing storage systems that are aware of
provenance, treat it as a first-class storage object, and maintain and
generate it automatically wherever possible.
Speaker Bio:
Margo I. Seltzer is the Associate Dean for Computer Science and Engineering,
the Herchel Smith Professor of Computer Science and a Harvard College
Professor in the Division of Engineering and Applied Sciences at Harvard
University. Her research interests include file systems, databases, and
transaction processing systems. She is the author of several widely-used
software packages including database and transaction libraries and the
4.4BSD log-structured file system. Dr. Seltzer is also a founder and CTO of
Sleepycat Software, the makers of Berkeley DB. She is a Sloan Foundation
Fellow in Computer Science, a Bunting Fellow, and was the recipient of the
1996 Radcliffe Junior Faculty Fellowship and the University of California
Microelectronics Scholarship. She is recognized as an outstanding teacher
and won the Phi Beta Kappa teaching award in 1996 and the Abrahmson Teaching
Award in 1999. Dr. Seltzer received an A.B. degree in Applied Mathematics
from Harvard/Radcliffe College in 1983 and a Ph.D. in Computer Science from
the University of California, Berkeley, in 1992.
|
Date:
February 3, 2006
Time:
2:15 p.m.
Location:
Computer Science Bldg 2311
Speaker:
Klaus Mueller, SUNY at Stony Brook
Title:
Semantics-Aware Visualization
Abstract:
Multi-modal and multi-resolution datasets have become ubiquitous in a
wide range of disciplines, such as science, engineering, medicine,
architecture, and even entertainment. There is a great demand to
meaningfully fuse these types of data and visualize them efficiently.
This requires the incorporation of data semantics into the visualization
pipeline. In this talk, I will describe our Semantic Lens framework. It
fuses and augments different types and resolutions of image data into
one composite representation, using texture grammar and texture
synthesis, and it provides a variety of zoom lenses for focus+context
GPU-accelerated viewing within the semantic context. The general theme
of this framework has been implemented for both scientific and medical
visualization, and in a virtual reality application for large-scale
urban architecture.
|
Date:
February 10, 2006
Time:
2:15 p.m.
Location:
Computer Science Bldg 2311
Speaker:
Alex Borgida, Rutgers University
Title:
So what's so special about Description Logics?
Abstract:
The vision of a Semantic Web requires an "ontology language" to help
express the semantic content of web pages and services. The current W3C
proposal, OWL, is based on Description Logics (DLs). Such ontology
languages also appear to be highly useful for information integration.
Much of this talk is tutorial in nature, expounding on my prejudices of
what makes DLs good and bad. This part is illustrated with an example
showing how additional "semantic" information can be added to "syntactic"
service descriptions in CORBA IDL.
One obstacle to the Semantic Web vision is the acknowledged expectation
that there will be no single ontology, but rather a (growing) collection of
independently maintained ones. The last, more technical, part of the talk
will concern the notion of Distributed Description Logics, which we have
investigated with Luciano Serafini. A potential failing of most
distributed reasoner semantics is that the inconsistency of a single local
source "infects" the entire system, so that an expensive global consistency
check must always be made. We show how our simple notion of "semantic
holes" provides a solution to this problem.
Speaker Bio:
Alex Borgida holds a PhD degree from the University
of Toronto, and is a Professor of Computer Science at Rutgers
University, New Brunswick, NJ. His research is mainly concerned with
knowledge representation and its applications. He has published in a
variety of areas including Artificial Intelligence, Databases, and Software
Engineering. The main unifying thread of this work is a
belief in the importance of languages, which shape the way we
think of a problem (an unabashed Whorfian!), and the need to be
precise and logical about the semantics of such languages. Alex is
co-recipient of the most influential paper award of the 1994
International Conference on Software Engineering, and is proud to have
contributed to the design and implementation of the Classic
language/logic, which was used by AT&T as part of a system that
configured "billions of dollars' worth of equipment".
|
Date: February 17, 2006
Time: 12:30 p.m.
Location: Computer Science Bldg 2311
Speaker:
Yaron Caspi, Tel-Aviv University, Israel
Title:
Video Visualization - Beyond Pixels and Frames
Abstract:
Video data is represented by pixels and frames. This restricts the way it
is captured, accessed and visualized. On one hand, visual information is
distributed across all frames, and therefore, in order to depict the
visual information, the entire video sequence must be viewed sequentially,
frame by frame. On the other hand, important visual information is lost
because of the limited frame rate. Similarly, in the spatial domain, sensor
and optics limit the capturing process, while huge redundancy prevents an
efficient visualization of information. In this talk I will show how to
exceed the limitations of both capturing devices and visual displays. In
particular, I will show how fusion of information from multiple sources
allows us to exceed temporal and spatial limitations, and how
visualization of video data can benefit from importance ranking. I will
describe a process that depicts the essence of video or animation by
embedding high-dimensional data in a low-dimensional Euclidean space. I
will also show how super-pixels (in contrast to pixels) contribute to the
exploitation of temporal redundancy for the task of spatial segmentation
of regions with high importance.
|
Date: February 24, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Julie Dorsey, Yale University
Title:
Digital Materials: Modeling the Appearance of the Everyday World
|
Date: March 3, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Amanda Stent, SUNY at Stony Brook
Title:
Adaptation in Spoken Dialog with Computers
Abstract:
In this talk I describe two experiments designed to address aspects of the
following question: How do humans adapt in conversation with computers?
The first experiment, a Wizard-of-Oz experiment, looks at
hyperarticulation, or clear speech, directed at computers after evidence
of misrecognition. The second experiment, which uses an implemented
spoken dialog system, looks at adaptation in word choice and initiative.
I then discuss how the results of these experiments will help us design
better behaviors into spoken dialog systems.
|
Date: March 8, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Darren Gergle, Carnegie Mellon University
Title:
The Value of Shared Visual Information for Collaborative Task Performance
Abstract:
For several decades, researchers and engineers have struggled to develop
systems to support distance collaboration. The failure of many
collaborative technologies is due, in part, to a limited understanding
of how groups coordinate in collocated environments and how the
coordination mechanisms of face-to-face collaboration are impacted by
technology.
My research is building a theoretical understanding of the role shared
visual information plays in communication and collaboration. Visual
information provides evidence to support both situation awareness and
referential grounding. However, the effectiveness with which it can be
used for these purposes depends upon the features of the media (e.g.,
video refresh-rates) and the features of the task (e.g., the linguistic
complexity of the objects being discussed). At a theoretical level, my
research leads to an improved understanding of how features of tasks and
media, both alone and in combination, influence group coordination. At
an applied level, my work benefits developers by identifying the features
of technologies that enable people to work remotely, and
provides guidelines for the development of new technologies to support
distance collaboration.
|
Date: March 9, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Richard Souvenir, Washington University in St. Louis
Title:
Faculty Candidate Talk
Abstract:
The field of manifold learning provides powerful tools for parameterizing
high-dimensional data points with a small number of parameters when this
data lies on or near some manifold. Images can be thought of as points in
a high-dimensional image space where each coordinate represents the
intensity value of a single pixel. Direct application of these manifold
learning techniques has been successful on simple image sets such as
handwriting data and a statue undergoing rigid motion. However, they tend
to fail in the case of natural image sets, even those that only vary due
to a single degree of freedom, such as a heart beating in MR data. This
talk presents a framework which allows for the parameterization of data
sets commonly seen in natural video. This framework specializes manifold
learning for images by using image distance metrics that correspond to
natural image variations and extends manifold learning to cyclic and
intersecting topologies that occur in diagnostic medical image
applications. To support such applications on cardiac MRI, the framework
directly integrates the manifold embedding to provide new and stronger
constraints for segmentation and tracking.
Speaker Bio:
Richard Souvenir is a doctoral candidate in the Department of Computer
Science and Engineering at Washington University in St. Louis. He
received the M.S. degree in Computer Science in 2003 and B.S. in Computer
Science and Biology in 2001, both from Washington University. His
research in computer vision focuses on discovering natural
parameterizations of video data and includes aspects of machine learning
and medical imaging. He is a recipient of the National Science Foundation
Graduate Research Fellowship.
|
Date: March 10, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
John A. Stankovic, University of Virginia
Title:
Self-Organizing Wireless Sensor Networks in Action
|
Date: March 13, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Colin Dewey, UC Berkeley
Title:
Whole-genome alignments and polytopes for comparative genomics
Abstract:
Whole-genome sequencing of many species has presented us with the
opportunity to deduce the evolutionary relationships between each and
every nucleotide. In this talk, I will present algorithms for this
problem, which is that of multiple whole-genome alignment. The
sensitivity of whole-genome alignments to parameter values can be
ascertained through the use of alignment polytopes, which will be
explained. I will also show how whole-genome alignments are used in
comparative genomics, including the identification of novel genes,
the location of micro-RNA targets, and the elucidation of cis-
regulatory element and splicing signal evolution.
Speaker Bio:
Colin Dewey was an undergraduate at the University of California,
Berkeley, where he majored in Electrical Engineering and Computer
Sciences with an honors breadth area in Molecular Biology. After
receiving his B.S. with high honors in 2001, he continued on as a
graduate student at Berkeley under the guidance of Lior Pachter. He
will receive his Ph.D. in Electrical Engineering and Computer
Sciences with a Designated Emphasis in Computational and Genomic
Biology in May 2006.
Driven by his interests in molecular evolution and algorithm design,
Colin has focused his graduate research on the development of
algorithms for comparing multiple whole genome sequences. He has
participated in the international sequencing projects for the mouse,
rat, and chicken genomes and is currently a member of the ENCODE
Consortium, which aims to construct a catalog of all functional
elements in the human genome. He has also collaborated with
scientists at the National Center for Biotechnology Information.
|
Date: March 15, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Xifeng Yang, University of Illinois at Urbana-Champaign
Title:
Mining and searching massive graph databases
Abstract:
Graphs are ubiquitous with critical applications in domains ranging from
software engineering to computational biology. However, it is challenging to
analyze any reasonably large collection of graphs due to its high
computational complexity. Development of scalable methods for the analysis
of massive graph databases thus becomes one major thrust in data mining and
database research.
At the core of many graph-related applications, there are two fundamental
problems: how to mine graph patterns and how to process graph queries. My
initial study revealed a tight connection between these two seemingly
parallel areas. In this talk, I will first present a novel graph canonical
labeling system that is able to speed up the discovery of frequent subgraph
patterns. Next, discriminative pattern analysis is introduced for
constructing compact yet high-quality graph indices, which are then applied
to exact and approximate graph search in large graph databases. Such an
index mechanism is shown to be very effective in processing graph queries.
The finding of graph pattern-based indexing is profound and yet to be fully
explored since the same concept can also be applied to graph classification
and clustering. At the end of my talk, I am going to examine broader
applications of graph patterns, such as biological network analysis for
functional annotation and program flow classification for software bug
isolation.
|
Date: March 17, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Jennifer L. Wong, University of California, Los Angeles
Title:
Design of Embedded Systems using Data-driven Statistical Techniques
Abstract:
A large variety of applications in pervasive computing, sensor
networks, and microbiological systems are driven by the capabilities
of modern embedded systems. These systems have posed a number of
exciting research challenges due to recent technology and
application trends. The design and analysis of embedded systems are
difficult due to the presence of variability and limited
predictability. For example, links in wireless ad-hoc networks are
often intermittent and properties of integrated circuits are subject
to manufacturing variability. I will address one of the challenges
of low-power wireless embedded systems: deployment for enabling more
energy-efficient communication. The basic premise of my work is to
use collected data from traces of deployed or operational systems to
drive both the analysis and realization of the embedded system and
that any proposed optimization must be demonstrated on actual
operational systems. My analysis leverages non-parametric
statistical techniques, identifies design insights and tractable,
accurate optimization objectives, and serves as a basis for fast and
realistic simulation.
Specifically, I will demonstrate the importance and effectiveness of
fast on-line lossy link prediction. I then apply the link metric in
a force-directed optimization framework that addresses the
network deployment problem by mapping the location discovery problem
into communication space, where the distance between two nodes is
a function of the cost of communication between the nodes. By using
Delaunay tessellations, I have developed approaches for network
deployment, node relocation, and other related deployment problems.
I evaluated the effectiveness of both the link model and the
optimization approach in several testbeds and for several typical
tasks such as peer-to-peer communication and data collection.
Speaker Bio:
Jennifer L. Wong is a PhD candidate at the University of California,
Los Angeles. She also received her B.S. in Computer Engineering and
M.S. in Computer Science from UCLA. Her main research interests
include statistical modeling for the design of embedded systems,
low-power design optimization, sensor networks, and intellectual
property protection.
|
Date: March 20, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Bo Pang
Title:
A sentimental education: machine learning problems using opinion-oriented text
Abstract:
Sentiment analysis seeks to identify the viewpoint(s) underlying a text
span. Potential applications include question-answering systems that
address opinions as opposed to facts and business intelligence systems
that analyze user feedback. The research issues raised by such
applications are often quite challenging compared to fact-based
analysis. These challenges, together with large amounts of
opinion-oriented text available through the web, make this area an
excellent testing ground for interesting ideas in natural language
processing, as well as machine learning, data mining, and theory. In
this talk, we illustrate the new challenges and opportunities with two
sentiment analysis tasks. In particular, we will describe how we modeled
different types of relations, which can have implications outside this
area as well.
One task that has attracted a great deal of attention is polarity
classification, e.g., classifying a movie review as "thumbs up" or
"thumbs down" from textual information alone. We considered a number
of approaches, including one that applies text categorization techniques
to just the subjective portions of the document. Extracting these
portions can be a hard problem itself; we describe an approach based on
efficient techniques for finding minimum cuts in graphs that incorporate
sentence-level relations. Another task, which can be viewed as a
non-standard multi-class classification task, is the rating-inference
problem, where one must determine the reviewer's evaluation with respect
to a multi-point scale (e.g. one to five "stars"). We apply a
meta-algorithm, based on a metric labeling formulation of the problem,
that explicitly exploits relations between classes. We show that the
meta-algorithm can provide significant improvements over both multi-class
and regression versions of SVMs when we employ a novel similarity measure
appropriate to the problem.
Portions of this work are joint with Lillian Lee and Shivakumar
Vaithyanathan.
|
Date: March 21, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Fei Sha, UPenn
Title:
Novel machine learning algorithms for sequential labeling problems
Abstract:
In this talk, I will describe novel machine learning algorithms that
have been successfully applied to problems in speech recognition and
natural language processing. The inputs to the algorithms are a
sequence of examples (such as words, or acoustic observations); the
outputs of the algorithms are a sequence of labels, one for each
example. What makes these sequential labeling problems interesting
and challenging is that the labeling decision for each example
depends on decisions on other examples.
For instance, in automatic speech recognition, the examples
correspond to overlapping segments of the speech signal. To recognize
the words in the speech, we need to label each segment with phonemes.
It is beneficial to explore contextual information by making "joint"
labeling decisions on adjacent segments as well. This approach often
leads to robust performance of speech recognizers.
Our new statistical learning algorithms build on the strengths of
conventional sequential labeling paradigms (such as hidden Markov
models) and the recent advances in statistical learning theory and
convex optimization techniques. The new approaches have yielded state-
of-the-art results on well-benchmarked problems in speech recognition
and natural language processing, for instance, recognizing sequences
of phonemes in utterances and identifying sequences of noun phrases
in the text.
Sequential labeling problems are ubiquitous and arise in many
domains. At the end of the talk, I will discuss the possibility of
extending and applying these algorithms to other types of sequential
labeling problems, for example, recognizing human activities in a
video sequence.
Speaker Bio:
Fei Sha is a Ph.D. candidate in the Dept. of Computer and Information
Science at the University of Pennsylvania. His primary research
interests are machine learning and its application to speech,
language processing, vision and robotics. He has worked on problems
arising from pattern classification as well as exploratory analysis
and visualization of high-dimensional data. As a coauthor, he won the
Best Student Paper Award at ICML-2004 for his work on manifold
learning. He holds a B.Sc. and an M.Sc. in Biomedical Engineering from
Southeast University, China.
|
Date: March 22, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Dr. Joseph J. LaViola Jr., Brown University
Title:
Mathematical Sketching: A New Approach for Creating and
Exploring Dynamic Illustrations
Abstract:
Diagrams and illustrations are often used to help explain mathematical
concepts. They are commonplace in mathematics and physics textbooks,
providing a form of physical intuition to otherwise abstract
principles. Similarly, students often use pencil and paper to create
diagrams for math problems as an intuitive aid in visualizing
relationships between variables, constants, and functions. However,
such static diagrams generally assist only in the initial formulation
of mathematical expressions but not in the "debugging" or analysis
of those expressions, which can be a severe visualization limitation
even for simple problems.
To overcome these limitations I developed mathematical sketching, a
novel approach to rapidly interacting with and visualizing
mathematical concepts through the fluid association of handwritten
mathematical notation and free-form diagrams. Mathematical
sketching derives from the familiar pencil-and-paper process of
drawing supporting diagrams to facilitate the formulation of
mathematical expressions; however, with a mathematical sketch, users
can also leverage their physical intuition by watching their
hand-drawn diagrams animate in response to continuous or discrete
parameter changes in their written formulas. In this talk, I will
discuss the critical components of mathematical sketching and present
the results of an initial user evaluation in the context of a
prototype application called MathPad^2.
Speaker Bio:
Joseph J. LaViola Jr. is currently a postdoctoral research associate
in Computer Science at Brown University. He works under the direction
of Andries van Dam and Robert Zeleznik in the Computer Graphics
Group. His primary research interests include pen-based interactive
computing, 3D interaction techniques, predictive motion tracking,
multimodal interaction in virtual environments, and user interface
evaluation. His work has appeared in journals such as Presence and
IEEE Computer Graphics & Applications, and he has presented research
at conferences including ACM SIGGRAPH, the ACM Symposium on
Interactive 3D Graphics, IEEE Virtual Reality, and Eurographics
Virtual Environments. He has also co-authored "3D User Interfaces:
Theory and Practice", the first comprehensive book on 3D user
interfaces. Joseph received a Sc.M. in Computer Science in 2000, a
Sc.M. in Applied Mathematics in 2001, and a Ph.D. in Computer Science
in 2005 from Brown University.
|
Date: March 27, 2006
Time: 2:15 p.m.
Location: Computer Science Bldg 2311
Speaker:
Andrew Ladd, Rice University
Title:
A Novel Approach to Planning for Physical Systems
Abstract:
Over the last decade, motion planning algorithms have been used to
solve complex geometric problems and have contributed to advances in
industrial automation, service robots and computer-assisted design of
mechanisms. However, some of the most exciting applications for motion
planning, such as surgical robots, humanoid robots and autonomous
exploration vehicles, are beyond the limits of current planning
techniques. The fundamental reason for this gap is that motion
planning algorithms typically do not explicitly consider the physics
of robot motion. In contrast, predictive models for mechanical
systems have become quite good. There are now many accurate and
efficient simulation software packages available and a
corresponding need for planning techniques capable of leveraging them.
This talk will discuss recent work that has led to a novel algorithm
that seamlessly combines geometry and physics. The geometry aspect of
the problem is addressed with a combination of sampling and
subdivision methods. The implementation uses a general purpose
physical simulator for Stewart-Trinkle rigid body dynamics to model
contact, friction and arbitrary kinematic constraints. The
effectiveness of the planner has been demonstrated in a variety of
studies including lifting a heavy weight with an articulated limb and
driving a realistic car. Some theoretical aspects of the planner will
be discussed as well as its implications for robotics, graphics,
artificial intelligence and, more generally, our capability to compute
in the physical world.
Speaker Bio:
Andrew Ladd is finishing his Ph.D. at Rice University. He is
interested in algorithmic robotics, which broadly studies how computers
reason about physical systems. His work combines theoretical
foundations with engineering practice.
|