A General Framework for
James Kuffner, Jr.
This is the control loop of the robot. Given a program of tasks, the robot utilizes the cycle of sensing, planning, and acting, in an effort to accomplish its goals. If the robot has been well-designed and programmed correctly, it will behave intelligently as it goes about performing its tasks.
We consider the problem of creating motion for an animated agent as equivalent to that of building and controlling a virtual robot. Instead of operating in the physical world, an animated agent operates in a virtual world, and employs a "virtual control loop" as follows:
The animated agent has a general set of control inputs which define its motion. These may be values for joint variables that are specified explicitly, or a set of forces and torques that are given as input to a physically-based simulation of the character's body. The animated agent also has a set of "virtual sensors" from which it obtains information about the virtual environment. Based on this information and the agent's current tasks and internal state, appropriate values for its control inputs are computed.
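The sense-plan-act cycle described above can be sketched in code. The following is a minimal illustration, not an implementation from the text: the class, its one-dimensional state, and the trivial planner are all invented for the example, and the "perfect" sensing is simply a direct read of the environment's state.

```python
# Minimal sketch of a "virtual control loop" (all names and the 1-D state
# are illustrative): each cycle the agent senses the virtual world, plans,
# and acts by writing new values to its control inputs.

class VirtualAgent:
    def __init__(self, goal):
        self.goal = goal          # desired 1-D position in the virtual world
        self.position = 0.0       # control input: a single body coordinate
        self.history = []

    def sense(self, world):
        # "Perfect" virtual sensing: read state directly from the environment.
        return world.get("obstacle")

    def plan(self, percept):
        # Trivial planner: step toward the goal, halting just before an obstacle.
        step = max(-1.0, min(1.0, self.goal - self.position))
        if percept is not None and abs(self.position + step - percept) < 0.5:
            step = 0.0
        return step

    def act(self, step):
        self.position += step
        self.history.append(self.position)

    def run(self, world, cycles):
        for _ in range(cycles):
            self.act(self.plan(self.sense(world)))


agent = VirtualAgent(goal=3.0)
agent.run({"obstacle": None}, cycles=5)
```

Note that because sensing is exact, the loop needs no filtering or error recovery, which is precisely the simplification the next paragraph discusses.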
The advantages that a virtual robot has over a physical robot are numerous. While a physical robot must contend with the problems of uncertainty and errors in sensing and control, a virtual robot enjoys "perfect" control and sensing. This means that it should be easier to design an animated agent that behaves intelligently, than a physical agent that does so. In fact, creating an intelligent autonomous animated agent can be viewed as a necessary step along the way towards creating an intelligent autonomous physical robot. If we cannot create a robot that behaves intelligently in simulation, how can we expect to create one in the real world? Thus, adopting a virtual robot approach implies that the scope of this research goes beyond just computer animation, but also extends into the realm of artificial intelligence research.
If every task were very narrowly and explicitly defined, one could imagine simply maintaining a vast library of pre-computed, captured, or hand-animated motions to accomplish every possible task. The reality is that, in general, tasks cannot be so narrowly defined, lest the set of possible tasks become infinite. Instead, tasks are specified at a high level, and apply to a general class of situations (e.g. "walk to the kitchen", "open the refrigerator", "take out the milk", "pour a glass", "sit down at the table", "take a drink", "wave hello to Mr. Smith", etc.).
The following sections describe each of the fundamental software components identified above, and briefly indicate how they might be utilized for synthesizing motion. We believe that no single component can provide a general solution to the motion synthesis problem, but rather each technique in combination with one or more of the others may provide a viable approach to generating animation for a given set of tasks.
Large libraries of clip motions can potentially become a powerful
resource for animation. Animating a character in a given situation
might ultimately involve selecting a pre-recorded motion from a vast
dictionary indexed by task or motion characteristics. For example,
there may be hundreds of walking motions stored, from among which a
character might select the one best suited to the situation at hand.
The primary drawback to clip motions is that any given data set can
usually only be used in a very specific set of situations. For
example, consider a captured motion of a human character opening a
refrigerator and taking out a carton of milk. The motion will only
appear perfect if the virtual model of the character, the
refrigerator, the carton, and their relative positions match the
actual objects used when the motion was captured. What happens if the
carton is placed on the lower shelf instead of the upper shelf? What
if the refrigerator model is made larger, or the location of the
handle on the refrigerator door changes? What happens if a bottle of
orange juice is placed in front of the milk carton? Techniques based on
motion warping, interpolation, or spacetime constraints attempt to adapt
clip motions to a broader class of situations while enforcing kinematic
and/or dynamic constraints. However, larger deviations often cause such
adapted motions to lose their visual realism, and hence their aesthetic
quality. Despite these drawbacks, clip motions will continue to play an
important role in real-time animation systems due to their efficiency
and visual realism.
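The idea of a motion dictionary indexed by task or motion characteristics can be sketched as a simple lookup. Everything here is invented for illustration (the tasks, the stored clip metadata, and the nearest-match selection rule); a real system would index far richer motion descriptors.

```python
# Hypothetical sketch of a clip-motion library indexed by (task, style);
# selection returns the stored clip whose speed best matches the request.
# All names and data are illustrative.

MOTION_LIBRARY = {
    ("walk", "slow"):  {"frames": 120, "speed": 0.6},
    ("walk", "brisk"): {"frames": 90,  "speed": 1.4},
    ("walk", "run"):   {"frames": 60,  "speed": 3.0},
}

def select_clip(task, desired_speed):
    # Choose the clip for `task` whose stored speed is closest to the request.
    candidates = {k: v for k, v in MOTION_LIBRARY.items() if k[0] == task}
    if not candidates:
        raise KeyError(f"no clips stored for task {task!r}")
    key = min(candidates, key=lambda k: abs(candidates[k]["speed"] - desired_speed))
    return key, candidates[key]

key, clip = select_clip("walk", desired_speed=1.2)
```

The nearest-match rule is exactly where the brittleness discussed above enters: when no stored clip is close to the requested situation, the best match may still look wrong.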
The primary challenge when using motion planning to generate motion is
to achieve visual realism. Aesthetics of motion are of little concern
for robots, but are vitally important for animated characters. The
computed motion must look natural and realistic. It may be possible
to encode aesthetics as search criteria to use during planning, or to
perform post-processing on the planned motion. For example, the
naturalness and realism of a planned motion could arise from an
underlying physically-based model that guides the search.
Alternatively, search criteria might be ultimately derived from clip
motion libraries that represent a particular "style" of motion. Many
possibilities exist, but clearly motion planning is not useful for
tasks where few obstacles to motion exist and/or aesthetics are
extremely important (e.g. facial animation).
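One of the possibilities mentioned above, encoding aesthetics as search criteria, can be sketched by scoring candidate motions with a cost that combines path length and a smoothness penalty. The weight and the toy 1-D trajectories below are invented for illustration; a real planner would evaluate such a cost over full-body configurations.

```python
# Illustrative sketch of aesthetics as a search criterion: candidate paths
# are scored by length plus a smoothness penalty, so a planner minimizing
# this cost prefers motions that are both short and visually fluid.
# The weight of 2.0 is an arbitrary choice for the example.

def path_cost(path, smoothness_weight=2.0):
    length = sum(abs(b - a) for a, b in zip(path, path[1:]))
    # Penalize abrupt changes in velocity between successive steps ("jerk").
    jerk = sum(abs((c - b) - (b - a)) for a, b, c in zip(path, path[1:], path[2:]))
    return length + smoothness_weight * jerk

smooth = [0.0, 1.0, 2.0, 3.0, 4.0]   # constant velocity
jagged = [0.0, 2.0, 1.0, 3.5, 4.0]   # same endpoints, abrupt changes

best = min([smooth, jagged], key=path_cost)
```

Both trajectories reach the same goal, but the planner's cost function selects the smooth one, which is the sense in which aesthetics can bias the search.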
Sensory information can be encoded at both a low level and a high
level and utilized by high-level decision-making processes of the
animated agent. Examples of sensory encodings include "all objects
that are currently visible", "all other characters that are currently
nearby", or "sounds that can be currently heard". Because animated
agents operate in virtual environments, they can avoid many of the
problems that physical agents (robots) have when dealing with sensory
information (e.g. noisy data, conflicting data, etc.). Thus, it should
be much easier to build an intelligent virtual robot than an
intelligent physical robot. In any case, incorporating
some kind of sensory feedback will be necessary to achieve believable
behavior.
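A sensory encoding such as "all other characters that are currently nearby" can be sketched as a direct query against the virtual environment, which is exactly the noise-free shortcut a physical robot does not have. The scene data and the sensing radius below are invented for the example.

```python
# Sketch of a "virtual sensor" with perfect information: the agent queries
# the environment directly for all characters within sensing range.
# The distance threshold and scene contents are illustrative.

import math

def nearby_characters(agent_pos, characters, radius=5.0):
    """Return the names of all characters within `radius` of the agent."""
    return sorted(
        name for name, pos in characters.items()
        if math.dist(agent_pos, pos) <= radius
    )

scene = {"Smith": (3.0, 4.0), "Jones": (10.0, 0.0), "dog": (0.0, 1.0)}
visible = nearby_characters((0.0, 0.0), scene)
```

Higher-level encodings ("all objects that are currently visible", "sounds that can be currently heard") would follow the same pattern, replacing the distance test with a visibility or audibility predicate.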
Animation generated using physically-based models (dynamic simulation)
has the advantage of exhibiting a very high level of realism.
However, since the underlying motion is dictated by physics, it is
difficult to control the simulation at a task-level. Spacetime
constraint optimization techniques can alleviate some of these
difficulties, but at a computational cost that is largely prohibitive
for real-time animation systems.
Physically-based techniques are very well-suited for generating
non-intentional (secondary) motions. Examples include the
animation of hair, clothing, wind, water, smoke, fire, or falling
objects. However, it is more difficult to apply such techniques to
the animation of intentional (primary) motions, using a
physically-based model of a character. Fundamentally, the key
difficulty lies in computing the required controls necessary to
achieve a particular task. However, this may be another area in which
libraries of captured motions might be useful. One can envision using
the "inverse dynamics" of a given physically-based model in order to
compute the set of controls necessary to achieve a particular captured
motion. Ultimately, libraries of "canned motions" may
eventually be replaced by libraries of "canned sets of
controls" that can be used in combination with a physically-based
model of a character.
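The inverse-dynamics idea above can be sketched for the simplest possible physically-based model, a point mass, where the control needed at each frame is just F = m·a, with acceleration estimated from the captured positions by finite differences. The mass, timestep, and trajectory below are invented for illustration; an articulated character would require a full-body inverse dynamics computation.

```python
# Sketch of recovering "canned controls" from a captured trajectory via
# inverse dynamics on a point mass: the required force at each interior
# frame is F = m * a, with acceleration estimated by central differences.
# Mass, timestep, and the captured data are illustrative.

def inverse_dynamics(positions, mass, dt):
    """Return the force needed at each interior frame to reproduce the motion."""
    forces = []
    for i in range(1, len(positions) - 1):
        accel = (positions[i + 1] - 2 * positions[i] + positions[i - 1]) / dt**2
        forces.append(mass * accel)
    return forces

# "Captured" 1-D trajectory sampled at 10 Hz: x(t) = t^2, i.e. a constant
# acceleration of 2 m/s^2, so every recovered force should be ~2 N for m = 1 kg.
captured = [0.0, 0.01, 0.04, 0.09, 0.16]
controls = inverse_dynamics(captured, mass=1.0, dt=0.1)
```

Replaying these forces through the same physical model reproduces the captured motion, which is the sense in which a library of clips could become a library of control sets.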
Clearly, as the computational resources available to desktop systems
grow, increasingly sophisticated physically-based models can be used
in a variety of ways to generate ever more realistic animations.
Motion Planning
Motion planning algorithms were initially developed in the context of
robotic systems. Such algorithms generate motion given a high-level
goal and a geometric description of the objects involved. In the
context of computer animation, motion planning can be used to compute
collision-free motions to accomplish high-level navigation or object
manipulation tasks. Motion planning is particularly suited to such
tasks, since there is a near-infinite number of possible goal
locations and obstacle arrangements in the environment. Flexible and
efficient algorithms can be designed to compute collision-free motions
towards a given goal location.
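A toy version of such an algorithm can be sketched as breadth-first search over a grid of free and blocked cells, which computes a collision-free path given only the goal and a description of the obstacles. The map below is invented for illustration; practical planners for high-dimensional configurations use more scalable methods such as randomized roadmaps.

```python
# Illustrative grid-based planner: breadth-first search over free cells
# computes a shortest collision-free path from start to goal.
# The world map is invented for the example.

from collections import deque

def plan_path(grid, start, goal):
    """Return a shortest collision-free path as a list of (row, col) cells."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None  # goal unreachable

# 0 = free space, 1 = obstacle.
world = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
route = plan_path(world, (0, 0), (0, 2))
```

The same query works unchanged for any start, goal, and obstacle arrangement, which is the flexibility the paragraph above calls for.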
Simulated Sensing
Creating an autonomous animated agent with believable behavior in an
interactive virtual environment will ultimately require some kind of
simulated sensing. This can include one or more of simulated visual,
aural, olfactory, or tactile sensing. The basic idea is to more
realistically model the flow of information from the virtual
environment to the character. The character should act and react
according to what it perceives.
Physically-Based Simulation
All motions in the physical world are driven by the laws of physics.
Motions in virtual worlds typically aspire to give the appearance that
they are also driven by the laws of physics. Graphical models
simulate the visual appearance of objects, while physical models
simulate their behavior in the physical world.
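The distinction between a graphical model and a physical model can be made concrete with the smallest possible simulation: explicit Euler integration of a dropped object under gravity, whose successive positions would then drive the graphical model each frame. The timestep and initial height are invented for the example.

```python
# Minimal sketch of a physical model behind a graphical one: explicit Euler
# integration of a falling object under gravity. Timestep and values are
# illustrative; real systems use more stable integrators.

GRAVITY = -9.8  # m/s^2

def simulate_fall(height, dt, steps):
    """Integrate position/velocity of a dropped object; stop at the ground."""
    y, v = height, 0.0
    trajectory = [y]
    for _ in range(steps):
        v += GRAVITY * dt
        y = max(0.0, y + v * dt)
        trajectory.append(y)
        if y == 0.0:
            break
    return trajectory

path = simulate_fall(height=10.0, dt=0.1, steps=100)
```

The motion needs no keyframes or control inputs at all, which is why such models excel at the non-intentional, secondary motions discussed earlier.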
© 1997–2009 James Kuffner, Jr.