"who am i?" mappings
25 november 2003
A key
competency that a sociable computer must have is attaining a deep
understanding of its user. Unfortunately, progress in the person
modeling literature in artificial intelligence is slow and often
short-sighted. Two prominent approaches are behavior modeling and
demographic profiling. In behavior modeling, a person is represented
as a history of behaviors, i.e. what actions they took in the context of some application
domain. For example, intelligent tutoring systems track a person's
test performance (Sison & Shimura, 1998), and collaborative
filtering systems track user purchasing and browsing habits and
compare them with those of like-minded people to make predictions
about the user's attitudes (Shardanand & Maes, 1995). The chief
drawbacks of behavior modeling are that 1) knowledge of user action
sequences is generally only meaningful in the context of a particular
application, and 2) a statistical behavior model of a person is
just a vector of numbers, completely uninterpretable by itself.
A second approach to person modeling is demographic profiling, in
which gathered demographic information about a user is used to draw
generalized conclusions about user preferences and behavior. A drawback
of this approach is that demographic profiling tends to overgeneralize
people by the categories they fit into, and often requires additional
user action such as filling out a user profile. This is not to say
that generalizing about persons is a bad approach, because when
we mentally model other people, we, like computers, also overgeneralize. The difference is that people have much richer knowledge-based
and experience-based vocabularies for generalization than a demographic
profiling system acting on a sparse rule set with rules like: "age
18-25 --> likes britney spears."
Meanwhile,
an important aspect of a person has been overlooked: personal
identity. A person's identity is strongly correlated with their
beliefs, goals, and desires, which in turn drive their behavior.
A person's behaviors are often the predictable product of his/her
identity. Whereas behavior modeling views a person in terms of a
history of behaviors, using this history to predict future behaviors,
it would seem much more productive to model the source of a person's
beliefs, goals, desires, and behaviors. Perhaps demographic
profiling hopes to do this, but its approach is weak. Demographic
categories do not have very much breadth or depth, and there is
no inherent empirical basis for any inferences made from these categories.
An emerging area of marketing research called psychographics (i.e. psychology + demographics) is taking a more promising direction
in trying to see how people are more realistically grouped in society
(e.g. "soccer moms" vs. "pickups and shotguns").
The intent of this research project is friendly to psychographics.
How
could we possibly hope to model something as grandiose as the totality
of a person's identity? The German sociologist Georg Simmel suggested
that identity is not monolithic, but rather fragmented, and dually
public and private in nature (1908). In addition, the public fragments
of a person's identity are determined by social roles. For example,
being a police officer confers something on my identity, as does
being a dog-owner or a student. As we begin to deconstruct identity,
we find that a compelling picture of a person can already be painted
by understanding the different social archetypes that a person can
be described by. There is also evidence from psychology about the
centrality of archetypes: as people begin to think
in terms of these linguistic signifiers, they naturally begin to
fashion themselves solely in terms of the signifiers available to
them (Lacan, 1986), making knowledge of these archetypes an even more
powerful predictor of people.
If
we can computationally characterize the beliefs, desires, and goals
of a large enough set of social archetypes (call this collection
of archetypes an identity map), and if we can partially
classify a person into a neighborhood in this identity map, then
we will find ourselves with a substantial model of a person's beliefs,
desires, goals, and behaviors. People certainly possess such models,
and employ this knowledge to characterize the identities of others
on the basis of social cues expressed and given off by these others
(Goffman, 1959).
This
is the rationale behind the proposed "Who Am I?" (WAI)
project. The goal is for WAI to be able to characterize a person's
identity by analyzing his/her personal text, perhaps something on
the order of a person's homepage, or better yet, their weblog. The
output will consist of a list of the social archetypes the person is
thought to belong to, along with associated confidence scores.
One
of the major challenges is how WAI can acquire a substantial and
meaningful collection of social archetypes. Online special interest
groups provide a potential source of data. The Dmoz Open Directory
Project has clusters of personal web pages dedicated to various
interests (e.g. croquet), subcultures (e.g. raver), and professions
(e.g. nurses). There are 1400 clusters in all, each with between
15 and 200 personal pages dedicated to the topic. Within each cluster,
an automatic computer reader skims each website in the cluster and extracts
simple beliefs, interests, disinterests, and goals. The extracted
information will likely take the form of (concept, affective valence) pairs,
e.g. ("Britney Spears", -100%), ("make money",
85%). These attitudes will be mined using a methodology similar to that of
Liu & Maes (in press). The beliefs, interests, disinterests,
and goals most common to each cluster are then extracted. These would
form the "essence" of a social archetype. Of course, we
don't propose that each of the 1400 clusters will form a very meaningful
archetype, because we want a less granular definition of archetype
than a single interest or subculture (although, in today's sick
sad materialistic culture, people often define themselves by laundry
lists of interests; see Friendster if you need convincing). We
can classify each of these 1400 archetypes into meta-archetypes
which can be more meaningful. For example, the kayaking, bungee
jumping, and snowboarding archetypes can form a "thrill-seeker"
meta-archetype. We can either manually classify archetypes into
meta-archetypes (1400 isn't so bad), or we can use machine learning
to learn these archetypes from online communities like weblog blogrings.
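The cluster-level extraction described above can be sketched roughly as
follows. This is only a minimal illustration under stated assumptions, not
the WAI implementation: it assumes the attitude miner has already produced,
for each personal page, a dictionary from concept to valence in [-1, 1].
The names archetype_essence and min_support, the averaging rule, and the
sample pages are all hypothetical. A concept is kept in the "essence" if
enough of the cluster's pages mention it, and its valence is the mean over
those pages.

```python
from collections import defaultdict
from statistics import mean

def archetype_essence(pages, min_support=0.5):
    """Return {concept: mean valence} for concepts common to a cluster.

    pages: list of dicts mapping concept -> valence in [-1.0, 1.0],
           one dict per personal page (assumed output of an attitude miner).
    min_support: hypothetical threshold; the fraction of pages that must
           mention a concept for it to count as common to the cluster.
    """
    valences = defaultdict(list)
    for page in pages:
        for concept, valence in page.items():
            valences[concept].append(valence)
    needed = min_support * len(pages)
    return {
        concept: mean(vals)
        for concept, vals in valences.items()
        if len(vals) >= needed
    }

# Made-up pages for a hypothetical "snowboarding" cluster.
snowboarding_pages = [
    {"fresh powder": 0.9, "make money": 0.2},
    {"fresh powder": 0.7, "office job": -0.6},
    {"fresh powder": 0.8, "office job": -0.4},
]
essence = archetype_essence(snowboarding_pages)
```

Here "make money" appears on only one of three pages, so it falls below
the assumed support threshold and is left out of the essence. Other
definitions of "most common" would work equally well in this sketch.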
Having
built a map of social archetypes, WAI automatically analyzes a person
based on some personal text, like a homepage or a Friendster profile.
WAI generates its hypotheses about a person's current identity composition,
based on his/her current beliefs, interests, disinterests, and goals.
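One way to picture this step is as a similarity search over the identity
map. The sketch below is an assumption-laden illustration, not the proposed
system: the proposal does not say how hypotheses are scored, so cosine
similarity over (concept, valence) vectors is swapped in here as a
stand-in for a confidence score. The function names, the threshold, and
the toy data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse {concept: valence} vectors."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def identity_composition(person, identity_map, threshold=0.1):
    """Rank archetypes by similarity to a person's attitudes.

    Returns (archetype, score) pairs, best first, keeping only scores
    above a hypothetical cutoff. The score is a raw similarity used
    as a stand-in for a confidence value.
    """
    scored = [
        (name, cosine(person, essence))
        for name, essence in identity_map.items()
    ]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Toy identity map and person, invented for illustration.
identity_map = {
    "snowboarder": {"fresh powder": 0.8, "office job": -0.5},
    "nurse": {"night shift": -0.3, "helping people": 0.9},
}
person = {"fresh powder": 0.6, "helping people": 0.2, "office job": -0.7}
profile = identity_composition(person, identity_map)
```

With this toy data the person lands closest to the "snowboarder"
archetype, with "nurse" a distant second, which matches the idea of a
partial classification into a neighborhood rather than a single label.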
"Who
Am I?" would have innumerably many interesting applications.
Just bubbling the computer's guess of a person's identity composition
directly into a visualization could provide useful and entertaining reflective
feedback. I also envision computer applications which can use a
person's identity profile to drive customized interactions. Talk
to runners using journey metaphors, like "on the road to success".
Offer a different interface to people who are "wild" versus
"reserved", or "liberal" versus "conservative",
or "sporty" versus "bookwormish".
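A customized interaction of this kind could be as simple as a lookup from
archetype to phrasing. The sketch below assumes an identity profile in the
form of (archetype, confidence) pairs. Only the runner phrase comes from
the text above; the other message, the function name, and the confidence
cutoff are invented for illustration.

```python
# Hypothetical phrasings keyed by archetype; only the runner example
# comes from the text, the rest are invented for illustration.
ENCOURAGEMENT = {
    "runner": "You are on the road to success.",
    "thrill-seeker": "Ready to take the plunge?",
}
DEFAULT_MESSAGE = "You are making good progress."

def encouragement_for(profile, min_confidence=0.5):
    """Pick a message matching the most confident known archetype.

    profile: list of (archetype, confidence) pairs, in the form WAI is
             proposed to output; min_confidence is an assumed cutoff.
    """
    ranked = sorted(profile, key=lambda pair: pair[1], reverse=True)
    for archetype, confidence in ranked:
        if confidence >= min_confidence and archetype in ENCOURAGEMENT:
            return ENCOURAGEMENT[archetype]
    return DEFAULT_MESSAGE

message = encouragement_for([("dog-owner", 0.9), ("runner", 0.7)])
```

The fallback message matters here: when no archetype is both confident
and known to the application, the interaction stays neutral rather than
guessing.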
I propose
to implement a collection of computational models of social archetypes,
as described above, and an application for visualizing a person's
perceived identity-composition given a textual input like a homepage.
Time permitting, I will prototype an application to use WAI that
can tailor an interaction with a user based on knowledge about the
person's social identity composition.
Bibliography
Goffman,
Erving (1959). The Presentation of Self in Everyday Life: Introduction.
Lacan, Jacques.
Seminar XI: Four Fundamental Concepts of Psychoanalysis. Trans.
Alan Sheridan. Penguin, 1986.
Liu, Hugo and
Maes, Pattie (in press). What Would They Think? A Computational
Model of Attitudes. To appear in Proceedings of the 2004 International
Conference on Intelligent User Interfaces, IUI 2004, January 13–16,
2004, Funchal, Madeira, Portugal. ACM 2004, ISBN 1-58113-815-6.
Shardanand,
U. and Maes, P. (1995). Social information filtering: Algorithms
for automating "word of mouth", Proceedings of CHI'95,
210-217.
Simmel, Georg
(1908). How is Society Possible?
Sison, R. and
Shimura, M. (1998). Student modeling and machine learning. International
Journal of Artificial Intelligence in Education, 9:128-158.