Wednesday, December 31, 2008

Brainpower Labs LLC

I decided to go ahead and start a company. I had planned to do this eventually, after finishing my PhD, but it seems like a good idea to get the ball rolling now. The plan is to commercialize my AI research.

Back in 2004 and 2005 I just assumed that my research would always be open source. I released my master's thesis implementation as the open source project Verve. I posted on this blog in detail about every idea and simulation result. Monetizing this work didn't appeal to me; I was just happy to be able to work on something as grandiose as general AI.

I slowly came to realize that after graduation, I won't be able to sustain my progress without getting a day job. I really want to continue this research full-time, and I really do not want to divide my attention in order to make an income. So what are my options?

I worked on the Netflix Prize for a while (and still do periodically)... a quick $1,000,000 would be a great source of initial funding. (My best result so far is 0.5% better than Netflix's own algorithm, but I need to reach 10% for the win.) Going for such a big prize is a lot of fun, but I'm not betting on it. I still need something a little more predictable.

When the iPhone SDK was announced, I didn't really consider iPhone app development. It seemed like a lot of work and a big distraction from research. But when the iPhone App Store was launched, I started watching the list of top paid apps. Many of them are very simple but seem to sell pretty well. I started thinking more seriously about starting an LLC and building a few simple apps. It would be a good way to get my feet wet in the free market (learning what people want, as Paul Graham would say), plus I would learn the iPhone SDK and be able to use it for other projects. That's not to mention the psychological benefits of making things more concrete... feeling part of the real world and less like a graduate student.

So in October I decided to pull the trigger and formed Brainpower Labs LLC. Our first product, iBonsai, is definitely not AI-related, but it has been a good first app for me to learn the iPhone SDK, OpenGL ES, and the iPhone's performance limitations. (The thing is very impressive for a handheld device, by the way.) It's a little distracting right now switching between AI research and iPhone app development, but I'm hoping the two efforts converge at some point. I'm thinking our next app will be based around some simple AI techniques.

I'm really glad to have started the company now rather than after graduation. Besides having a new source of income (fingers crossed) to ease the transition out of grad school, I now have an immediate outlet for turning research ideas into real applications. As my core intelligence architecture progresses (which I've dubbed the Sapience Engine), I'll be able to use it to produce increasingly interesting applications, which could be iPhone apps, desktop computer software, console video games, or robotics applications.

Monday, December 29, 2008

iBonsai Version 1.0 for iPhone and iPod touch

UPDATE (1/5/09): iBonsai was featured on gizmodo.com and iphoneappreviews.net.

I just finished iBonsai, a new app for the iPhone and iPod touch, which is now available on the App Store.


For more info about this app and my long-term business plans, visit my new company website.

iBonsai Description
Bonsai is the Japanese art of miniaturizing trees by growing them in small pots. Now you can create your own 3D miniature trees right on your iPhone or iPod!

With a tap of your finger, iBonsai's sophisticated generative algorithm begins growing a unique digital tree. No two bonsai trees are the same! After about 30 seconds of growth, your mature bonsai becomes a beautifully rendered image in the sumi-e style of Japanese brush painting.

Enjoy the zen-like relaxing nature of this ancient art form.

FEATURES:
- Simple, clean interface.
- Interactive 3D view. Rotate/zoom to see your trees from all angles.
- Many different leaf types: Japanese maple, flowering dogwood, and more (even a rare money tree...).
- Shake your iPhone/iPod to scatter leaves!
- Save images of your favorite trees, then use them for your background.
- Optional gravity-based viewing mode makes the tree appear to float in space.
- Advanced generative algorithm and random number generator give you totally unique results every time. Produce virtually infinite trees!

Tuesday, November 25, 2008

Progression of Intelligent Processes

The purpose of this article is to discuss the possible roots of intelligence in the universe. As with most fuzzy concepts like "intelligence," we must begin by producing several basic formal definitions which we can then use as tools to build more powerful concepts. We attempt to use these definitions to classify intelligent processes and their advancements over time. The result of this thought process is the idea that an intelligent process tends to produce other intelligent processes with goals that also fulfill the goals of the original intelligent process.

Definitions

System: any collection of interacting components.

State: a complete description of all components of a system.

Process: a sequence of changes to the state of a system.

What's the difference between a "random" and "non-random" process? It depends on the level of detail of the analysis. If we analyze a complex process operating on a large system with a crude level of detail, representing it with too few variables, then we lose the ability to model all predictable effects. Thus, we must quantify the results with a certain degree of uncertainty/randomness simply because our model lacks enough detail. However, given infinite resources used to analyze processes, any process becomes non-random.

Stability: a property of the state of a system, proportional to its ability to remain relatively unchanged over time when perturbed only by a completely random process.

A process can change a system from a stable state to an unstable one, or vice versa. It can move a system through a trajectory of stable states. It can even, in a way, give birth to other processes: it can lead a system to a state where other "child" processes are able to act on it as well.

Goal: a target state of a system.

Intelligence: a property of a process, proportional to its ability to move a system closer to a pre-defined goal state.

Goals are defined by outside observers. For any particular analysis of any particular process, we can define whatever goals we want and determine the degree of intelligence present. Thus, intelligence is in a sense arbitrarily defined, but it can be a useful measurement as long as the goals are well-specified. We can say that a process has zero intelligence if: 1) it has no goals defined, or 2) it is completely unable to move a system towards a defined goal.

Now let's take a step back and think about our universe as a whole. The universe is defined by its initial conditions and its fundamental forces: the initial conditions specify the initial state of all systems, and the fundamental forces constrain the possible processes that can act on those systems. Interestingly, there seems to be a universal process which tends to favor systems in stable states. Imagine the state of any system as a point on a surface (or landscape), where the lowest points are more stable than the peaks. This universal process forces all systems down the slopes of their stability surfaces towards the (locally) minimum points. If the minimum is bowl-shaped, the system will stop changing when it reaches that minimum. (The minimum might be a valley, though, so there can be room for the system to wander around its state space within the valley and remain stable. This is a key requirement for a process to birth another process: when the parent process succeeds in reaching the minimum, the child process can begin to explore the valley.) Systems might go through unstable states temporarily, but they will tend towards the most stable. So what is this universal process which favors stability?
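
As a rough illustration of the landscape picture, the sketch below lets a one-dimensional "system" slide downhill on a made-up stability surface. The surface f(x) = (x^2 - 1)^2, the step size, and the function names are all assumptions chosen for the example; nothing here is meant as a physical model.

```python
def descend(state, slope, step=0.1, iterations=200):
    """Repeatedly move `state` a small step down the local slope."""
    for _ in range(iterations):
        state -= step * slope(state)
    return state


def slope(x):
    """Derivative of the hypothetical surface f(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)


# Two systems that start on opposite sides of the central peak at x = 0.
left = descend(-0.5, slope)
right = descend(0.5, slope)
print(left, right)
```

The two starting points settle into different local minima (near -1 and near +1), which matches the point that systems move toward the locally most stable state rather than a single global one.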

Evolution: a process which tends to change the state of a system to increase its stability, i.e. an intelligent process whose pre-defined goal state is the one with maximum stability.

The universal process described above seems to select the most stable states, allowing them to last, while discarding the unstable ones. We can view it as a type of evolution, "physical evolution," the first intelligent process within the universe.

Brain: a system which supports a specific variety of intelligent process. This process represents a transformation from input space (sensory) to output space (motor). Brains are produced by some other intelligent process which specifies their goal states.

Note that evolutionary and brain-based intelligence are processes, and remember that processes can give birth to other processes. In general, processes with a high degree of intelligence tend to lead to the existence of other intelligent processes. A single parent intelligence can even produce an entire family tree of intelligent processes. Furthermore, the child processes tend to be given goal states from a subset of the parent's goal states. A child process whose goals violate those of the parent will not last.

Progression

The following list describes the progression of intelligent systems in the universe. Each stage represents the result of some combination of events which produces a kind of intelligence phase change. There can be multiple intelligent processes at work simultaneously (e.g., the original physical evolution, its child "genetic evolution," etc.). We intermix the more significant advances of each process as they appear. The list is, of course, not exhaustive; it represents a very small sample of all the complex processes in existence. Since we must choose a small subset, we choose to focus on those events that are most interesting to humans, i.e. those involved in the generation of our own existence.

Note that the goals of each child intelligent process cannot oppose the goals of the parent process without being destroyed. Also note that a key ingredient to many forms of intelligence, including evolutionary processes and more advanced brain-based intelligence, is the random exposure to new situations which enables trial-and-error-based decision making.

Physical Evolution Level 0: Initial state of the universe
The "null" stage. The universe is in its initial state, determined by some unknown process, waiting for the fundamental forces to begin acting.

Physical Evolution Level 1: Clusters
As soon as the fundamental universal forces start acting, physical evolution appears. Gravity produces clusters of particles from the initial state of matter. Larger clusters have more pulling force than smaller ones; the large get larger, and the small get sucked into the larger ones. Eventually physical evolution produces its first result in achieving its goal of stability: the universe becomes a stable collection of clusters of matter separated by empty space.

Physical Evolution Level 2: Stable molecules
Once the cosmic-scale events have settled down, interesting things begin happening at the microscopic level. Atoms are constantly colliding, "trying out" new ideas for molecules. The molecules that last longer are more stable; if they stay around just a little bit longer than others, they will become more common. So we begin to see physical evolution performing a type of selection process on molecules. Those that are most stable proliferate, and the unstable ones disappear. Each stable molecule flourishes in a certain habitat. There can be multiple stable "species" of molecules that coexist, possibly with symbiotic relationships.

Physical Evolution Level 3: Stable structures
Now that there are stable molecules available, physical evolution can operate on combinations of molecules. Random collisions of molecules produce all kinds of physical structures, some stable, and some not. Physical evolution again selects the more stable structures to proliferate, resulting in a new kind of battle for survival.

Physical Evolution Level 4: The cell wall
A molecular structure is produced that acts as a shield against bombardment: the lipid bilayer. Any collection of molecules with one of these protective shells (i.e. a crude "cell wall") gains a massive advantage over others in terms of mean lifespan. Their existence is still relatively short but much longer than before. In certain hospitable environments, these stable "cells" become common.

Physical Evolution Level 5: Birth of genetic evolution
With the protective cell wall in place, physical evolution can begin to experiment with various modifications to the internal cell structures. As before, any changes that increase the lifespan of the cell produce more long-term stability of that design. The game-changing event at this stage is the appearance of intra-cellular structures which support information-based representations of physical structures (primitive DNA). Physical evolution has given birth to a new intelligent process: genetic evolution.

Genetic Evolution Level 0: Gene expression
The presence of some molecule results in the production of some corresponding physical structure. This is the essence of gene expression. Any cell containing molecule X produces structure Y. Such a procedure is enabled by a combination of structures that acts as a gene expression machine. The details of the actual transformation (from X to Y) are unimportant as long as X reliably produces Y. Cell stability is still fundamentally tied to its physical properties, but with this gene expression machine in place, it is now indirectly tied to the presence of certain genetic molecules. Long-term survival under these new rules depends on having the right combination of these "genes." If having gene X implies having structure Y, and structure Y is related to a stronger cell wall, better repair mechanism, faster acquisition of resources, etc., then cells with gene X will become more common. Essentially, this new intelligent process operates in gene space, which is just a proxy for physical space. Genetic mutations, random events that modify a cell's genetic material, are an essential part of exploring this new gene space. The goals of the new genetic evolution intelligence (proliferation of stable genes) still fall within the constraints of its parent's goals (proliferation of stable structures).

Genetic Evolution Level 1: Cell replication
When cells develop the machinery to copy their own genetic information, they become able to copy their physical structures as well. This produces an explosion in the speed at which genetic evolution can operate. The "fitness" (relative long-term stability) of a given genetic solution is multiplied by the number of instances of that solution: a large population of short-lived physical entities is now more stable than a single long-lived non-replicating entity. The feedback loop (successful genes produce more numerous, stable physical populations, which generate more copies of those genes, and so on) means that systems with replication quickly outnumber non-replicating systems.

Genetic Evolution Level 2: Motility
Moving around increases the probability of acquiring resources, resulting in an increased ability to build and repair structural elements. Motility is possible even without feedback-based control: simply moving around quickly without any particular target is much better than sitting still. One possible form of early motility is enabled by a basic type of short-term memory: the charging/discharging cycles of cell depolarization, tied to some physical deformation of the cell, produce a variety of repetitive motions, some of which tend to move the cell around in its environment.

Genetic Evolution Level 3: Brains
Targeted acquisition of resources based on sensation is a more advanced form of motility which usurps simple blind movement. When external events are allowed to affect cell depolarization (which drives motility), a feedback loop is present in the system. This presents a new domain for genetic evolution: the control system between sensation and action - the brain. Brains are defined by genes, just like all other structures in the system, so genetic changes can act upon the parameters of these controllers. Genetic changes are favored that improve the control system in a way that results in more copies of those genes. We consider the transformation process (inputs to outputs) performed by the brain as a child intelligent process of its parent, genetic evolution. The initial brain's implicit goals involve acquiring energy and materials, but can potentially involve anything needed by genetic evolution. Any changes in the brain's parameters are constrained to help achieve the goals of genetic evolution.

Brain Level 0: Simple feedback control
The simplest brains are basic feedback control mechanisms (e.g., P/PD/PID controllers) which transform some sensory input signal into an output control signal. Initial possible "behaviors" for entities evolved in different environments include chemotaxis (chemical gradients), thermotaxis (temperature gradients), phototaxis (light gradients), etc. These feedback-based behaviors provide many more resources for the entity, increasing its long-term stability over those without such skills.
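
A minimal sketch of what such a controller could look like, assuming a one-dimensional world and a purely proportional (P) rule: the motor output is just a gain times the sensed chemical gradient. The concentration field, the gain `kp`, and all function names are hypothetical, picked only to make the example run.

```python
def concentration(x, source=5.0):
    """Hypothetical chemical field that peaks at `source`."""
    return -(x - source) ** 2


def sense_gradient(x, delta=1e-3):
    """Finite-difference estimate of the local concentration gradient."""
    return (concentration(x + delta) - concentration(x - delta)) / (2 * delta)


def chemotaxis(x, kp=0.1, steps=100):
    """Proportional control: each step moves by kp * sensed gradient."""
    for _ in range(steps):
        x += kp * sense_gradient(x)
    return x


final_position = chemotaxis(0.0)
print(final_position)  # should end up close to the source at x = 5
```

There is no memory and no explicit goal here; the "goal" of reaching the source is implicit in the wiring from sensor to motor, which is the sense in which this is the simplest kind of brain.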

Genetic Evolution Level 4: Sexual reproduction
Targeted motility enables sexual reproduction. The success of a gene may depend upon the presence of other genes. In general, more complex structures must be represented by larger chunks of genetic material. The evolution of an entity's genetic material is a slow process when based on asexual reproduction and mutations alone; furthermore, the emergence of complex structures from mutations of an individual genome is fairly improbable. However, the emergence of genetic crossover, the ability of two physical entities to exchange chunks of genetic information, dramatically increases the probability of producing more complex structures. This procedure represents a wormhole through gene space through which information can warp. The result is that genetic material present in two separate entities, which might produce simple structures in isolation, can combine synergistically to produce much more complex structures in the offspring entities. Genes are now favored that optimize the control system towards more sexual reproduction, e.g., producing an implicit goal in the physical entities of maximizing intercourse.
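
To make the "exchange chunks of genetic information" idea concrete, here is a small sketch using single-point crossover, a standard operator from genetic algorithms. It is a simplification I am assuming for illustration (real recombination is far messier), and the genomes are just lists of placeholder letters.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap the tails of two equal-length genomes."""
    point = rng.randrange(1, len(parent_a))  # cut somewhere strictly inside
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


rng = random.Random(0)
parent_a = list("AAAAAAAA")
parent_b = list("BBBBBBBB")
child_1, child_2 = crossover(parent_a, parent_b, rng)
print("".join(child_1), "".join(child_2))
```

Each child carries a contiguous chunk from both parents, so a useful chunk that arose in one lineage can land next to a useful chunk from another in a single step, rather than waiting for both to arise by mutation in the same genome.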

Genetic Evolution Level 5: Multi-celled entities
A collection of cells that functions together in close proximity provides a benefit to each cell involved. By "cooperating" in a sense, many cells can share the overhead costs of staying alive, amounting to a type of microscopic trade agreement. While sharing the burdens of life with nearby cells, they can start to specialize into different roles, increasing the variety of macro-scale multi-celled structures.

Brain Level 1: Discrete switching control
With the appearance of multi-celled entities, the brain can now consist of a network of specialized neural cells which communicate via synaptic transmission. Each neural cell represents a nonlinear transformation of inputs to outputs, and the collective activity of the neural network can be viewed as a dynamical system. Such dynamical systems can have many stable states. Thus, instead of using a single feedback-based controller, the brain has now evolved multiple discrete control systems ("reflexes"). Each one is used in a different situation (e.g., feeding, swimming, mating, fleeing). The "decision" of when to switch is still solely influenced by the current (or very recent) sensory information; when the entity is exposed to a certain situation (i.e. a certain pattern of sensory inputs), its brain switches to a different stable feedback loop (attractor state in the dynamical system).

Brain Level 2: Language
Language emerges when the brain evolves distinct behaviors based on sensory input patterns caused by other entities. The basic ability to communicate information among individuals has the potential to augment the individual's representation of the state of the world with key information that can improve decision making (e.g., the task of information acquisition can be shared among many individuals). This is a necessary step towards the beginning of cross-generational information transfer, or "culture."

Brain Level 3: Structural learning
Previously, learning was possible based only on very short-term effects (e.g., cell membrane voltage). Now, brains are able to store information indefinitely by making structural changes to themselves based on experiences. This provides the brain with the ability to make decisions (about which action to perform next) based on current sensory information AND past experiences. Individuals can learn from mistakes. This takes some of the burden off genetic evolution; instead of evolving entities whose genes are tuned for very specific environments, it can instead evolve entities whose brains have a certain degree of adaptability, making them successful in a wider variety of environments.

Brain Level 4: Explicit goal representation
Previously, the brain's goals were implicitly determined by genetic evolution (produce behaviors that help proliferate genes). Brains that did not meet these goals were wiped out. Now it is possible to represent goals explicitly in the brain via reward signals (e.g., dopamine). This new brain sub-component is a mapping from certain sensory input patterns ("reward states") to a scalar reward signal. When this reward value is high, the entity's current actions are reinforced. So any situation that increases the brain's reward level will reinforce the recent actions that led to it. This operates on the existing switching control system by adjusting the probabilities of switching to each individual action in any given situation. Interestingly, the specification of which sensory situations produce reward signals is arbitrary and can be defined genetically.
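
The sketch below is one hedged way to picture this: a single situation with two reflexes, where a scalar reward nudges the probability of switching to whichever action was just taken. The softmax over "preferences", the learning rate, and the reward mapping (only "feed" is rewarded) are assumptions made for the example, not a claim about how dopamine actually works.

```python
import math
import random


def switching_probabilities(preferences):
    """Softmax over action preferences for one situation."""
    peak = max(preferences.values())
    weights = {a: math.exp(p - peak) for a, p in preferences.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def reward_signal(action):
    """Hypothetical genetically fixed mapping: only feeding is rewarded."""
    return 1.0 if action == "feed" else 0.0


rng = random.Random(0)
preferences = {"feed": 0.0, "flee": 0.0}  # one situation, two reflexes
learning_rate = 0.5

for _ in range(200):
    probs = switching_probabilities(preferences)
    actions = list(probs)
    action = rng.choices(actions, weights=[probs[a] for a in actions])[0]
    # A high reward reinforces the action that was just taken.
    preferences[action] += learning_rate * reward_signal(action)

probs = switching_probabilities(preferences)
print(probs)
```

Changing which action `reward_signal` favors changes the learned behavior without touching the rest of the loop, which is the point about the reward mapping being arbitrary.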

Genetic Evolution Level 6: Genetic determination of brain goal states
Now, instead of having to make complex, global changes to the brain in order to add new goals, genetic evolution can just modify the mapping from sensory state to reward. The brain's explicit goal representation, which is genetically defined, provides a simple unified architecture for adding, deleting, and adjusting its goals.

Brain Level 5: Value learning
Adjusting the action switching system based solely on the current reward signal is not always ideal; sometimes it is important to take a series of actions that produce no reward in order to achieve a larger final reward. It is possible to circumvent this issue by learning an internal representation of the "value" of various situations. This "value function" (in mammals, the striatum/basal ganglia) is a mapping in the brain from the current sensory experience to an estimate of future reward. Now the action switching system can operate on the long-term expectation of rewards, not just immediate rewards, resulting in individuals which are able to achieve their goals much more effectively.
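
One common way to learn such a value function is temporal-difference (TD) learning. I am using it here as a stand-in; the text above does not commit to a particular algorithm. The sketch assumes a tiny chain of four situations where only the last step is rewarded, and the parameter names (`alpha`, `gamma`) follow the usual reinforcement-learning conventions.

```python
def learn_values(num_states=4, episodes=500, alpha=0.1, gamma=0.9):
    """TD(0) on a chain: state s always leads to s + 1; only the final step pays."""
    values = [0.0] * (num_states + 1)  # last entry is the terminal state (value 0)
    for _ in range(episodes):
        for state in range(num_states):
            next_state = state + 1
            reward = 1.0 if next_state == num_states else 0.0
            # Error between the current estimate and reward plus discounted next value.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
    return values[:num_states]


values = learn_values()
print(values)
```

Early states receive no reward at all, yet they end up with nonzero value because the estimate of future reward propagates backward along the chain. That is what lets the action switching system prefer a rewardless step that leads somewhere good.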

Brain Level 6: Motor automation
Repetitive motions are often represented as reflexes, but often it is important for an individual to learn novel motions during the course of a lifetime. A special brain structure (in mammals, the cerebellum) enables repetitive motions to be automated. Although the combination of value learning and action switching is a very flexible system, it can be wasteful for these well-learned motion sequences. This new brain structure provides a general-purpose action sequence memory that can offload these sequences, performing them at the appropriate time while allowing the action selection mechanism to focus on novel decisions.

Brain Level 7: Simple context representation
Any brain component that depends upon the state of the environment will perform better given an improved internal representation of the environment. Since the state of the external world cannot be accessed directly by these components, they are only as good as the brain's "mental model" of the world. So a specialized brain structure (in mammals, the archicortex/hippocampus) that can classify the state of the world into one of several distinct categories will help the brain represent value and choose actions more effectively.

Brain Level 8: Advanced context representation
Any further improvement of the "mental model" of the world is exceedingly valuable to decision making entities. An outgrowth of the initial simple pattern classifier appears (in mammals, the 6-layered cerebral cortex). This enhanced version extracts information content, computes degrees of belief about the world, and presents a summary in a simplified (linearly separable) form for use by other brain components like the value learning and action switching system.

Brain Level 9: Information-based rewards
A new explicit goal in the brain appears as a reward signal based on the information content provided by the entity's sensory inputs. The entity is thus intrinsically motivated to explore unexplored places and ideas. Before this new motivation, behaviors were mainly focused on survival and reproduction with little need for acquisition of new information. Now there is a strong drive to explore the world, simultaneously training and improving the brain's mental model. (An advanced context representation is useless without the motivation to fill it with information.)

Brain Level 10: General purpose working memory
All kinds of decisions can be improved by considering various action sequences before physically executing them. This requires the ability to simulate various states of the world within the internal representation, including sequences of actions and their expected consequences and rewards. This type of simulation is accomplished in a special short-term memory array (in mammals, the prefrontal cortex) that can be written to and read from. The read/write operations are extensions of the old action selection system: now, instead of being limited only to physical actions, the brain has acquired the mental actions "read from memory cell" and "write to memory cell."

Technological Evolution Level 0: First appearance
With the advent of working memory, the current combination of brain structures allows the construction and execution of extremely complex action sequences. This includes several unique new abilities. It is possible for individuals to make physical artifacts ("tools") from materials in the environment, enhancing the effectiveness of the body in various ways: extended reach, impact force, and leverage. Simultaneously, it provides an extension to the brain itself: the "tool-enhanced brain" has an extended long-term memory because it can record information in the environment rather than relying on brain-based memory. This greatly enhances the accuracy of cross-generational information transfer, which was first enabled by the onset of language. The accumulation of knowledge concerning advanced tool production results in a new intelligent process: technological evolution. The goal of this new evolutionary process is to produce artifacts that are the most stable in the space of the parent process's goals (i.e. the goals of the human brain); in other words, tools that provide the most benefit to humans.

Technological Evolution Level 1: Computing machines
The continual generation of new knowledge (driven by information-based rewards, collected/recorded/analyzed/organized with physical tools, and shared across multiple generations) enables the creation of increasingly complex physical artifacts. These artifacts are increasingly helpful to humans in achieving their goals (eating, socializing, reproducing, acquiring information, etc.), which support the goals of genetic evolution (proliferation of the most stable genes), which is confined by the simple goal of universal stability-based evolution. The evolution of technology operates at a scale much faster than genetic evolution, so it produces the equivalent of the next addition to the brain before genetic evolution has a chance. This product, the computing machine, is an extension to the most advanced area of the human brain, the prefrontal cortex. It allows the execution of arbitrary algorithms and simulations much more quickly than the prefrontal cortex itself, enabling humans to solve all kinds of symbolic problems more quickly and effectively.

Technological Evolution Level 2: Intelligent computing machines
Technological evolution continues to produce increasingly useful artifacts until a milestone is reached: an artifact with the same degree of intelligence and autonomy as the human, i.e. a human-level artificial intelligence. This artifact boosts the ability of humans to achieve their goals in an exponential way: machines continually design and build the next generation of machines, each better/faster/cheaper than the last. The artifact itself represents the next child intelligent process with goals defined by the parent intelligence (technological evolution), which could include anything that helps (or at least does not harm) its parent process in achieving its goals.

What's Next?
I won't try to speculate here about specifics, but it is expected that, barring some major catastrophe, the same overall process continues: intelligent processes tend to produce other intelligent processes which help achieve the goals of the parent process (or at least don't contradict them).

Lineage

The following is an abridged lineage of the intelligent processes described here, starting from the oldest ancestor:

1. Physical Evolution (goals: physical stability)
2. Genetic Evolution (goals: proliferation of genes)
3. Brains (goals: eat/sleep/avoid pain/socialize/reproduce/acquire information/etc.)
4. Technological Evolution (goals: help humans achieve their goals)
5. Intelligent Computing Machines (goals: arbitrarily defined by tech evolution)

(Not listed here are all kinds of cultural evolution, including language, music, the free market, etc. Each of these represents a separate branch of the intelligence tree, which, like the others, must not violate the goals of the parent intelligent process.)

Tuesday, October 21, 2008

Learning to Represent Patterns

For the past several months I have been revisiting the issue of sensory and motor representation. I had implemented some initial ideas at the end of 2006, but I hadn't taken the time to study things in depth. My goal here is to represent real-valued sensory and motor spaces as efficiently as possible with limited resources. For example, say we're talking about visual sensory data (i.e. pixel arrays) involving 100 pixels (10x10 image), and we only have the resources to represent the 28 most common visual patterns. If we want to represent that visual space efficiently, we have to move our 28 basis vectors around the 100-dimensional space so that the resulting vectors represent the 28 most common visual patterns. This all must be learned online (in real time) as the system is experiencing visual data. Then, after learning, each incoming image will be classified as one of those 28 "categories."
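
To make the setup concrete, here is a scaled-down sketch: 2 basis vectors in a 2-dimensional space instead of 28 in a 100-dimensional one, learned online with a plain winner-take-all rule (move the nearest basis vector a small step toward each sample). This is standard competitive learning, used as an assumed placeholder; it is not the algorithm I ended up with, and the cluster centers and learning rate are made up.

```python
import random


def nearest(basis, sample):
    """Index of the basis vector closest to `sample` (squared Euclidean distance)."""
    distances = [sum((b - s) ** 2 for b, s in zip(vec, sample)) for vec in basis]
    return distances.index(min(distances))


def update(basis, sample, learning_rate=0.05):
    """Move only the winning basis vector a small step toward the sample."""
    winner = nearest(basis, sample)
    basis[winner] = [b + learning_rate * (s - b) for b, s in zip(basis[winner], sample)]
    return winner


rng = random.Random(0)
centers = [(0.0, 0.0), (1.0, 1.0)]  # hypothetical "most common patterns"
basis = [[0.4, 0.4], [0.6, 0.6]]    # initial guesses, deliberately off-target

for _ in range(2000):               # samples arrive one at a time (online)
    cx, cy = rng.choice(centers)
    sample = (rng.gauss(cx, 0.1), rng.gauss(cy, 0.1))
    update(basis, sample)

category = nearest(basis, (0.9, 1.1))  # classify a new input after learning
print(basis, category)
```

After learning, classification is just a nearest-basis-vector lookup, which is the "one of those categories" step described above.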

One approach is based on the standard statistical approach of maximum likelihood learning. We assume the basis vectors are the center points of Gaussian kernels, each with a corresponding variance. For each data sample, we compute the likelihood of seeing that sample. (Given our current model, i.e. the 28 Gaussian centers and variances, what's the likelihood that the data sample was "generated" by our model?) Maximum likelihood learning attempts to adjust the Gaussian kernel positions and sizes within the data space to maximize this likelihood value over all data samples. The end result should be the model that best represents the actual data distribution.
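
As a small worked example of the quantity being maximized, the sketch below computes the log-likelihood of a handful of 1-D samples under two candidate models. It assumes an equally weighted mixture of Gaussians (the weighting is my assumption; the text only specifies centers and variances), and the data and parameters are invented for illustration.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


def log_likelihood(samples, means, variances):
    """Total log-likelihood under an equally weighted mixture of 1-D Gaussians."""
    total = 0.0
    for x in samples:
        density = sum(gaussian_pdf(x, m, v) for m, v in zip(means, variances)) / len(means)
        total += math.log(density)
    return total


samples = [-1.1, -0.9, -1.0, 0.9, 1.0, 1.1]
good_fit = log_likelihood(samples, means=[-1.0, 1.0], variances=[0.01, 0.01])
poor_fit = log_likelihood(samples, means=[-0.2, 0.2], variances=[0.01, 0.01])
print(good_fit, poor_fit)
```

The model whose kernels sit on the two data clusters scores much higher than the one whose kernels sit between them; maximum likelihood learning amounts to searching for the centers and variances that push this number up.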

Another approach is based on information theory, specifically the mutual information between the "input" and "output" variables. (This idea is usually attributed to Ralph Linsker at IBM Research, who called it "infomax.") I like to think of it this way: the data samples are coming from a real-valued input variable, V. We want to classify those samples into a discrete number of classes which represent the discrete class variable C. Each Gaussian kernel represents one class in C. Now, for each sample v, we want to transmit the maximum amount of information to the output class variable. We can do this by maximizing the mutual info between V and C, given the constraint of limited resources (i.e. a finite number of Gaussian kernels).

How do we do this? There are several ways. The simplest way to get started is to do gradient ascent on the mutual info. (Take the derivative of the expression for mutual info between V and C with respect to the Gaussian kernels' parameters, then continually adjust those parameters to increase the mutual info.) However, this direct gradient-based approach is hard to derive for mutual info because it depends on terms that are difficult to estimate; also, the resulting learning rules are (in my experience) unstable. But in general, any learning rule can be used as long as it generates a model with two properties: maximal prior entropy and minimal posterior entropy. Before seeing each data sample, the prior distribution over C should be uniform (maximum entropy/uncertainty), ensuring that each kernel is utilized equally (i.e. we're not wasting resources). After seeing each sample, the posterior distribution over C should be totally peaked on one class/kernel, representing minimal entropy/uncertainty. This is true when the Gaussian kernels are distinct, not overlapping. Thus, for each data sample received, we're reducing uncertainty as much as possible, which is equivalent to transmitting the greatest amount of information.
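The two properties above (maximal prior entropy, minimal posterior entropy) follow from writing the mutual info as I(V;C) = H(C) - H(C|V). Here is a small sketch that estimates it from a batch of posterior distributions, one per data sample. It assumes the posteriors are already computed, and it estimates the prior over C as the average posterior; the function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy (in bits) of a discrete distribution.
double entropy(const std::vector<double>& p)
{
    double h = 0.0;
    for (double x : p)
    {
        if (x > 0.0)
        {
            h -= x * std::log2(x);
        }
    }
    return h;
}

// Estimates I(V;C) = H(C) - H(C|V) from one posterior P(C|v) per sample.
// The prior P(C) is estimated as the average of the posteriors.
double mutualInformation(const std::vector<std::vector<double>>& posteriors)
{
    assert(!posteriors.empty());
    const std::size_t numClasses = posteriors[0].size();
    const double n = static_cast<double>(posteriors.size());
    std::vector<double> prior(numClasses, 0.0);
    double meanPosteriorEntropy = 0.0;
    for (const std::vector<double>& p : posteriors)
    {
        assert(p.size() == numClasses);
        for (std::size_t c = 0; c < numClasses; ++c)
        {
            prior[c] += p[c] / n;
        }
        meanPosteriorEntropy += entropy(p) / n;
    }
    return entropy(prior) - meanPosteriorEntropy;
}
```

With two classes, posteriors that are fully peaked and evenly split between the classes give the maximum of 1 bit. Posteriors that are all uniform (overlapping kernels) or all peaked on the same class (one kernel doing all the work) both give 0 bits.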

So I've been working on an infomax algorithm to represent real-valued data vectors optimally with limited resources. The tricky thing is that the posterior probability calculations require an accurate estimation of the probability density at each data sample. But I think I have a good solution to all these issues. The resulting algorithm appears to be great at pattern classification (< 10% error on the classic Iris data set and < 4% error on a handwritten digits set). More importantly, it should be just what I need for the core of my sensory and motor cortex systems.

Thursday, September 25, 2008

Visualizing the Truth

Here's an idea I submitted to Google's 10 to the 100th Project:

8. Your idea's name (maximum 50 characters)
Visualizing the Truth

9. Please select a category that best describes your idea.
Everything else

10. What one sentence best describes your idea? (maximum 150 characters)
To improve decision making, we store knowledge as a massive Bayesian belief network, display it intuitively, and enable extensive what-if simulations.

11. Describe your idea in more depth. (maximum 300 words)
Our brains are amazing decision making devices.  For most problems involving a few variables, we can mentally simulate various outcomes before deciding, often with great results.  For larger problems we can also rely on instincts/emotions (i.e. pre-computed solutions based on lots of experience).  However, for the most complex issues, especially those critical decisions faced by government leaders, our human brains are not able to accumulate enough data or foresee enough outcomes.  The number of possibilities is too great.  (For example, banning DDT to save endangered birds seems reasonable, but then third-world farmers produce less food and many people die of hunger.  Ideally, we could predict such long-term consequences from the start.)

My idea is to augment human decision making with Bayesian networks in a way that scales with the exponential growth of computer hardware.  Bayesian networks enable us to represent knowledge intuitively as "beliefs about the world."  Arguably, they function similarly to the brain's neocortex.  Running on a large computer cluster, a massive Bayesian network could represent a repository of our society's knowledge.  Unlike text-based systems, the belief representation would be ideally suited for decision making.  This wiki-style belief network would be totally open to modifications by the global community (with abuse prevention) via a variety of input methods, including mobile devices.  Users could add variables/nodes, modify connections/probabilistic relationships or utility values, etc. based on their own experience.  (Software agents could offload some of this burden, e.g., by suggesting new connections based on inferred relationships.)  Crucially, the belief network could be viewed in visually beautiful ways (e.g., see IBM's Many Eyes project).  Users could perform extensive what-if scenarios.  The result would be a concise summary of our collective beliefs and a substrate for meaningful simulations.

12. What problem or issue does your idea address? (maximum 150 words)
The problem is that of making hugely important decisions by fundamentally limited decision making machines (i.e. the human brain).  My idea provides a way for us to avoid persuasive fallacies, to which our human brains are so susceptible, that have greatly hindered progress on large-scale decisions.  For example, as presidential candidates propose new policies during their campaigning, citizens could immediately run the ideas through the belief network and see the expected outcomes.  It would provide a simple, automatic way for the general public to cut through the BS.

13. If your idea were to become a reality, who would benefit the most and how? (maximum 150 words)
In general this technology could be used by anyone making difficult decisions.  However, government policy makers are one of the most important target users.  Members of congress could spend their time performing highly informative what-if scenarios more quickly and effectively than via spoken language.  They could see immediate answers to questions like, "If we enforce this new policy, how does that affect key economic and environmental variables?  How does it affect more distant variables, like diplomatic relationships?  What are the expected probabilities of these outcomes?  How should we make this decision in order to maximize expected utility given conflicting goals?"  ("Utility" values could be determined by voting.)  Imagine permanent client installations as centerpieces of congressional meeting places.  Thus, the beneficiaries would include all citizens of any country whose government decides to utilize this technology.

14. What are the initial steps required to get this idea off the ground? (maximum 150 words)
We would need to design, implement, and test a prototype.  This involves connecting a scalable Bayesian network software library (freely available) with a website displaying the belief network in an attractive manner.  The Bayesian network could be hosted on a scalable compute platform, like Amazon's EC2.  There should be a dead-simple (fun!) way to input new data via the website and mobile clients.  Similarly, users should be able to perform simulations by making (temporary, client-side only) changes to the network and watching the results propagate through the belief network.  Besides the basic technologies involved (Bayesian networks, hosting, and user input methods), the success of the project greatly depends on the visual attractiveness of the website and how much fun it is to use.  Thus, it is critical to involve skilled graphic designers and possibly video game developers in the implementation.

15. Describe the optimal outcome should your idea be selected and successfully implemented. How would you measure it? (maximum 150 words)
The ideal outcome is that when facing complicated decisions, the proposed system would change our default decision making behavior, similarly to how Google search has changed our default memory recall behavior.  The effect would be pervasive but undoubtedly very diffuse; measuring the effect would be difficult.  One simple way might be through a public poll, e.g., "When faced with difficult decisions, do you: A) think about it for a while, B) decide on instinct, C) get advice from friends, D) consult wikitruth.com/visualtruth.com/whatever-its-called.com?"  Ideally, using the system would be both fun (encouraging massive participation) and highly informative.  The result is that our most important decisions would be based on more than just a few mental simulations, more than even instincts or emotions, but on the collective knowledge of humanity.
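
To give a feel for what a what-if query looks like, here is a toy sketch of a three-node chain (policy -> low crop yield -> famine), loosely following the DDT example in the proposal. Every probability and utility value is invented purely for illustration, and a real system would use a proper Bayesian network library rather than hand-written tables like these.

```cpp
#include <cassert>

// Conditional probability tables for a three-node chain:
// policy (ban or no ban) -> low crop yield -> famine.
// All values are hypothetical and supplied by the caller.
struct ChainNetwork
{
    double pLowYieldGivenBan;       // P(low yield | ban)
    double pLowYieldGivenNoBan;     // P(low yield | no ban)
    double pFamineGivenLowYield;    // P(famine | low yield)
    double pFamineGivenNormalYield; // P(famine | normal yield)
};

// What-if query: P(famine | policy), marginalizing over crop yield.
double probabilityOfFamine(const ChainNetwork& net, bool ban)
{
    const double pLow = ban ? net.pLowYieldGivenBan : net.pLowYieldGivenNoBan;
    assert(pLow >= 0.0 && pLow <= 1.0);
    return pLow * net.pFamineGivenLowYield
        + (1.0 - pLow) * net.pFamineGivenNormalYield;
}

// Expected utility of a policy, given a utility value for each outcome.
double expectedUtility(const ChainNetwork& net, bool ban,
    double utilityFamine, double utilityNoFamine)
{
    const double p = probabilityOfFamine(net, ban);
    return p * utilityFamine + (1.0 - p) * utilityNoFamine;
}
```

A user would compare expectedUtility for ban = true and ban = false. In the proposed system the same comparison would run over a far larger network, with the utility values set by voting.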

Sunday, June 29, 2008

The Toilet Test for Machine Intelligence

A machine could be considered intelligent if it makes a person uncomfortable to use the toilet in its presence.

Monday, May 05, 2008

Xenopsychology

I just read this article, Xenopsychology, by Robert Freitas Jr. It asks the questions: What are alien minds like? What predictions can we make about alien goals, behaviors, and brains? What universals might exist in the psychology of creatures across the universe? What metrics do we have to compare different minds? "So far we have very little direct knowledge of alien minds -- but we have some fascinating bases for speculation."

What methods can we use to begin answering these questions? We can look at the diversity of brains on earth, studying their overall architectures and the evolutionary reasons those architectures exist. For example, one major division is that between ganglionic (decentralized network of independent sub-brains; less scalable) and chordate (central brain with many peripherals; very scalable) nervous systems. What are the benefits and drawbacks of these different types of brain design? Why did they emerge in the first place? Should we expect alien brains to fit one of these two patterns? Are there other options?

Concerning metrics, the author introduces the "sentience quotient" (SQ). "Generally the more information a brain can process in a shorter length of time, the more intelligent it can be." This is basically a measure of a brain's information processing efficiency. The exact definition is given as log10(I / M), where I is the information processing rate in bits/second, and M is the mass of the machine in kilograms. (Because I and M can cover very large ranges, the logarithm is used to focus on the ratio's order of magnitude.) So a small machine that processes information quickly will have a high SQ. The SQ ranges from -70 to +50. (Note that this definition only deals with processing speed, not solving problems/achieving goals. I like the more general definition of intelligence given by Shane Legg and Marcus Hutter: "Intelligence measures an agent’s ability to achieve goals in a wide range of environments.") The author suggests that there may be a "minimum SQ 'communication gap,' an intellectual distance beyond which no two entities can meaningfully converse." Just as rocks and trees are barely aware of our existence, let alone being able to communicate with us or understand our goals, we might find it difficult to communicate with or even sense the presence of an alien intelligence with an SQ 10 points above ours.
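
The SQ calculation is simple enough to sketch in a few lines. This assumes a base-10 logarithm with mass in kilograms; the function name is made up, and any input values you try are up to you (I'm not quoting figures from the article here).

```cpp
#include <cassert>
#include <cmath>

// Sentience quotient: base-10 log of the information processing rate
// (bits/second) per unit mass (kilograms).
double sentienceQuotient(double bitsPerSecond, double massKg)
{
    assert(bitsPerSecond > 0.0 && massKg > 0.0);
    return std::log10(bitsPerSecond / massKg);
}
```

Because of the logarithm, a 10x faster machine of the same mass gains exactly one SQ point, so a gap of 10 points corresponds to a factor of ten billion in processing rate per kilogram.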

Here are some excerpts from the article:

"Will ETs be more or less emotionally motivated than humans? Will they have emotions foreign to us, or are there any universal emotions?"

"Extraterrestrial logicians may find many of our most enduring paradoxes to be trivially solvable, just as we may be able to resolve some of theirs equally effortlessly."

"Consciousness may be an emergent property of intelligence, a fortuitous feature of a terrestrial animal brain architecture originally designed for other jobs. Is it possible that there could exist yet higher-order emergents beyond consciousness?"

Wednesday, April 02, 2008

QuickMP 0.8.0 Released

I just released the first version of QuickMP, which is a very small (one header file) piece of C++ code to ease the burden of shared-memory programming. It's ideal for applications where you perform the same operations repeatedly on lots of data.

The basic idea is that you convert your main C++ for loop from something like this:
// Normal for loop uses only 1 thread.
for (int i = 0; i < 1000000; ++i)
{
    processMyData(i);
}
...to something like this:
// Parallel for loop automatically uses 1 thread per processor.
QMP_PARALLEL_FOR(i, 0, 1000000)
    processMyData(i);
QMP_END_PARALLEL_FOR
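
For anyone curious what a parallel for loop does conceptually, here is a rough sketch using standard C++11 threads. To be clear, this is not QuickMP's implementation or API; parallelFor is a hypothetical helper that just shows the general idea of splitting the index range into one contiguous chunk per thread.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Runs body(i) for every i in [first, last), splitting the range into
// contiguous chunks with one thread per chunk. The body must be safe to
// call concurrently for different values of i.
void parallelFor(int first, int last, const std::function<void(int)>& body,
    unsigned int numThreads = std::thread::hardware_concurrency())
{
    assert(first <= last);
    if (numThreads == 0)
    {
        numThreads = 1; // hardware_concurrency() may return 0 if unknown
    }
    const int threads = static_cast<int>(numThreads);
    const int count = last - first;
    const int chunk = (count + threads - 1) / threads; // round up
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        const int begin = first + t * chunk;
        const int end = std::min(begin + chunk, last);
        if (begin >= end)
        {
            break; // no work left for the remaining threads
        }
        workers.emplace_back([begin, end, &body]()
        {
            for (int i = begin; i < end; ++i)
            {
                body(i);
            }
        });
    }
    for (std::thread& w : workers)
    {
        w.join(); // wait for all chunks to finish
    }
}
```

Usage would look like parallelFor(0, 1000000, [](int i) { processMyData(i); }), where processMyData must not write to data shared between iterations.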

Tuesday, March 11, 2008

A Cheap Barcode System Could Tie the Internet to the Real World


Many people, including myself, are just starting to realize the potential of mobile barcode scanners. I'm not talking about using a mobile phone to generate barcodes for things like food orders (see this prototype of a Starbucks ordering system on the iPhone; thanks Tony for showing me this), although that mode of operation will also be revolutionary. I'm talking about using a mobile phone to scan barcodes on all kinds of physical objects, where each barcode acts as a sort of hyperlink to more information about the object, which is then displayed on the phone. Here are some examples to show how interesting this could be...

You could easily keep a written history of personal objects. You might find an old baseball bat in the attic, scan it, and read old entries, like, "May 5, 2012: Got this new bat for my birthday." Or, "July 12, 2015: This bat hit a winning homerun against the Tigers." You wouldn't have to keep a physical file cabinet full of records like this... they would be attached to the object itself.

The same principle would apply to public property. Imagine tying internet forums to specific places or objects. Forums usually assume a shared interest of some sort; in this case the thing you share is that you have experienced the same location or object. Sort of like bathroom stall graffiti, but everywhere... and hopefully more useful. Imagine walking through a park and sitting on a park bench. Embedded in the bench beneath a piece of glass is an inconspicuous little barcode. You scan it with your phone, which provides you with a few paragraphs about the park, the donor of the bench itself, etc.; sort of a Wikipedia entry. There are random text entries people have written, like, "August 21, 2010: It was rainy today, but I went for a walk anyway. Sat here for a half hour until it cleared up. Oh, be sure to smell the freesias about 30 ft NW of here," or, "November 7, 2011: I started reading Brave New World here today. Has anyone read it? If not, check out the link. Man, Huxley was way ahead of his time." It also includes pictures people have taken nearby and "added" to the bench. And videos. Maybe even music recordings from the bandshell across the park. Each of these bits of info is tagged with a date, so the bench's entire history could be displayed as a timeline of local events. There might be problems, like people spamming the digital timelines of public property with advertisements. But we wouldn't have to hire people as spam filters to clean them up physically; all the data would be stored in the cloud, so there would be automatic spam filters.

Products you buy would no longer need paper instructions for assembly and usage. (Instead, you would get hyperlinked text, audio, and video instructions.) Nor would they need paper warranty statements. Even purchase information from paper receipts could be stored with a product's digital identity. When you buy a book, the store scans the book, gets your bank account info (by scanning your mobile barcode-producing device?), debits your account for the purchase, and transfers ownership from Barnes & Noble to you. Later, returning defective products is dead simple. Or, if you want to give the book to a friend, you perform a similar ritual to transfer ownership. So the object's digital identity could store the original purchase price, the current owner, and even the history of owners.

All kinds of physical objects could benefit from metadata. Easily keep car maintenance history with the car itself. Store your personal medical records on a bracelet. Stickers on produce would direct you to the harvest date, expected expiration date, the grower's location, and helpful recipes. Medication could provide personalized audio messages from your doctor and up-to-date warnings and recall notices. Musical instruments could hold audio recordings from previous owners. Paintings could link to more art from the artist. Power tools could give you tips on how to use them. The list of potential applications is really long.

Note that all of these things could be done without the barcode (e.g., keeping online documents which you can find via text search), but the barcode provides a context-relevant link to that data. Context is the key idea here: you can easily access the relevant info when and where it is needed. Replace (or augment) the searchable online file cabinet with hyperlinks directly from the physical objects.

Of course, other technologies could provide similar capabilities, but they're too expensive to be practical. Barcode systems are essentially available now, and they're so cheap, both in terms of the scanners (i.e. camera phones) and the individual barcode labels. They fit so easily into our existing infrastructure. You could print your own barcode labels at home and stick 'em on anything. Practically, all we need is a standard barcode format, barcode scanning software on our mobile phones, and free web hosting for all the metadata. Eventually such a system could evolve to include RFID tags, GPS devices, augmented reality displays, and Google Maps/Earth/Metaverse, but cheap barcodes could start laying the groundwork today.
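
On the software side, the core data structure could be very simple. Here is a minimal sketch of a registry that maps a barcode to a timeline of dated entries. It is only an in-memory toy: the class and member names are hypothetical, dates are assumed to be ISO 8601 strings (YYYY-MM-DD) so that sorting by string sorts by date, and a real system would sit behind a web service.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Entry
{
    std::string date; // assumed ISO 8601 (YYYY-MM-DD)
    std::string text;
};

class ObjectRegistry
{
public:
    // Attaches a dated entry to the object identified by the barcode.
    void addEntry(const std::string& barcode, const Entry& entry)
    {
        assert(!barcode.empty());
        mTimelines[barcode].emplace(entry.date, entry.text);
    }

    // Returns the object's entries in chronological order
    // (empty if the barcode is unknown).
    std::vector<Entry> timeline(const std::string& barcode) const
    {
        std::vector<Entry> result;
        const auto found = mTimelines.find(barcode);
        if (found != mTimelines.end())
        {
            for (const auto& item : found->second)
            {
                result.push_back(Entry{item.first, item.second});
            }
        }
        return result;
    }

private:
    // barcode -> (date -> text); the inner multimap stays sorted by date.
    std::map<std::string, std::multimap<std::string, std::string>> mTimelines;
};
```

Scanning the baseball bat from the earlier example would amount to calling timeline() with the bat's barcode and displaying the result.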

Tuesday, February 26, 2008

GDC 2008

Ray Kurzweil keynote on Thursday

Shana and friends storming the expo floor

Last week I went to GDC 2008 in San Francisco. As usual, I was a volunteer ("conference associate") for the conference. This year (and last year) I avoided attending a lot of programming sessions and focused more on things like game design. My purpose in going to the conference this year was mainly to keep tabs on the industry as a whole and to get a feel for where things are headed. On the technical side, the general trend continues to be more cores, threads, and parallelism. Also, there is a lot of experimentation with funky new input methods; some of the coolest ones, in my opinion, are the brain-computer interface devices from NeuroSky and Emotiv. It seems easier than ever to be an independent game developer because there are so many game portals out there to host indie games, like Steam and Kongregate.

Monday
I went to the Independent Games Summit all day. One interesting discussion was on the topic of defining rewards for the player... for example, is it more rewarding for a game to have good graphics and visual style or interesting game mechanics? I think they both have their place: the visceral aspects of a game, including visuals and audio, can provide an immediate draw to the player, but eventually even the best-looking games can get boring. The game mechanics provide the long-term rewards and can keep a player interested. So, in my opinion, good graphics draw people in, and good gameplay keeps them there. Also, someone mentioned the balance between boring and overwhelming, which I think is crucial because it forms the basis of curiosity and "information rewards" in our brains.

Other sessions at the Independent Games Summit discussed a lot of real life issues for indie developers, including legal issues. A lot of this was in the form of post-mortem advice from developers talking about specific games they had made.

Tuesday
I went to the last half of the games and education keynote. It seemed well thought out and very encouraging, though I can't remember any of the details now. There was a Serious Games Summit panel session on measuring what players learn when they play games... how do we define such a metric? Is this metric even necessary or desirable? This was a thought-provoking talk, although no consensus was reached.

Wednesday
I helped with a session on Xbox Live Arcade and attended the Microsoft keynote on XNA development for Windows, Zune, and Xbox. I also helped with a panel discussion on art outsourcing.

In the expo Sony had a cool demo of a head tracking system which used a single camera to detect the head's position (2D or 3D? I couldn't tell) and orientation using only face and eye detection.

Natural Motion's Euphoria is being used in the upcoming games Star Wars: The Force Unleashed and Grand Theft Auto 4. These games should be a great demonstration of Natural Motion's dynamic character animation techniques in real time. Up till now the main benefit of their tools has been to ease the burden of animating characters for movies. Having their system run in real time interactive games will really show off their technology and will probably have a profound impact on gamers and the industry.

Wednesday evening during the awards ceremony I sat next to Michael Callahan, co-founder of Ambient Corporation. Ambient is developing a system called the Audeo, a wireless neckband that enables telepathic chat by silently translating vocal nerve signals into a synthesized voice. So you could use it to make silent phone calls or Google queries in public places. I remember a NASA group working on something similar a few years ago, but Ambient is ready to commercialize it. Pretty cool.

Thursday
Ray Kurzweil gave a keynote talk, which was similar to most of his other talks, but it was exciting to me to be in a room with thousands of people hearing his central thesis for the first time.

I helped with a session for Emotiv (one of two companies providing an EEG-based game input device) on their new SDK. Their system detects facial movements, "emotional" states (relaxed, tense, etc.), and cognitive/intentional state (e.g., thinking about moving an object through space). A short training period is required for some modes. They have a really nice control panel for testing all aspects of the system. One interesting point they made was that they could map brain states to existing keystrokes, enabling immediate usage with existing games.

I attended an Intel session on threading options for multicore machines and another programming session on undefined behavior in C++.

The Game Design Challenge 2008 was The Interspecies Game. This is always really entertaining. My favorite of the three designs was Bac Attack, a real-time strategy game against a petri dish of bacteria.

Friday
I mainly did random volunteer jobs throughout the day. I attended one session on legal issues for game music composers (e.g., maintaining IP vs. work for hire). Friday evening the conference volunteers had their final meeting, which is a mix of feedback for next year's conference and prizes for the volunteers. I won a copy of Star Wars: The Force Unleashed, signed by all the developers. :)

At one point during Friday afternoon I was scanning people's badges at the entrance to the expo floor. A gentleman from the academic world needed help finding the Carnegie Mellon University booth (which turned out not to exist). As I was helping him find his destination we got into a brief discussion regarding the game industry as a whole. He made the comment, "I hope these games can someday be used for something important rather than just entertainment." My immediate response was something like, "But entertainment is very important. Children playing with toys are actively learning many skills that will be needed later in life. Entertainment doesn't appear useful because the benefits are not immediate." Besides the personal benefits, at a societal level entertainment is part of what creates culture. This is sort of the main thesis of Leisure: The Basis of Culture.

Later I thought more about this issue, and Shana helped convince me that entertainment is indeed different from other endeavors. It is important in a way that is different from how food and national security are important. Entertainment is important for a rich life, but other things are important for life at all. I think Maslow's hierarchy of needs helps clarify the issue: basic needs (e.g., nutrition and safety) must be satisfied in order to support higher needs, which I believe include entertainment and play behavior. All needs are important, in the sense that we all want to reach the top of the hierarchy of needs, but some must be satisfied in order to support others.

Thursday, January 10, 2008

Working on Action Selection

In October I wrote about developing a simulated human test application. I have yet to begin implementing this because I'm still working out some issues with the basal ganglia component, which performs the essential task of action selection. It seems to me that action selection is the core required function of any intelligent system (see the Wikipedia entry here), so it must work correctly.

I have written several test applications that focus squarely on learning proper action selection, including a 2D arm (1 degree of freedom) that must learn to reach a goal angle, a basic n-armed bandit test, and a 2D mobile robot that must seek reward objects. Once I'm confident that my system can solve these tasks robustly, I will continue down the path towards a simulated human test. Although I really want to start experimenting with a human-like test platform now, I have to take things one step at a time... If I just throw together an integrated control system without first testing each part in isolation, it will be impossible to debug any problems that arise.
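
For readers unfamiliar with the n-armed bandit test, here is a minimal sketch of the task using a textbook epsilon-greedy agent. This is not my basal ganglia model, just a standard baseline that shows what "learning proper action selection" means in the simplest setting; the class and method names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

class BanditAgent
{
public:
    BanditAgent(std::size_t numArms, double epsilon)
        : mValues(numArms, 0.0), mCounts(numArms, 0), mEpsilon(epsilon)
    {
        assert(numArms > 0 && epsilon >= 0.0 && epsilon <= 1.0);
    }

    // Index of the arm with the highest estimated value
    // (ties go to the lowest index).
    std::size_t bestArm() const
    {
        std::size_t best = 0;
        for (std::size_t a = 1; a < mValues.size(); ++a)
        {
            if (mValues[a] > mValues[best])
            {
                best = a;
            }
        }
        return best;
    }

    // With probability epsilon, explore a uniformly random arm;
    // otherwise exploit the current best estimate.
    std::size_t selectArm(std::mt19937& rng) const
    {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (coin(rng) < mEpsilon)
        {
            std::uniform_int_distribution<std::size_t> pick(
                0, mValues.size() - 1);
            return pick(rng);
        }
        return bestArm();
    }

    // Incremental sample-average update of the chosen arm's estimate.
    void update(std::size_t arm, double reward)
    {
        assert(arm < mValues.size());
        ++mCounts[arm];
        mValues[arm] += (reward - mValues[arm]) / mCounts[arm];
    }

    double value(std::size_t arm) const { return mValues.at(arm); }

private:
    std::vector<double> mValues; // estimated reward per arm
    std::vector<int> mCounts;    // number of pulls per arm
    double mEpsilon;             // exploration probability
};
```

A test loop repeatedly calls selectArm, draws a reward for that arm from the environment, and passes it to update. The agent passes the test if bestArm eventually settles on the arm with the highest average payoff.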

Wednesday, January 09, 2008