Analogy making

Analogy

Disclaimer: in the short paper below I discuss several articles assigned during the analogy class. The tables are not included.

If deduction and induction are two playing teams, analogy is the stadium itself! It is the deepest cognitive ability!

“Analogy is a mapping of knowledge from one domain (the base) into another (the target), which conveys that a system of relations that holds among the base objects also holds among the target objects”

Analogy is mapping between deep, core structures, but not between attributes or at least not only between attributes.

“According to this analysis, the contrast between analogy and literal similarity is a continuum, not a dichotomy.”

As we see in the table above, in the case of literal similarity there are many similarities between the attributes of the two compared objects and many similarities between their core structures.

In the case of abstraction we have the proper conditions for analogical mapping, but the objects themselves are unclear and vague, i.e. abstract.

In the case of anomaly there are few possible comparisons, either between the attributes or between the deep structures.

In the case of true analogy, there are few mappings between the attributes of the compared objects, but many mappings between the core relational structures.
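
The four cases above can be sketched as a small decision rule. This is only an illustration under my own assumptions: the function name, the threshold `many` and the `base_is_abstract` flag are hypothetical, and a real comparison is a matter of degree (a continuum), not of a hard cut-off.

```python
def classify_comparison(shared_attributes, shared_relations,
                        base_is_abstract=False, many=2):
    """Label a comparison by how many attributes and relations can be mapped.

    `many` is an arbitrary cut-off chosen for this sketch only.
    """
    many_attributes = shared_attributes >= many
    many_relations = shared_relations >= many
    if many_relations and many_attributes:
        return "literal similarity"
    if many_relations:
        # Few attribute matches, many relational matches.
        return "abstraction" if base_is_abstract else "analogy"
    if many_attributes:
        return "attribute match only"  # a case not discussed in the text above
    return "anomaly"


print(classify_comparison(shared_attributes=0, shared_relations=4))
```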

Metaphor

Metaphor is a widespread kind of comparison, most often a purely analogical one, i.e. a comparison between relational structures rather than between attributes. However, there are exceptions.

Systematicity principle

“A predicate that belongs to a mappable system of mutually interconnecting relationships is more likely to be imported into the target than is an isolated predicate.”

People automatically map objects that have interconnected deep structural relations better and faster than objects whose core relations are isolated from one another.

Analogy is the mechanism that drives categorization and therefore cognition. Analogy making expands a person's concepts. Basically, once this categorization/chunking has been performed, there is no fundamental difference between dealing with the simplest concepts and the most complex ones.

Analogy in chimpanzees

The chimpanzee Sarah succeeded, with statistically significant results, in all of the experiments described in the article. In my opinion, the only difference between humans and the most developed mammals is explicitly manifested logic and its child, verbal language: no more, no less. I am quite sure that higher-order mammals have logic, and consequently analogical thinking processes, but manifested implicitly and automatically. The second cognitive element they lack is metacognition, or self-consciousness. Following this line of thought, the differences between humans and other mammals are not large qualitatively! We have very similar bodies and, moreover, similar brains. All the ancient brain structures present in mammals exist in humans too. The only difference is the more developed neocortex.

Analogy in children

Analogy is the main thinking ability of the human mind. According to the classical Piagetian school, it emerges comparatively late. Recent studies show that analogy making is available even to very young infants. The paper presents various experiments and their results. It explores the role of analogical reasoning in children’s understanding of biological, physical and psychological principles, and shows the importance of “relational knowledge” and its underlying scripts (schemata) for analogy making. The paper is interesting and useful for various scientific fields.

“LISA (Learning and Inference with Schemas and Analogies)

The central motivation in this model is to integrate two major processes of analogy formation: memory access and structural mapping – while preserving both flexibility and sensitivity to structure.

Traditional symbolic systems maintain structure but are inflexible and the connectionist systems are just the reverse – they are very flexible, but work only at low level. Past hybrid models have lacked a natural interface between the two. The LISA model attempts to reconcile the two approaches and unify access and mapping with both structure sensitivity and flexibility.

LISA can be divided roughly into two interacting systems: a „working memory“ (WM) and „long-term memory“ (LTM). LTM is a layered network of „structural“ units, and the bottom structural layer connects to WM’s single layer of semantic units.

Concepts and relations (in LTM) are represented as trees of structural units of three types: propositions, subpropositions, and objects/predicates.

Each proposition tree in LTM is a potential „analog“– the source or target of an analogy.

The semantic units of WM connect to and allow distributed representations of each object or predicate at the bottom of an LTM proposition tree. The more similar two objects/predicates are, the more semantic units they will share.

WM also includes a set of „mapping“ links between LTM structure units of the same type (eg. predicate-predicate, proposition-proposition).

Activity starts in LTM, in a particular proposition unit chosen as the target analog. Flashes of activity spread alternately down various competing branches of the driver’s structure units and activate patterns of semantic units in WM. These semantic units activate „similar“ objects and predicates, and activation spreads back up competing branches of other „recipient“ analogs. Recipients which are most strongly activated („retrieved“ from LTM) at any moment are considered the best „source“ analogs for the original target.

When structure units of the same type are active concurrently, the WM „mapping“ weight between them strengthens; when structure units are uncorrelated, the connecting weight is weakened.”
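
Two ideas from the description above can be illustrated with a toy sketch: similar items share more semantic units, and a mapping weight grows when two structure units are active together and shrinks when only one of them is active. The formulas below are my own simplifications (set overlap and a Hebbian-style rule with an arbitrary learning rate); they are not the equations used in LISA itself.

```python
def semantic_overlap(units_a, units_b):
    """Share of semantic units that two objects/predicates have in common."""
    union = units_a | units_b
    return len(units_a & units_b) / len(union) if union else 0.0


def update_mapping_weight(weight, act_a, act_b, rate=0.1):
    """Toy Hebbian-style update for one mapping link (activations in [0, 1]).

    Co-activity strengthens the link; activity in only one unit weakens it.
    The rule and the rate are assumptions made for this sketch.
    """
    coactive = act_a * act_b
    mismatch = abs(act_a - act_b)
    delta = rate * (coactive - 0.5 * mismatch)
    return max(-1.0, min(1.0, weight + delta))


dog = {"animal", "pet", "canine"}
cat = {"animal", "pet", "feline"}
print(semantic_overlap(dog, cat))

weight = 0.0
for act_a, act_b in [(1.0, 1.0), (1.0, 1.0), (1.0, 0.0)]:
    weight = update_mapping_weight(weight, act_a, act_b)
print(weight)
```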

It seems a beautiful model to me. It has long-term and working memories and a semantic network. It is a hybrid model that works in parallel and combines the advantages of both perspectives. It satisfies all the analogical constraints: pragmatic, semantic, and so on. I would like to see this and the other models of analogy making and reasoning used in the creation of an advanced AI model that competes with and surpasses human abilities. I am sure it will happen soon through the efforts of scientists!

ACME, SME AND IAM MODELS COMPARISON

Informational Constraints on Analogical Mapping

Structural constraints

“Making matches only between entities of the same type. Attributes are matched with attributes, objects with objects, and two-place predicates with two-place predicates.”

We can only map on the same level: predicates with predicates, and arguments or objects with arguments or objects.

“Exploiting structural consistency. If the propositions REVOLVES (A B) and REVOLVES (C D) match, then the arguments of both should also be matched appropriately.”

Mapping is relating deep structures between the base and target domains!
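
The two structural constraints quoted above (same-type matching and structural consistency) can be sketched as follows. Propositions are written as tuples such as `("REVOLVES", "A", "B")`; this representation and the helper names `align_arguments` and `merge_one_to_one` are my own assumptions for the illustration, not part of any of the models discussed.

```python
from typing import Optional

Proposition = tuple[str, ...]  # (predicate, argument_1, argument_2, ...)


def align_arguments(base: Proposition, target: Proposition) -> Optional[dict]:
    """Match two propositions only if they have the same number of arguments,
    then pair their arguments position by position."""
    if len(base) != len(target):
        return None  # e.g. an attribute cannot match a two-place predicate
    return dict(zip(base[1:], target[1:]))


def merge_one_to_one(mapping: dict, new_pairs: dict) -> Optional[dict]:
    """Add new object pairs, refusing anything that breaks one-to-one mapping."""
    merged = dict(mapping)
    for b, t in new_pairs.items():
        if b in merged and merged[b] != t:
            return None  # base object is already mapped to something else
        if b not in merged and t in merged.values():
            return None  # target object is already taken
        merged[b] = t
    return merged


pairs = align_arguments(("REVOLVES", "A", "B"), ("REVOLVES", "C", "D"))
print(pairs)
print(merge_one_to_one(pairs, {"A": "D"}))
```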

Favoring systematic sets of matches (Gentner’s, 1983, systematicity principle). If one has two alternative sets of matches, then the mapping with the most higher order connectivity should be chosen.
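
A minimal sketch of this preference, assuming that "higher order connectivity" can be approximated by counting the higher-order relations whose argument relations are all present in a set of matches. The counting rule, the function names and the example relations are assumptions of mine.

```python
def connectivity(matches, higher_order):
    """Count higher-order relations fully covered by a set of matched relations."""
    matched = set(matches)
    return sum(1 for args in higher_order.values() if matched.issuperset(args))


def choose_systematic(alternatives, higher_order):
    """Prefer the alternative set of matches with the most higher-order connectivity."""
    return max(alternatives, key=lambda m: connectivity(m, higher_order))


# Hypothetical example: CAUSE links ATTRACTS and REVOLVES; HOT and YELLOW are isolated.
higher_order = {"CAUSE": ("ATTRACTS", "REVOLVES")}
alternatives = [("HOT", "YELLOW"), ("ATTRACTS", "REVOLVES")]
best = choose_systematic(alternatives, higher_order)
print(best)
```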

“A similarity constraint can also be used to disambiguate alternative matches. When this constraint is applied, only identical concepts are matched between the two domains (Gentner, 1983) or, more loosely, semantically similar concepts are matched (Gick & Holyoak, 1980; Holyoak & Thagard, 1989). Semantic similarity can be used to disambiguate matches; if one match in a set of one-to-many matches is more similar than the others, it can be preferred.”

As a matter of fact, similarity of relational structures, and in some cases of attributes, is the basic subject of analogical mapping.

“Finally, there are pragmatic constraints (e.g., Holyoak, 1985; Holyoak & Thagard, 1989; Keane, 1985). Again, these constraints may disambiguate a set of matches. For example, in a certain analogical mapping situation, one match may be pragmatically more important (or goal relevant) than other alternatives and so it will be preferred over these alternatives.”

The pragmatic constraint allows the researcher to feed a goal into the model, thus narrowing the many possible mappings to the options predefined by that goal.
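
The similarity and pragmatic constraints both serve to pick one match out of a one-to-many set. A rough sketch, assuming similarity is given as a table of scores and goal relevance as a set of preferred pairs; the scores, the `bonus` value and the example concepts are invented for the illustration.

```python
def pick_match(base_item, candidates, similarity, goal_relevant=frozenset(), bonus=1.0):
    """Choose one target for base_item among several candidate matches.

    Similarity decides by default; a goal-relevant pair gets an extra bonus,
    which stands in for the pragmatic constraint.
    """
    def score(target):
        value = similarity.get((base_item, target), 0.0)
        if (base_item, target) in goal_relevant:
            value += bonus
        return value

    return max(candidates, key=score)


# Hypothetical similarity scores between one base concept and two target concepts.
similarity = {("fortress", "tumor"): 0.3, ("fortress", "tissue"): 0.5}
print(pick_match("fortress", ["tumor", "tissue"], similarity))
print(pick_match("fortress", ["tumor", "tissue"], similarity,
                 goal_relevant={("fortress", "tumor")}))
```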

For an adequate cognitive model of analogical mapping we need to elaborate the behavioral constraints on analogizing. These behavioral constraints reduce the set of possible algorithms that instantiate the informational level theory. Indeed, the addition of behavioral level constraints should result in algorithms that predict the detailed performance of subjects in analogical mapping tasks.

Behavioral Constraints on Analogical Mapping

Working Memory Constraints

“Working memory limitations may result in information loss, and thus produce errors in analogizing. When working memory is overloaded, some critical part of the representation of a domain may be lost or forgotten.”

Information loss is a natural human limitation. Decay of memory traces, interference, blending, or overloading can cause working memory to malfunction. This results in analogical failures in retrieval, mapping, transfer, and so on.

The Influence of Background Knowledge

“In deductive reasoning, background knowledge can facilitate or inhibit „correct“ reasoning depending on its relationship to the inferences to be made (Byrne, 1989; Cheng & Holyoak, 1985). Similarly, in analogical thinking, Keane (1991) showed that a mapping task can be performed faster if the set of mappings required are consistent with background knowledge than if they are inconsistent or neutral with respect to background knowledge. Apart from affecting the time course of performance, background knowledge may also be a source of errors in analogizing when the products of mapping conflict with background knowledge of a domain.”

Background knowledge is an extremely important issue in analogy making. It inevitably interferes: positively when the background knowledge is consistent with the subject of the analogy, and negatively when it is opposite or different. In the latter case analogy making is slower, less efficient and less accurate.

SME

“SME implements both structural and similarity constraints in a serial way. SME finds all the legal local matches between two domains and then combines these into alternative interpretations of the comparison.

When SME is run on the attribute-mapping problem, with an appropriate set of match rules, it generates 32 alternative interpretations. These are all the possible, maximal interpretations that can be generated for the problem (made up from all the possible matches).

Forbus and Oblinger (1990) extended SME to implement some pragmatic constraints and their „greedy merge algorithm“ reduces the number of interpretations produced to one (or a few) „best“ interpretations.”

SME is the first and most basic model of analogy making. It provides the primary features of the analogical process. It works serially and produces all possible mappings. Of course, this makes it a very time- and memory-demanding model.
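
To show why producing every interpretation is costly, here is a brute-force toy in the spirit of the description above: it finds all legal local matches and then combines them into every maximal one-to-one interpretation. It is not the real SME algorithm, and the small attribute example is invented, so it does not reproduce the 32 interpretations mentioned in the quotation.

```python
from itertools import combinations


def local_matches(base, target):
    """All legal local matches: propositions with the same number of arguments."""
    return [(b, t) for b in base for t in target if len(b) == len(t)]


def is_consistent(matches):
    """True if predicates and objects are mapped one-to-one across all matches."""
    forward, backward = {}, {}
    for b, t in matches:
        for x, y in zip(b, t):
            if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
                return False
    return True


def interpretations(base, target):
    """Every maximal consistent combination of local matches (exponential cost)."""
    candidates = local_matches(base, target)
    consistent = [frozenset(c)
                  for r in range(1, len(candidates) + 1)
                  for c in combinations(candidates, r)
                  if is_consistent(c)]
    return [s for s in consistent if not any(s < other for other in consistent)]


base = [("smart", "Bill"), ("tall", "Bill"), ("timid", "Tom")]
target = [("hungry", "Fido"), ("friendly", "Fido"), ("frisky", "Rover")]
results = interpretations(base, target)
print(len(local_matches(base, target)), "local matches,", len(results), "interpretations")
```

Even with three attributes on each side, the search goes through every subset of the local matches, which is why a greedy merge or an incremental strategy becomes attractive.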

ACME

Holyoak and Thagard’s (1989) ACME uses parallel constraint satisfaction in an interactive network to find the optimal mapping between two domains. It implements the structural, similarity, and most pragmatic constraints. ACME establishes a network of units or nodes. Each node represents a legal match between two predicates. The excitatory and inhibitory connections between these nodes implement the various constraints.

ACME introduces new constraints, such as the pragmatic one. It is the first neural network (connectionist) model in analogy modeling.
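
The idea of a network of match nodes with excitatory and inhibitory links can be sketched as a small relaxation loop. The update rule, the weights, the decay and the clamped "SEMANTIC" unit below are assumptions chosen for the illustration; they are not ACME's published parameters.

```python
def settle(nodes, links, clamped=None, steps=100, decay=0.1, lo=-1.0, hi=1.0):
    """Let activation spread over symmetric links (a, b, weight) until it settles."""
    clamped = clamped or {}
    act = {n: clamped.get(n, 0.01) for n in nodes}
    for _ in range(steps):
        nxt = dict(act)
        for n in nodes:
            if n in clamped:
                continue  # clamped units keep a fixed activation
            net = sum(w * act[b if a == n else a]
                      for a, b, w in links if n in (a, b))
            change = net * (hi - act[n]) if net > 0 else net * (act[n] - lo)
            nxt[n] = max(lo, min(hi, act[n] * (1 - decay) + change))
        act = nxt
    return act


nodes = ["sun=nucleus", "planet=electron", "sun=electron", "planet=nucleus", "SEMANTIC"]
links = [
    ("sun=nucleus", "planet=electron", 0.1),    # structurally consistent pair
    ("sun=electron", "planet=nucleus", 0.1),    # the rival consistent pair
    ("sun=nucleus", "sun=electron", -0.2),      # competing matches for "sun"
    ("planet=electron", "planet=nucleus", -0.2),
    ("sun=nucleus", "planet=nucleus", -0.2),    # competing matches for "nucleus"
    ("sun=electron", "planet=electron", -0.2),
    ("SEMANTIC", "sun=nucleus", 0.1),           # assumed similarity support
]
final = settle(nodes, links, clamped={"SEMANTIC": 1.0})
for name in nodes[:4]:
    print(name, round(final[name], 2))
```

The small similarity support for one match is enough to tip the whole network toward the coalition that is consistent with it, while the rival matches are suppressed.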

IAM

“Keane and Brayshaw’s (1988; Keane, 1988b, 1990) IAM implements all the informational and behavioral constraints mentioned earlier using serial constraint satisfaction. It generates a single, optimal interpretation based on a small subset of the possible mappings between the two domains. IAM builds up this mapping incrementally by selecting a small portion of a base domain for mapping, mapping it, and then moving on to map another portion. Typically, it will construct a single mapping, which will tend to be the optimal interpretation. However, if necessary, IAM can consider several alternative interpretations. Again, it deals with these alternatives incrementally, one after the other.

IAM Algorithm

1. Select Seed Group. Rank order groups of connected predicates in the base domain and select the first in the list as the seed group.
2. Find Seed Match. Find a seed match from a selected element in the seed group and note alternative seed matches that may be possible from this element.
3. Find Isomorphic Matches for Group. Find all the legal matches between the elements of the selected group and the target domain and use serial constraint satisfaction to find a one-to-one set of matches that disambiguates these matches, using pragmatic, similarity, and structural constraints.
4. Find Transfers for Group. Add the transfers or candidate inferences supported by the matches found.
5. Evaluate Group Mapping. If the resultant mapping is evaluated as being good then continue (Step 6), otherwise try an alternative seed match (Step 2), or failing that try another group as the seed group (Step 1).
6. Find Other Group Mappings. If task demands require many groups to be mapped, then incrementally map each of the other groups (performing Steps 1-5 on each one), such that the mappings formed do not violate the mappings found already (as dictated by the constraints); otherwise, just return the mappings found for the seed group.

In particular, it selects that group of predicates that have the most higher order connectivity between its elements. Having selected this so-called seed group, it chooses an element from this group and finds the best match between this element and all the elements in the target domain (called the seed match). The seed match is used to grow the mapping of the other predicates in the seed group. All the legal matches between the other elements of the seed group and the target domain are found, and a unique one-to-one set of matches is formed.

The IAM algorithm goes through six main stages in analogical mapping: It selects a seed group and finds a seed match for that group; then it finds isomorphic matches and transfers for that group; and it will then evaluate that group mapping and find other group mappings if they are required. The IAM algorithm is specifically designed to reduce the processing load involved in analogical mapping in order to unburden a limited-capacity working memory.”
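
The control flow of the six steps quoted above can be sketched very roughly as follows. Many things are simplified under my own assumptions: groups are ranked by size instead of higher-order connectivity, Steps 2 to 4 are collapsed into one greedy pass, alternative seed matches and transfers are left out, and "good" simply means that at least half of the seed group was mapped. All names are hypothetical.

```python
def extend(mapping, prop, cand):
    """Add prop -> cand to a one-to-one mapping, or return None on conflict."""
    trial = dict(mapping)
    for b, t in zip(prop, cand):
        if b in trial:
            if trial[b] != t:
                return None      # base element already mapped elsewhere
        elif t in trial.values():
            return None          # target element already taken
        else:
            trial[b] = t
    return trial


def map_group(group, target, mapping):
    """Steps 2-4, collapsed: greedily map each proposition of one group."""
    for prop in group:
        # Simplest similarity constraint: try identically named predicates first.
        for cand in sorted(target, key=lambda c: c[0] != prop[0]):
            trial = extend(mapping, prop, cand) if len(cand) == len(prop) else None
            if trial is not None:
                mapping = trial
                break
    return mapping


def mapped_fraction(group, mapping):
    """Share of the group's propositions whose predicate found a match."""
    return sum(p[0] in mapping for p in group) / len(group)


def iam(groups, target, min_fraction=0.5):
    ranked = sorted(groups, key=len, reverse=True)   # Step 1 (size as a stand-in)
    for seed in ranked:                               # Step 5: fall back to next group
        mapping = map_group(seed, target, {})
        if mapped_fraction(seed, mapping) >= min_fraction:
            break
    else:
        return {}
    for group in ranked:                              # Step 6: map remaining groups
        if group is not seed:
            mapping = map_group(group, target, mapping)
    return mapping


solar = [("ATTRACTS", "sun", "planet"), ("REVOLVES", "planet", "sun"), ("HOT", "sun")]
colour = [("YELLOW", "sun")]
atom = [("ATTRACTS", "nucleus", "electron"), ("REVOLVES", "electron", "nucleus")]
result = iam([colour, solar], atom)
print(result)
```

Because only one group is handled at a time, the number of matches held at once stays small, which is the point the quotation makes about unburdening working memory.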

The Model AMBR 1 in brief

“Integration. The reasoning process cannot be partitioned into a sequence of independent stages performed by specialized module-like components. Rather, there are subprocesses that run together and each of them is potentially influenced by the rest. Each computational mechanism is responsible not only to produce its immediate result but also to create appropriate guiding pressures for other mechanisms. That is why AMBR is designed as an integrated model based on a parallel emergent architecture.”

Comment: I agree with that. Still, we do not yet know enough about how information is really processed at a level as deep as deductive, inductive or analogical reasoning! We can suppose that they are subprocesses of one and the same module, but for now this is only wishful thinking and an intuitive presupposition!

“Unification. Analogy is not a specific mode of reasoning. Rather, deduction, induction (generalization), and analogy are slightly different versions of the same uniform reasoning process. The same computational mechanisms are used in all cases–there is some sort of perceptual processing that builds internal representation of the problem being solved, there is some (sub)process that accesses relevant information from long-term memory, there is some (sub)process that tries to map the new problem to previous knowledge, etc. Deduction, induction, and analogy all fit into the same framework, the differences being in the outcome of the processing but not in the processing itself. Thus the term deduction applies to cases when the new problem happens to match with a general old schema, induction goes the other way around, and analogy applies when the two situations are at approximately equal level of abstraction. Conceptualized in this way deduction and induction are just two extremal (and hence very important) points on the analogy continuum. Therefore AMBR is designed as a general model of reasoning with emphasis on analogy-making.”

Comment: Well, to state that induction and deduction are just extreme cases of analogy is quite a brave statement! Are there experimental data supporting such a claim? The claim that the three general cognitive processes are just subprocesses of one and the same indivisible cognitive process sounds very plausible from an intuitive point of view, but once again, are there any experimental data proving it?

“Context-sensitivity. Human reasoning is context-sensitive. Its outcome depends not only on the task and long-term memory knowledge but also on the environmental setting, recent activities of the reasoner, etc. AMBR is designed with the explicit aim to reflect this context-sensitivity of human thinking.”

Comment: Human thinking is context sensitive! And the context comprises not only the environment surrounding the reasoner (thinker), but his own state of mind as well!

“AMBR overcomes this difficulty by representing information in smaller chunks. The model does not represent episodes as big units that are either manipulated wholesale or not at all. Instead, it represents them as coalitions of small elements susceptible of piecemeal manipulation. This allows each subprocess to begin as soon as the previous one has produced some partial results.”

This overlapping of the subprocesses allows the cognitive mechanism to work flexibly and explains actual mental processes very well.

Integration of Memory and Reasoning in Analogy-Making:

The AMBR 2 Model

The article starts with the elephant metaphor: several blind men explored different parts of the same elephant, and each of them claimed that his findings were the only truth. The authors, Boicho Kokinov and Alexander Petrov, imply that there is an analogous situation in contemporary science, and particularly in cognitive research, where every line develops independently and neglects the others: reasoning (deduction and induction, problem solving, attention), analogy, memory, emotions, neuroscience, and so on. All of them are “different parts of the cognitive elephant”, and fixating on only one of them while excluding the others distorts the actual whole picture. In reality, all of the cognitive processes function interdependently and in parallel. Consequently, the process of understanding, studying and modeling human cognition ought to proceed simultaneously in all its subdomains (subfields of research).

In the most advanced models of human cognition, reasoning, memory and perception are implemented together. Such models are AMBR 2, Copycat and Tabletop.

The constructive nature of memory

In 1967 Ulric Neisser proposed an interesting metaphor for human memory: the reconstruction of a dinosaur by a paleontologist. We construct our memories in the same way. The results of our constructions are often false or only partly true. Contemporary research shows that the classic paradigm of memory as a storehouse fails to explain all the interference, blending of episodes, priming, influence of inner schemata, and so on. Even in the case of flashbulb (photographic) memories, the vivid mental picture and the confidence do not coincide with the actual accuracy of the memory: the real event is often completely different from the remembered flashbulb episode.

The context effects

Many studies show the important role of context, both environmental and cognitive (state of mind, mood). We recall better in the same context in which memorizing happened. Context interferes with memory recognition and with the direction of our thinking process. Context influences cognition in both bottom-up and top-down directions: through the perception of the outward context and through the impact of the inner state of mind, mood and schemata.

As we see in the table above, the reminding phase in analogy making depends mostly on superficial similarity rather than structural similarity. Familiarity helps in producing an analogy; the inner schemas assist the process and interfere directly in it. While the mapping process is under way, it awakens new episodes related to the target. Omissions depend on the context of analogy making, i.e. the context influences the retrieval phase.

It is now well known that our memory is context dependent: both internal and external (environmental) contexts influence it. There are priming and blending effects as well.

Analogy models should be built on a general architecture that comprises all the properties and actions of reasoning and memory. The model ought to be context sensitive.

The AMBR View on Analogy

In AMBR we have many small decentralized units (as in a neural network), which simultaneously play the role of a symbolic (higher, top-down) mechanism. The small nodes act as an emergent neural network model, passing markers and changing their weights. They form larger coalitions that carry out higher-level cognitive functions: concepts and so on. AMBR has no central processor that organizes the whole system; instead, the system is organized by the collective work of all the nodes. Although it might sound strange, I would say that in this way AMBR approaches the Buddhist concept of Anatman, i.e. the lack of an inner organizing principle of the mind, realized through millennia of research by Buddhist practitioners. Buddhism says that there is no such thing as “I”, but just “Skandhas”, i.e. gatherings (coalitions) of informational nodes in the mind. Maybe contemporary cognitive science is thus approaching the ancient wisdom of deep explorations of the mind.

The memory in AMBR is not static, but flexible, dynamic and context sensitive.

The AMBR model integrates all the cognitive properties and processes in one general cognitive architecture. The work of the model is dynamic, context sensitive and emergent.

Important facts about AMBR:

The model has more realistic working-memory requirements since not all possible hypotheses are constructed, but only those that seem plausible and relevant to the current context. Thus a hypothesis is constructed only when (and if) at least one agent finds a justification for it. The justification might be on the grounds of either semantic similarity or structural consistency.

Mapping and memory processes run in parallel and thus can interact.
The hypotheses are constructed dynamically. As different agents run at different speeds, some agents (the more relevant ones) establish their hypotheses earlier than others. This head start helps the early hypotheses gain activation.
The constraint satisfaction network is constructed as part of the overall network of agents in the system. The activation can thus pass back and forth between the hypotheses and the representations of concepts and episodes. This allows for an interaction between memory and mapping tailored to the particular context.
The semantic similarity is computed dynamically and is context dependent. The computations are done by a marker-passing process and the markers are guided, restricted, speeded up or slowed down depending on the activation level of the agents which are processing the markers, i.e., depending on the particular context.
The structure-correspondence process is not limited by the n-ary restriction which was characteristic for all other models at that time (see Hummel & Holyoak, 1997; Holyoak & Hummel, this volume). Once the semantic similarity between two relations has been detected, AMBR1 can map them even if they do not have the same number of arguments. This is because the marker passing mechanism disambiguates the correspondence between arguments of the two propositions. The disambiguation is based on the semantics of the arguments which is represented in the network of agents. LISA (Hummel & Holyoak)
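
The idea of context-dependent marker passing from the list above can be sketched as follows. Each of two agents sends a marker up a chain of "is-a" links, inactive agents do not pass markers on, and a shared ancestor where the markers meet justifies a hypothesis that the two agents correspond. The single-parent hierarchy, the threshold and the example concepts are simplifying assumptions of mine, not AMBR's actual mechanism.

```python
def marker_path(agent, is_a, activation, threshold=0.2):
    """Agents reached by a marker climbing is-a links through active agents only."""
    path = [agent]
    while agent in is_a:
        agent = is_a[agent]
        if activation.get(agent, 0.0) < threshold:
            break  # an inactive agent does not process the marker
        path.append(agent)
    return path


def meeting_point(a, b, is_a, activation, threshold=0.2):
    """First agent where the markers of a and b meet, or None if they never do."""
    seen = set(marker_path(a, is_a, activation, threshold))
    for agent in marker_path(b, is_a, activation, threshold):
        if agent in seen:
            return agent
    return None


is_a = {"teapot": "container", "cup": "container", "container": "artifact"}
kitchen_context = {"container": 0.9, "artifact": 0.4}
other_context = {"container": 0.0, "artifact": 0.0}

print(meeting_point("teapot", "cup", is_a, kitchen_context))
print(meeting_point("teapot", "cup", is_a, other_context))
```

The same pair of agents is judged similar in one context and unrelated in another, which is the sense in which the computed similarity depends on the current activation pattern.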

“The episodes are represented in a distributed and decentralized way. They are represented by rather big coalitions which do not have leaders, i.e., none of the members of the coalition has a list (even partial) of its members. There is a special member of the coalition which “stands for” the particular time and place location (it may be considered as a simple unique tag rather than a vector in some abstract space) and all members of the coalition point to it. This is the only way in which one can recognize that all these agents represent aspects of the same event. However, there are no pointers coming out of this special agent, i.e., it does not list any of the coalition members.”
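
The representation described in this quotation can be pictured as a data structure in which every member of a coalition points to a tag agent, while the tag agent holds no list of members. The class and field names below are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Agent:
    name: str
    situation: Optional[str] = None  # pointer to the tag agent; the tag itself has none


def coalition(agents, tag):
    """Recover an episode by scanning all agents for pointers to the tag."""
    return [a.name for a in agents if a.situation == tag]


memory = [
    Agent("sit-kitchen-1"),                      # the tag agent: it lists no members
    Agent("teapot-1", situation="sit-kitchen-1"),
    Agent("water-1", situation="sit-kitchen-1"),
    Agent("stone-7", situation="sit-forest-3"),
]
print(coalition(memory, "sit-kitchen-1"))
```

Since the tag stores nothing about its members, the episode can only be reassembled from the members' side, so what is recalled depends on which agents happen to be available at that moment.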

AMBR tries to model real human memory, which is decentralized and distributed.

Orlin Baev