Bay Area Artificial Intelligence Meetup Group Message Board › The oxymoron of defining words with words and its relevance to AI.

The oxymoron of defining words with words and its relevance to AI.

Lex Ricketts
Posted Aug 2, 2010 12:50 AM
user 11281101
Elk Grove, CA
Post #: 32
Meaning is definition. But more than that, our commonality of experience allows communication to be understood. Finding commonality of the words is what we do when we learn other languages. Our understanding of words is applied to the sound formations of the other language that represent similar experiences. However, without the commonality of experience there would be no words or language. I could be wrong about this, but I don’t understand how an expectation of intelligence can develop from data mining or similar word filtration techniques. Admittedly, the developers of these techniques have great talent and are very intelligent, but that does not imply that the outcome of these methods will somehow convey this intelligence to the machine. In these data structures, there is no meaning that the system can associate with these words. The phrase “I have a sore toe.” could be understood by a system if the system had something it thought of as a toe and it was elevated to the condition of soreness.
Lex Ricketts
Posted Aug 2, 2010 8:57 PM
user 11281101
Elk Grove, CA
Post #: 33
My complaint is that these techniques are seemingly a response to the requirements of AI but require a focus that draws needed attention away from how intelligence actually works. I see other valuable purposes served by the development of these methods but I don’t see the development of furthered understanding regarding intelligence taking place. Are we as a group avoiding this topic? I dare say that such consideration of intelligence represents a paradigm shift that few in this group have shown interest in. Perhaps I misunderstand the purpose behind the group’s intentions. Is this some sort of caste system where those who are qualified are allowed an opinion and all others are expected to exist quietly in awe with an occasional “Gee, you’re wonderful!”?
A former member
Posted Aug 2, 2010 9:49 PM
Post #: 144
Lex,

The context of the subject of your argument needs to be parsed more clearly.

I think I understand your position – that a dictionary, no matter how robust, well organized, and cross referenced, is meaningless without someone (or something) who can both read it AND associate the coded references-as-definitions to memories that ground the entries to actual physical sensation.

By "oxymoron" I believe you mean circular. I think you are saying that two machines sharing access to a dictionary or word association graph will never "understand" the meaning implied by a definition because the parsing algorithms that swim such graphs are necessarily circular (processing will go on forever without arriving at "meaning").
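
A minimal sketch of the circularity described above, assuming a hypothetical toy dictionary in which every word is defined only by other words. The entries and the function name are illustrative assumptions, not taken from any real lexicon or word graph system.

```python
# Hypothetical toy dictionary: each word is "defined" only by other words.
TOY_DICTIONARY = {
    "sore": ["painful"],
    "painful": ["hurting"],
    "hurting": ["sore"],
    "toe": ["digit", "foot"],
    "digit": ["toe"],
    "foot": ["toe"],
}


def follow_definitions(word, dictionary):
    """Follow the first defining word repeatedly.

    Returns the path walked and a flag saying whether the walk
    came back to a word it had already visited.
    """
    path = [word]
    seen = {word}
    while True:
        definition = dictionary.get(path[-1])
        if not definition:
            # The word has no entry at all: the walk stops, but nothing
            # has been grounded in anything outside the dictionary.
            return path, False
        next_word = definition[0]
        path.append(next_word)
        if next_word in seen:
            return path, True
        seen.add(next_word)


path, is_circular = follow_definitions("sore", TOY_DICTIONARY)
print(" -> ".join(path), "| circular:", is_circular)
```

The walk never leaves the dictionary: it either loops or runs out of entries, which is the sense in which a definition graph alone does not arrive at "meaning".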

There are two ways of approaching this issue. One is to consider the implied use of such automatically processed word graphs. To some developers, the use is as a tool for human thinking and learning. In this use, such a system could be useful as a better dictionary, as a way to inform and refine and personalize or localize or contextualize search bot results. The other use for word graphs is strictly machine on machine (never involving humans). I doubt you would argue that word graphs are useful or could be useful when their intended audience is human, so I assume that your argument is directed towards the viability of meaning processing within machines so long as those machines only have access to word graphs and not to actual sensory channels.

I assume your argument is not that the word graphs haven't the required robustness and granularity to store a memory. I assume that your argument accepts as possible, word graphs matching if not exceeding the fidelity of human memory. If so, your argument is hinged on the capacity of the receiving system to have had access to the actual physical sensations to which the word graphs refer. Your argument, plainly, is that a system that does not have the physical sensory organs that were necessary for the acquisition of the experience codified within a word graph is a system that will never be able to derive meaning from that class of word graph.

Let's accept your argument as logically plausible for now and dissect it further. Does it mean that the mechanism of formalizing and storing a memory is dependent upon proximity to the sensory channels as source? If this is true, it implies the topological argument that the physical graph that is the discrete in-vivo network of input signal channels from sensors to memory graph must be present for any memory to be stored in a way that can be later retrieved and reconstructed. It means that the unpacking of a memory graph requires, in a real physical sense, a cypher that is the physical analog of the mechanism from which the memory was first produced. Logically, topologically, it would be difficult if not impossible to argue against this. But this is an entirely different argument than whether the memory, once written, can be transposed into many different forms without losing information. However, the more one abstracts a memory's information from the original sensory machinery that acquired it, the more one is also required to translate the structure of that acquisition mechanism into the selfsame abstraction scheme.

Let's imagine a being, a device, with sensors and the mechanism to write the information stream from those sensors into memory. Now we will assume an information string that holds, in linear binary form, both the original sensory map (memory) and the full description of the physical mechanism that acquired it (the being). One would presumably then have all of the information necessary to build a translation mechanism from which to unpack the memory and the embodiment to the same fidelity as that system which originally acquired it.

As per the laws of information science, this same binary bit stream (memory) can be rewritten (translated) into an infinite number of self-consistent formalisms without losing any information at all. Salient to this discussion is the degree to which the unpacking or comprehension of the original memory is dependent upon the accurate rebuilding of a processing mechanism in the image of the original sensory mechanism from which the memory was derived.

It is clear from this analysis that the full information implied by the data stored as memory of a sensory stream would necessarily include the information required to rebuild on the receiving end an analog of the sensory mechanism.

A system that stores memories of its own experiences (an embodied mind) takes maximal advantage of the fact that it does not have to store the information necessary to reproduce the sensory network from which the sensory stream was acquired – it is that network! In this case, an embodied memory need only store that minimal amount of information necessary to re-excite the original sensory map from which it came. The information necessary to encode the full sensory map (the body) is stored in vivo as the body itself and does not therefore need to be stored in "memory".

The word graphs so often discussed by those of us working in AI, derived exclusively from an algorithmic crawling of a corpus of written documents, are an extreme example of memory strings abstracted from the information necessary to rebuild the cypher (the body) necessary for their processing into meaning.

But that isn't in itself a logical argument against the validity of the information held within those word graphs.

All it says is that memory formalisms are dependent upon the fidelity by which they can be unpacked towards a mechanism such as that which originally built the memories.

-- part 1 of 2 --

Randall

Lex Ricketts
Posted Aug 3, 2010 2:52 AM
user 11281101
Elk Grove, CA
Post #: 34
Hi Randall,
You often demonstrate something that is rarely found. You actually consider arguments with which you might disagree. And when you do this, it’s more than just to dictate a dismissal; you consistently give an in-depth, considered opinion. Thank you!
I used the term oxymoron because solely using words to describe words, without meaningful backup experiences seems incongruous. However, you may be right to point out that this is a circular arrangement. One set of words equal another set but none of this results in meaning. I was looking up some AI crap the other day and came across the concept of a homunculus. A homunculus is the equivalent of a ‘little person’ that provides the intuition and thought process that are often left out of AI type ideas. We might have some terrific idea to describe some aspect of intelligence but it requires a viewer to interpret the results of whatever function it represents. These concepts are described as toward AI but they cannot duplicate AI. This is true because they are trying to replace what they also need to function, intelligence. In these ideas I do not see any attempt to describe what intelligence is. What amazes me is that we have some quite snobbish people with degrees in AI when there is no understanding of what AI is.
Randall you are right to point out the importance of understanding these techniques as tools. However because they are tools they require a homunculus and therefore are not a description of intelligence. Also, if I am properly understanding this, word graphs would have the same problem if they require interpretation. Unless there is some part of this graphing system that can operate intelligently, it is still the viewer that adds the meaning. I didn’t hear anything about a sub-system of that nature in the last meet-up.
As part of the animal world we definitely have and share empathy with others. What I mean by meaning does not come from just one instance of an experience. It comes from everything related to whatever focus concerns you. It is the similarities of our previous situations that allow empathy. How could we understand otherwise what someone else is going through? We have been there! I am not saying that we can’t derive meaning from these word schemas but they will require a homunculus.
A former member
Posted Aug 3, 2010 2:53 PM
Post #: 145
-- part 2 of 2 --

There is a tremendous efficiency afforded systems that maintain a consistent sensory arrangement through time, and who communicate with other systems that are similarly structured. We are such systems. The tight compression afforded our verbal and written communication languages is possible only because the great majority of what we communicate is shared by all participants. In information as in thermodynamics, there is no free lunch. What isn't carried by the message must be carried by the cypher. In the communication between self-similar systems, the lion's share of this burden is borne by the cypher.

In homogeneous communications, the message/cypher balance tips radically towards the cypher. The opposite is true in heterogeneous communication. When two radically different systems communicate, the message channel must carry both message and the subset of the cypher that isn't a part of the receiving system - the channel must carry both message and the directions for the construction of a simulation of the sensory and memory network in the system sending the message.

Imagine the complexity and size of the communication implied in the compact directive "Have dinner ready for me after work." should the receiving party not be a thing that eats or works.

What interests me with regard to these issues is the latency handling mechanisms required to interpret any message. It is unlikely that the full contents of memory (even the memory stored in synaptic neural nets) is active and accessible at all times. Some sort of retrieval and localization must be happening, a process of collecting those memories that are situationally salient (and hiding or suppressing all others). How does the size and structure of this local cache impact the processing of meaning? Is meaning processing dependent upon some method (such as a cache) of focusing attention upon some memories and not others? If so, how does this requirement, this collect and activate process, not create yet more abstraction between code and meaning?

Humans can and do understand the meaning of the word "pain" at moments absolutely devoid of the actual sensation of pain. At these moments, the human thinking apparatus must derive meaning strictly from a storage analog of the word graphs you find so problematic in AI systems.

In fact, when reading or watching a film, humans can connect with many meanings that they have had absolutely no physical access to. I wince when I watch someone die (though I have not yet died). I tear up when I read about a woman raped (though I am neither woman nor have I been raped). I have even shared the joy of victory with Muhammad Ali as he won title fights (even though I have never boxed).

Will a machine whose access to the past is restricted, as is ours, to memory be any less able to understand reality?

I wonder.

Ultimately though, I do believe that any argument based upon the notion that meaning is impossible in the absence of body is logically incomplete. Such arguments are based on the assumption that "body" is different than "information". The good folks who brought us information science and the logic of limits (Shannon, Boltzmann, Gödel, Turing, etc.) have long ago proved that structure and data are equivalent. Therefore, like it or not, body is message and message is body. It is not therefore possible to build a disembodied message, or a body that isn't a message. What you can talk to and do math upon is the hierarchy of influence that structures the causal relationship between body and message, between message and cypher.

No word graph is meaningful in and of itself, it demands a cypher, a processing method that knows how to unpack the relationships implied by the connections between the nodes as words. This cypher is integral to the word graph. There is no talking of one without talking of the other. There is no measuring the complexity or size of one without measuring and adding the other.

When I was a kid, my father, a public school teacher, took a moonlighting job teaching math to inmates at a state prison. He told me that the inmates had assigned numbers to jokes. Someone would yell out "33" or "12" and laughter would echo down the hallways. A good example of the advantage of adding girth to cyphers such that messages can be compressed.
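
The numbered-jokes story can be sketched as a shared codebook. This is only an illustration under stated assumptions: `JOKE_CODEBOOK`, `encode`, and `decode` are hypothetical names, and the joke texts are placeholders.

```python
# Hypothetical shared codebook (the "cypher"): sender and receiver both hold a copy.
JOKE_CODEBOOK = {
    12: "Placeholder text standing in for the full telling of joke twelve.",
    33: "Placeholder text standing in for the full telling of joke thirty-three.",
}


def encode(joke_text, codebook):
    """Send only the joke's number when the codebook already holds the joke."""
    for number, text in codebook.items():
        if text == joke_text:
            return str(number)
    return joke_text  # no shared entry: the whole joke must travel in the message


def decode(message, codebook):
    """Unpack a number back into the joke, if the receiver has the codebook."""
    if message.isdigit() and int(message) in codebook:
        return codebook[int(message)]
    return message  # without the cypher, "33" stays just "33"


joke = JOKE_CODEBOOK[33]
message = encode(joke, JOKE_CODEBOOK)
cypher_size = sum(len(text) for text in JOKE_CODEBOOK.values())

print("message length:", len(message))
print("full joke length:", len(joke))
print("codebook (cypher) length:", cypher_size)
print("receiver without codebook gets:", decode(message, {}))
```

The message shrinks to two characters only because the codebook carries the rest; a receiver with an empty codebook gets the number back and nothing more.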

What is oxymoronic, what is circular, is any examination of the relationship between data and meaning that doesn't simultaneously talk to both sides of the message/cypher set.

Randall
Lex Ricketts
Posted Aug 4, 2010 12:42 AM
user 11281101
Elk Grove, CA
Post #: 35
Randall,
You bring up some excellent points.

“How does the size and structure of this local cache impact the processing of meaning?”

The “size and structure” of available memory would seem to me to be the limiting factor regarding complexity within the animal world. Memory ability, or whatever comparable biological system serves this function, equates to the complexity of what we can decide. From memory ability we are able to conceptualize. Smaller memory ability and word size equates to simpler concepts per unit of time. Conceptualizations are first recalled and then operated upon, regardless of the complexity of the organism doing the conceptualization. Fortunately, the interesting thing is that the decision-making ability of complex creatures doesn’t seem any more complex than that of the simplest creatures. Basically all decisions come down to < > or = in some form or another. While considering what you’re reading here, you are thinking “Is this idea <> or = mine?” Or perhaps you may not have evolved thoughts regarding this and are comparing it to something you’ve heard, but the decision is still “Is this idea <> or = that?” To expand further, from my perspective, involves relevant data retrieval systems and your next question.

“Is meaning processing dependent upon some method (such as a cache) of focusing attention upon some memories and not others?”

This is where that homeostasis and needs list comes into play. It seems to me that our experiences are set up and filtered as they happen with regard to what our needs are. The meaning we ascribe to some “something” is dependent on what our homeostasis was at the time of the experience and in what way that “something” fulfilled whatever need we had. This is the mechanism we use to gauge our survival potential. To understand the potential of this and the effect it has, you must keep in mind the dynamics involved. Homeostasis is dependent on many needs structures being satisfied or at least kept within the range of normal, at any given time. As I have mentioned before, when one need drifts away from normal it receives our main attention. However, all other needs are still being attended to but to a lesser degree. Opportunities to satisfy less urgent needs arise during the time that the current needs priority is being satisfied. This information is at the kernel of how our behavior becomes so unpredictable and unique to ourselves. These are the seeds we coincidentally gather, which we can apply to resolve problems of future need requests. Because the beginnings of these conceptual structures happen at the beginnings of life, very simple concepts are seen from many perspectives. With each shift of perspective these simple concepts grow more familiar and predictable. Also, at this time, without having learned behaviors to draw upon, we are infused with reactive alarm system type behaviors. The more complex our concepts become, the less recognizable their connection to these basic needs structures is and the less necessary the alarm system type behaviors are.
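
A rough sketch of the needs-priority idea, assuming each need has a level and a "normal" range, that the need furthest outside its range gets the main attention, and that all other needs keep a small background share. The need names, numbers, and the `background` parameter are hypothetical choices made for illustration.

```python
# Hypothetical needs, each with a current level and an acceptable "normal" range.
NEEDS = {
    "hunger": {"level": 0.9, "normal": (0.0, 0.5)},
    "thirst": {"level": 0.6, "normal": (0.0, 0.5)},
    "warmth": {"level": 0.4, "normal": (0.3, 0.7)},
}


def deviation(need):
    """Distance of a need's level from its normal range (0.0 when inside it)."""
    low, high = need["normal"]
    level = need["level"]
    if level < low:
        return low - level
    if level > high:
        return level - high
    return 0.0


def attention_shares(needs, background=0.1):
    """Give every need a small background share; the most deviant need gets the rest."""
    deviations = {name: deviation(need) for name, need in needs.items()}
    focus = max(deviations, key=deviations.get)
    shares = {name: background for name in needs}
    shares[focus] = 1.0 - background * (len(needs) - 1)
    return focus, shares


focus, shares = attention_shares(NEEDS)
print("main attention:", focus)
print("attention shares:", shares)
```

Here hunger is furthest from normal and takes the main share, while thirst and warmth are still attended to at a lower level.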
A former member
Posted Aug 4, 2010 10:33 AM
Post #: 146
Yes Lex, a being obviously needs a method of prioritizing multiple parallel functions. But my focus in this discussion is on the issue as you defined it in your first post. I start with the premise that all thinking systems are at base equivalent. Turing Machines are Turing Machines. Computers are Computers. Brains are Brains. Turing Machines are Brains, etc.

Accepting the fact of computational equivalence allows one to compare methods that might otherwise seem categorically disparate. So, from this standpoint, I am looking at information processing in total. From this thousand foot vantage, one can see that "memory" and "structure" are equivalent and transmutable. The greatest achievement in the study of information is the principle of the conservation of information. This is as true, and for the same reasons, as its thermodynamic twin, the conservation of energy. Germane to the topic of intelligence, the information in any system can be restructured in many ways without violating the conservation of information. Achieving this balance in a message/cypher or data/processing system means that the reduction of one side of the balance necessitates a growth in the other side. You are free to compress your message or data, but only if you conserve the total information of the system by adding to the complexity of the cypher or processing side. If the balance isn't maintained, information is lost.

You contend that data can not be unpacked or comprehended by a system unless that system has an equivalent sensory system as the one that originally collected the data.

Applying the principle of the conservation of information we are forced to recast your argument. By this principle, the sensory system you posit as essential is, like any system, just a set of information. As such, that system can be rebuilt or restated (without loss) in any self-consistent language or structural grammar. This is not to say that the translation is simple, just that it is possible.

So, your original thesis needs to be broken down into two very different arguments. The first is that a system can only fully comprehend data if it shares the sensory set of the system that collected the data in the first place. The second is that data, all data, any data is incomplete without a cypher or processing scheme that can unpack that data. Information science has given us the answers to each of these questions. By these laws, your first contention is false, and your second contention is true.

What this really means is that from the point of view of a thinking system (a system that processes information), there is no difference between a robust simulation and the "reality" to which it refers. Which ultimately means that any sensory system can be simulated Turing style as a processing scheme that perfectly apes the effect of the original system on a thinking system.

Randall
A former member
Posted Aug 4, 2010 12:31 PM
Post #: 147
What piques my interest is the relationship between data, the structure necessary for the retention of data, and the complexity necessary to process data stored in any particular scheme.

When we say "data string" we expect a single file line of data units. But what does it take to ensure that these data units (ultimately ones and zeros) stay in line and retain their original order? How does this most basic of signal processing necessities impact the data that can be expressed in any given structural scheme? One could, in the context of this discussion, define a sensory system as a type of data structure of great latency. Other parts of a biological system's information processing are much less stable. The brain seems to facilitate the creation and manipulation of information structures that have much shorter life spans (moments or days).

The total information contained in a system is the sum of three measures: 1. the information as messages or data in memory, 2. the information (structure) required to keep that information stable, and 3. the information needed to build the system that processes the data in messages and memory.

What can be known about a system through a measure of the ratio of these information components? What is the relative cost of stability, plasticity, and processing? What kinds of data can't be stored easily or efficiently in short term structures vs. structures of great stability? What is the overhead involved when one needs to transfer data from short term storage to long term storage? What parts of a system are most effectively stored in long term structures and which types of data are hindered by stability? Is there a metric that can be used to measure the ratio of data type to data structure? What, with respect to this topic, is the qualitative difference between the data acquired by sensory organs (touch, smell, taste, hearing and sight), and data acquired from neural nets (memory)?

Randall
Lex Ricketts
Posted Aug 4, 2010 5:58 PM
user 11281101
Elk Grove, CA
Post #: 36
Randall,
You said, “You contend that data can not be unpacked or comprehended by a system unless that system has an equivalent sensory system as the one that originally collected the data.” You’re right, I do.

Obviously, animals can’t transfer information without some form of symbolism. The importance of this is that these symbolic systems are the animal world equivalent of a com wire. I think you pointed out that words act as information packets. I think we agree that as such they only suggest the processes that pre-exist them. A machine, however, can have direct communication with other machines. So we could simply clone all pertinent systems and have some form of bodiless intelligence. But to the clone the world would no longer make sense. All of its reference points would no longer exist in reality. None of the information of how to satisfy the simplest survival requirement would apply. I doubt that it would lose consciousness, if it had it, but consciousness would only add to the problem. The machine would no longer have purposeful reason. I would (do) agree that you could create a machine with an intelligent core and a simulated body, but that is the long way to go to get around the block. The problem becomes congruency. Supposedly we would be simulating input from sensors of the real world. To us animals the items of the real world are evolved, they are contingent and they are constant. There is a continuum that exists that allows the world to have a sense of predictability. I think that it’s not impossible to accomplish such a simulation, just easier to give it a body.

What can be known about a system through a measure of the ratio of these information components? What is the relative cost of stability, plasticity, and processing? What kinds of data can't be stored easily or efficiently in short term structures vs. structures of great stability? What is the overhead involved when one needs to transfer data from short term storage to long term storage? What parts of a system are most effectively stored in long term structures and which types of data are hindered by stability?

I believe biological systems that have the capability to manipulate their environment demonstrate quite well differing ratios of these components. Natural selection has seen to this. Among other things, machine behavior will be dependent upon variations of these ratios. Regarding structure, as I have discussed earlier, I believe in an intensive input structure. A short-term buffer to store contiguous input from as complex an input structure as possible will determine what clarity will be available to the thing for upper level processes. There should be enough short-term memory to set up what I would consider to be a survival cycle. That is to say, enough to store data from an information-gathering period (in animals a consciousness period) and an information-sorting period (in animals sleep). It may not be necessary to duplicate the animal sleep cycle but I suspect creativity will require all subsystems to contribute to this function (this is another topic). Perhaps the sorting period may be carried out simultaneously in some form or another; experimentation will tell. As data arrives, it is compared to the limits for that sensor set to check for normality. This will tell it how the world feels. Also, somewhere at this level internal needs requirements need to be monitored. After that, depending on internal needs, higher level processes take place. Some form of code that duplicates the following behaviorist type action will be required: “Successful behaviors need to be repeated.” Successful behaviors are in reference to internal needs requirements or homeostasis. This is the data we use to conceptualize the world. After this so-called “survival cycle”, prioritized data from this cycle is transferred to long term storage. Long term storage allows for what I consider to be thought. Thought is a data retrieval structure not much more complex than a sort. Thinking involves retrieval of data relevant to whatever need triggered the process. These retrieved concepts are then decided upon. More needs to be said but I am trying to keep this brief and it may not be working.



“Is there a metric that can be used to measure the ratio of data type to data structure?”

If I understand your question, the ratio involved would determine what kind of behavior was expected from the machine. The metric would be the kinds of needs it has. By adjusting the need references I would expect that any number of requirements could be met. Experimentation is the only option that I am aware of that could determine this at this point. Indeed this is where the availability of our control over these machines will exist.
A former member
Posted Aug 5, 2010 12:34 PM
Post #: 148
Lex, we are having two separate conversations. I am talking to universal parameters of information, communication, and computation. You seem to be stuck at the – biology has body, machine isn't biology, so machine can't have body – anthropomorphic road block. I am interested in universal definitions of these words and concepts. How can we define "body" such that it is independent of biology or even of things that have 'limbs' or other sensory appendages? How can we define intelligence such that it is a continuum instead of a limit bounded island? How can we logically relate data, communication, sensation, memory, message, processing, and computation such that we can disambiguate the all-too-familiar and all-too-opaque term "intelligence"?

A body is a physical structure. Physical structures can be translated losslessly into information that can be processed such that the input/output is identical. This is a fact of nature (not my opinion). It doesn't mean it is easy to do so or that it is efficient to do so or that it would ever be practical to do so, but it is possible to do so. Also, if you understand the equivalence principle that relates energy and information or energy and structure, you are forced to acknowledge (whether or not it is comfortable to do so) that it is impossible to build a system that doesn't have a body.

And that should end the philosophical debate regarding embodiment.

I think you are simply advocating for a richly interrelated memory graph in which data units are related contextually by many attributes. But as I have explained, this can be achieved many ways. The weight of the memory graph richness can be hefted largely by the graph itself or it can be hefted largely by the processing scheme, the cypher, that swims the graph. In the case of the corpus based word graphs being pursued by many people now working in AI, the emphasis is on building the most sparse possible graph and still maintaining as rich a connection graph as one can pull from the corpus. Such a graph, it is argued, will allow fast and flexible parsing by many kinds of processing schemes. The weight of semantic interpretation in this case has been shifted heavily towards processing, towards the cypher.

Components of such word graph schemes assume that situationally derived information (the information that ends up in a corpus) can be reformatted such that processing can be performed far more efficiently than is possible simply by searching the linear string of text that is the whole corpus. Both schemes are ultimately derived from string searches, but the word graph scheme is efficient because it performs the linear (and very very expensive) search and index first, allowing all subsequent processing to happen later and much faster and more cheaply.
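
The "index first, query cheaply later" trade-off can be illustrated with a plain inverted index. This is a simpler structure than a word graph and is swapped in here only to show the cost shift; the three-sentence corpus and the function names are assumptions.

```python
from collections import defaultdict

# Hypothetical three-sentence corpus standing in for a large body of text.
CORPUS = [
    "the horse can trot and gallop",
    "the snake can slither",
    "the horse can roll in the grass",
]


def linear_search(corpus, word):
    """Scan every sentence on every query (cost grows with corpus size)."""
    return [i for i, sentence in enumerate(corpus) if word in sentence.split()]


def build_index(corpus):
    """Pay the full scan once, recording which sentences contain each word."""
    index = defaultdict(set)
    for i, sentence in enumerate(corpus):
        for word in sentence.split():
            index[word].add(i)
    return index


def indexed_search(index, word):
    """Answer a query with a single lookup instead of a scan."""
    return sorted(index.get(word, set()))


INDEX = build_index(CORPUS)
print(linear_search(CORPUS, "horse"), indexed_search(INDEX, "horse"))
```

Both searches return the same sentence numbers; the difference is that the indexed version did its expensive pass once, up front.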

Such word graphs are usually assumed to be lossy (one could not reconstruct the corpus from the word graph no matter how much time or processing power one had at their disposal).

Nonetheless, much information is retained. And word graph advocates argue that there is evidence that our brains work the same way, storing the lion's share of the information that is memory in the connections between nodes. The assumption that drives this work is that the existence of such graphs will spur the development of processing schemes that can derive concept level meaning (intelligence) and that these graph crawlers will be far simpler and more efficient than the algorithmic alternatives (based upon models of intelligence).

The great promise of graphs is that they naturally fall towards hierarchies and that these hierarchies map to what we would call the gradient between general and specific. A good graph linearly orders terms like skip and trot and gallop and roll and slither on one end and move or motion on the other. There are lots of specific terms on the outside of any word graph and just a few words at the center. The idea is that this specificity hierarchy is either enough (meaning can be derived from the shape of the graph itself) or barring that possibility, models can be built of a small number of base concepts (words at center of graph) such that specific meaning can be derived from graph modifications from these base models.
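
A toy version of the general-to-specific gradient, assuming a hand-built fragment in which each specific term points to a more general one. Corpus-derived graphs would not be built by hand like this; `MORE_GENERAL` and both function names are hypothetical.

```python
# Hypothetical hand-built fragment: each specific term points to a more general one.
MORE_GENERAL = {
    "skip": "move",
    "trot": "move",
    "gallop": "move",
    "roll": "move",
    "slither": "move",
    "move": None,  # a central, general term with nothing above it
}


def specificity(word, graph):
    """Count the steps from a word up to a central (most general) term."""
    depth = 0
    while graph[word] is not None:
        word = graph[word]
        depth += 1
    return depth


def central_terms(graph):
    """Words at the center of the graph: those with no more general term."""
    return [word for word, parent in graph.items() if parent is None]


print("central terms:", central_terms(MORE_GENERAL))
print("specificity of 'gallop':", specificity("gallop", MORE_GENERAL))
```

In this fragment there are many specific terms on the outside and a single word at the center, which is the shape the paragraph above describes.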

Lex, if after internalizing these concepts, you can still build an argument that says word graphs can't be built such that processing schemes can be built such that meaning can be derived, I would be interested in hearing it.

By the way, proponents of corpus sourced word graphs contend that they will (at least theoretically) be able to derive more accurate word use "understanding" than is possible with dictionary-like word definition data bases. And importantly, this isn't an either-or scenario. Word graph systems can include dictionaries and encyclopedias in their source corpus. And, as mentioned above in the case of base concepts, word graphing systems can refer to definitional references as needed.

Importantly, word graph proponents defend their methods as more pure and honest because no humans are tasked with the responsibility of interpretation, a process fraught with ambiguity, bias, and inaccuracy. Done right, word graphs represent the empirical measure of actual language use across a broad situational spectrum.

Randall
