Posted by: bgoertzel on: 2008-10-22
Joel Pitt has done some experiments testing first-order PLN inference in OpenCog, on some very simple data.
These experiments don’t use the indefinite probability formulas but rather the good old-fashioned SimpleTruthValue PLN formulas.
What they involve is using PLN to extrapolate indirect word associations from direct word associations mined from text (by some statistical text mining software created for OpenCog by Linas Vepstas).
This obviously does not stress the generality of PLN as an inference framework (no VariableNodes! no quantifiers! no intension! no fuzzy MemberLinks!). There is nothing particularly revolutionary AI-wise here … it’s just some fairly straightforward, state-of-the-art statistical NLP … Hebbian learning on a neural net, among many other techniques, could do basically the same thing … but this is a reasonable “smoke test” of the ability to load a bunch of nodes and links into OpenCog and perform some basic inference processes on them. One nice point about PLN is that it can handle relatively simple, associative-neural-netty stuff like this, as well as more complex reasoning involving variables and quantifiers and such, all seamlessly within the same mathematical, conceptual and software approach.
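To give a feel for what a single first-order inference step looks like, here is a minimal Python sketch of the independence-based deduction strength formula as it is usually stated for PLN. It is only an illustration: the function name and the numbers are made up, confidence values are ignored, and this is not the OpenCog implementation.

```python
def deduction_strength(s_ab, s_bc, s_b, s_c):
    """Estimate the strength of A->C from A->B and B->C.

    s_ab, s_bc: strengths of the two premise links.
    s_b, s_c:   strengths (rough term probabilities) of nodes B and C.
    Assumes the independence-based deduction rule; confidence is ignored.
    """
    if s_b >= 1.0:
        # B covers everything, so A->C behaves just like B->C.
        return s_bc
    s_ac = s_ab * s_bc + (1.0 - s_ab) * (s_c - s_b * s_bc) / (1.0 - s_b)
    # Clamp, since inconsistent inputs can push the estimate outside [0, 1].
    return min(1.0, max(0.0, s_ac))


if __name__ == "__main__":
    # Hypothetical strengths for three associated words A, B and C.
    print(deduction_strength(s_ab=0.8, s_bc=0.6, s_b=0.2, s_c=0.3))
```

Chaining two such steps (A to B to C, then A to C to D) is the sort of thing needed to build an indirect association that is two hops away from the mined data.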
The reason I decided to write a blog post on this is that Joel produced some nifty pictures based on his work, using the open-source graph visualization package Tulip.
Here is a big nasty network of nodes and links in OpenCog, before inference:
Here is the same network, after some first-order PLN inference, with the inferred links in green:
Obviously the above don’t tell you too much. Tulip was configured so that nodes representing more similar words (in terms of their statistical association) would generally be placed closer together in the visualization. Slightly more insight is given by zooming in, using Tulip, to see some of the nodes and links close up. Again, the links in green are the products of inference:
Note that in the immediately above example, to build the associative link between “foreign” and “administration” requires the system to make two inferences in sequence:
Posted by: ferrouswheel on: 2008-09-18
A few weeks back Ben announced he’d be running IRC tutorial sessions on OpenCogPrime. Last night was the second tutorial, and was on the topic of knowledge representation - introducing people to the basic concepts of the AtomSpace, such as Atoms, Nodes, and Links and how various types of each represent things in OCP. If you missed out, there are logs linked from the wiki for both sessions (and future sessions should also end up logged and available from the wiki).
Lastly, Ben also donned a wizard’s hat for the event:

Posted by: bgoertzel on: 2008-09-05
This summer OpenCog was chosen by Google to participate in the Google Summer of Code project: Google funded 11 students from around the world to work on OpenCog coding projects under the supervision of experienced mentors associated with the OpenCog project, and the associated OpenBiomind project.
Applying for GSoC was David Hart’s idea originally, and David and the Singularity Institute did a lot of the work needed to make it happen, so I need to extend very hearty thanks to both of them, as all in all it worked out wonderfully well.
There were plenty of ups and downs over the summer, but overall the GSoC projects went extremely well and a lot of fabulous work got done. Furthermore, a number of the projects are going to be continued during the fall and beyond, either via students continuing them as course or thesis projects, or via students continuing to work on them in their spare time … and in one case, via a student being funded to continue the project by a commercial organization interested in using their OpenCog work.
OpenCog is a large AI software project with hugely ambitious goals (you can’t get much more ambitious than “creating powerful AI at the human level and beyond”) and a lot of “moving parts” — and the most successful OpenCog GSoC projects seemed to be the ones that successfully split off “summer sized chunks” from the whole project, which were meaningful and important in themselves, and yet also formed part of the larger OpenCog endeavor … moving toward greater and greater general intelligence.
Should OpenCog be chosen to participate in GSoC next year, I believe the projects will take a quite different flavor because OpenCog will be more mature then: I would hope to see more 2009 OpenCog GSoC projects involving the integrated functionality of the OpenCog system. But this year OpenCog is young so the best approach was to have students work on various important pieces of the overall system, and that’s what happened, generally to quite good effect.
This page
http://opencog.org/wiki/GSoCProjects2008
contains brief summaries of the projects that were done, and links to Web pages, blogs and code repositories allowing you to dig in more detail into the work if you’re interested. Here I’ll just give an extremely high level summary.
Many of the projects were outstanding but perhaps the most dramatically successful (in my own personal view) was Filip Maric’s project (mentored by Predrag Janicic), which involved pioneering an entirely new approach to natural language parsing technology. The core parsing algorithm of the link parser, a popular open-source English parser (that is used within OpenCog’s RelEx language processing subsystem), was replaced with a novel parsing algorithm based on a Boolean satisfiability (SAT) solver: and the good news is, it actually works … getting the best parses of a sentence faster than the old, standard parsing algorithm; and, most importantly, providing excellent avenues for future integration of NL parsing with semantic analysis and other aspects of language-utilizing AI systems. This work was very successful but needs a couple more months of effort to be fully wrapped up, and Filip will continue working on it during September and October.
Cesar Maracondes, working with Joel Pitt, made a lot of progress on porting the code of the Probabilistic Logic Networks (PLN) probabilistic reasoning system from a proprietary codebase to the open-source OpenCog codebase, resolving numerous software design issues along the way. This work was very important as PLN is a key aspect of OpenCog’s long-term AI plans. Along the way Cesar helped with porting OpenCog to MacOS.
There were two extremely successful projects involving OpenBiomind, a sister project to OpenCog:
Two projects dealt with improvements to OpenCog’s probabilistic program learning system: Shuo Chen (working with Moshe Looks) experimented with ways of improving the internals of the MOSES algorithms; whereas Alesis Novik (working with Nil Geissweiller) implemented an initial version of the PLEASURE algorithm, an alternative to MOSES that shares some of the latter’s code infrastructure. Both these difficult research-coding projects yielded promising though preliminary results and will be continued into the fall.
And the list goes on and on: in this short post I can’t come close to doing justice to all that was done, but please see the above page and the links in it for more details!
Costa Ciprian worked with Boris Iordanov on designing and creating a distributed version of the HypergraphDB, a persistent store for OpenCog; and Rich Jones worked with David Hart on creating a distributed web crawler suitable for massively distributed text parsing using OpenCog’s RelEx language parser.
In a different direction, Kino High Coursey (working with Andre Senna) designed and implemented a very elegant approach for interfacing between OpenCog and online simulation worlds such as OpenSim, implementing a framework using LISP to execute OpenCog-originated actions in simulation worlds. There is (conceptual and code-level) work to be done integrating this with other OpenCog work that involves OpenCog control of agents in simulated worlds, but Kino has introduced some excellent code and ideas into the project that are sure to be of value as things unfold.
Junfei Guo (working with Ben Goertzel) attacked a problem deep in the heart of OpenCog: mapping OpenCog’s unique AtomTable hypergraph knowledge representation into the more standard graph format used by the standard open-source Boost Graph Library. This opened up some important new discussions regarding the extent to which various graph algorithms (applied to the graph derived from a hypergraph) can serve as heuristic approximations to less-tractable hypergraph algorithms.
Elizabeth Dawn Alpert (working with Luke Kaiser) investigated the problem of making the link parser (used within OpenCog’s RelEx language framework) better handle ungrammatical text as seen in chats, IM, Twitter and so forth. This proved a thorny issue and the most progress was made on the level of cleaning up ungrammatical formats of individual words.
All in all, we are very grateful to Google for creating the GSoC program and including us in it.
Thanks to Google, and most of all to the students and mentors involved.
Onward!
Ben G
Posted by: linasv on: 2008-08-17
I hack, heads-down, on link-grammar every now and then. Yesterday, I fixed another round of broken parse rules: making sure that sentences like “John is altogether amazingly quick.” “That one is marginally better” “I am done working” “I asked Jim a question” “I was told that crap, too” all parse correctly.
Solving these required adding new rules to the link-grammar dictionary: so, for example, adding the rule “O+ & {@MV+}” to the dictionary entry for “asked”, thus allowing the word “asked” to take a direct object (“…asked Jim…”) and an indirect object (“… a question”).
Patching dictionary entries by hand is tedious and time consuming. The long term goal is to get to the point where the system can learn new entries on its own. So I’ve been day-dreaming about how to do that, and some of the baby-steps in that direction.
The first step is to put all of the rules into a database where they can be easily modified, added-to, and deleted. Currently, the rules all live in a file, which can only be hand-edited.
A second step is to assign probabilities to the rules: often, multiple parses are possible, but not every rule is equally likely to lead to a correct parse. So it would be better to have a probability assigned to each rule, indicating how often it leads to a good parse. These probabilities are already needed for the GSoC project of adding a SAT solver to link-grammar. I’ve got a database of tens of millions of word pairs and link types, ready to go for this step.
A third step is to distinguish good parses from bad ones: this is the parse-ranking step: to judge parses for the likelihood of being good, or bad. There are superficial things one can do for parse ranking, and quite complex ones. I am hoping to someday-soon reinforce the parse-ranking scores with the results of word-sense disambiguation. Which brings us to the next point:
Step four: some grammar parsing rules are appropriate only for some senses of a word, but not for others. Link grammar already accounts for this at a rough scale: it has distinct rules for different parts of speech. These are indicated by tacking a single extra letter to the end of a word: walk.v (I walk.v) versus walk.n (I took.v a walk.n). This basic idea can be further refined, for finer divisions than just parts of speech: some word senses can only be used in certain ways, and not others. The technical problem is that 26 letters is not enough… already, link-grammar is using some 20 or so of these suffixes. So this mechanism needs to be expanded, somehow.
Step five, where the rubber hits the road: actually learning new rules. This is much more vague; my ideas are still swimming. There are two distinct problems: new words, and fixing rules for existing words. New words should not be *much* of a problem: try to find synonyms for a word, and assume the new word can be used like the synonym. Modifying existing rules is much much harder…
… so hard, in fact, that I’m not sure I’m ready to engage in that, just yet. Some steps can be taken in that direction, by looking at minimum-spanning-tree dependency grammars (MST grammars): that is, computing the mutual information of word pairs, and comparing them to the output of link-grammar. This should suggest where new links can be created. Then comes the question: what should the appropriate link type be, for this new link? For some words, perhaps synonyms can help, but other words are so unique, that they have no synonyms: the word “to be” is so central to the English language that trying to discover information about it is very difficult: it has no synonyms, even as it has many.
I’ve got some infrastructure set up to run this last experiment: I’ve got a fairly large collection of parsed sentences, from which I can build mutual information pairs, and compare these to link-grammar parses. In fact, I’ve already done so: I use these for parse ranking. (That is, if a link-grammar parse has a high mutual information content, then I assume it’s a good parse.)
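For the curious, here is a rough Python sketch of the idea: compute the (pointwise) mutual information of word pairs from a pile of observed pairs, then score a parse by summing the values for its links. The function names, the choice of log base 2 and the default score for unseen pairs are all assumptions made for illustration; this is not the code actually used with link-grammar.

```python
import math
from collections import Counter


def pair_mutual_information(pairs):
    """Return {(left, right): pointwise mutual information in bits}.

    `pairs` is a list of (left_word, right_word) tuples seen in parsed text.
    """
    total = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    mi = {}
    for (left, right), n in pair_counts.items():
        p_pair = n / total
        p_left = left_counts[left] / total
        p_right = right_counts[right] / total
        mi[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return mi


def parse_score(links, mi, unseen=0.0):
    """Score a parse as the sum of the mutual information of its word-pair links."""
    return sum(mi.get(link, unseen) for link in links)
```

Given two candidate parses of the same sentence, the one with the higher `parse_score` would be ranked first.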
I could turn this around: given a sentence that is parsing badly, I can look for high mutual-info parses. Perhaps broaden the coverage by comparing to parses of approximately synonymous words, perhaps reinforced by word-sense disambiguation. But this generally leads to a combinatorial explosion — which, I guess is expected. We know that general intelligence requires a lot of CPU. The trick is to have the patience to set up an experiment, run the experiment, wait for its results, and then do it again…
Enough for now. I’m off to work on “related words” — another part of the puzzle.
Posted by: bgoertzel on: 2008-08-10
I have decided to run a series of IRC sessions focused on collectively discussing the OpenCogPrime design, via working through the OpenCogPrime wikibook and discussing the ideas therein chapter-by-chapter.
Details are at http://opencog.org/wiki/OpenCogPrime:TutorialSessions
The sessions will be weekly and will start September 10 (I’ll be out of town the first week of Sep, and figure too many folks are vacationing in August).
As there are 17 book chapters, this means the overall tutorial will run till late January (given a couple weeks off for Xmas).
All who are interested are invited to attend: however, it is emphasized that the purpose of the tutorials is **focused discussion of OpenCogPrime**, rather than general discussion of AGI issues, discussion of other AGI theories and approaches, etc. If your opinion is that OpenCogPrime is not worth discussing so much, that’s fine — then you probably shouldn’t attend.
Logs of tutorial sessions will be posted on the OpenCog wiki site for those who are not able or interested to attend but are curious what went on.
As well as helping spread understanding of the OpenCogPrime design for a thinking machine, I believe this will also help me refine and improve the wikibook, and maybe even the design as well.
While the book focuses on conceptual issues, it will not be problematic if the discussion veers onto implementation and software design issues relevant to the book chapters under discussion … in fact this would be quite desirable.
Posted by: bgoertzel on: 2008-07-31
The purpose of this blog post is to announce the release of a wikibook outlining a design for a specific AGI system intended to be built on top of the OpenCog framework.
This system design is called OpenCogPrime, and is heavily based on the Novamente Cognition Engine design under development at Novamente LLC during 2001-2008.
The OpenCogPrime design is proposed along with the hypothesis that, if the design is fully implemented and various important details are further refined, it may be able to form the basis of an AGI system with intelligence at the human level and beyond.
Of course, even in the case that this hypothesis is correct, it is difficult to estimate the amount of work that will be required to create a human-level thinking machine according to the OpenCogPrime design. Levels of optimism among those involved with the project vary. My own (Ben Goertzel’s) personal intuition is that a human-toddler-level AGI could be created based on OpenCogPrime within as little as 3-5 years, and almost certainly within 7-10 years. The path from a human-toddler-level AI to an AI operating at the level of an adult human scientist is less clear, and could plausibly be even more rapid … or else much slower, depending on various factors (which are important and fun to consider, but would bloat this blog post too much…).
Clearly, there could be major unforeseen obstacles along the path to creating a powerful OpenCogPrime-based AGI; and it may turn out that OpenCogPrime is not a viable design for human-level AGI, for reasons that aren’t now anticipated by the system architects. But even if this is the case, we are confident that the process of refining, implementing, testing and teaching OpenCogPrime-based AGI systems will have a great deal to teach us about AGI and computing and cognitive science.
Onward, toward progressively advancing, maturing beneficial artificial general intelligence ;-)
ben
Posted by: ferrouswheel on: 2008-07-09
An article in Wired from a while back on Piotr Wozniak (no relation to Steve), a researcher of optimal memory and learning strategies, got me thinking about learning theory and memorization in the context of OpenCog. From the article (emphasis mine):
Long-term memory, the Bjorks said, can be characterized by two components, which they named retrieval strength and storage strength. Retrieval strength measures how likely you are to recall something right now, how close it is to the surface of your mind. Storage strength measures how deeply the memory is rooted. Some memories may have high storage strength but low retrieval strength. Take an old address or phone number. Try to think of it; you may feel that it’s gone. But a single reminder could be enough to restore it for months or years. Conversely, some memories have high retrieval strength but low storage strength. Perhaps you’ve recently been told the names of the children of a new acquaintance. At this moment they may be easily accessible, but they are likely to be utterly forgotten in a few days, and a single repetition a month from now won’t do much to strengthen them at all.
So, in memory studies, they talk of storage strength and retrieval strength. In my observation, retrieval strength is analogous to the distance of an atom’s short term importance to the attentional focus. Atoms just below the attentional focus threshold will be much easier to retrieve, both because there is more chance that they are stored locally and in memory and because they are more likely to be used by mind agents than atoms with lower short term importance. Storage strength, on the other hand, is related to the long term importance of an atom. Atoms with very low long term importance are unlikely to persist in the atom space, or more accurately they’ll be preferentially forgotten over atoms with higher long term importance.
One of the problems [with learning] is that the amount of storage strength you gain from practice is inversely correlated with the current retrieval strength. In other words, the harder you have to work to get the right answer, the more the answer is sealed in memory.
Perhaps there’ll be a need to incorporate this: something that is persistently of use, but only becomes useful at significantly spaced periods of time. Requiring an OpenCog instance to reason about or research the fact again each time would be inefficient if there is some way of recognizing this long term trend. In a way, this is where something like a System Activity Mining agent might come into play, to data mine such trends, however… I’m personally unsure about whether such an agent will scale to working on an entire atom space, particularly if it’s trying to detect these long term and infrequent trends.
One way to implement the storage of these infrequently but consistently used atoms is to assess the velocity at which short term importance changes when bringing an atom into the attentional focus. This velocity would be greater for atoms which are harder to recall since they have further to travel to reach the attentional focus. A higher velocity could then confer more long term importance when an atom enters the attentional focus rather than getting long term importance purely from the stimulus reward system.
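As a very rough sketch of what that might look like (in Python, purely for illustration), the function below hands out a long term importance bonus proportional to how fast an atom’s short term importance rose as it crossed into the attentional focus. Everything here is hypothetical: the function name, the `gain` parameter and the linear relationship are assumptions, not anything that exists in OpenCog.

```python
def lti_bonus(sti_before, sti_after, dt, focus_threshold, gain=0.1):
    """Hypothetical long term importance bonus for an atom entering the focus.

    sti_before, sti_after: short term importance before and after the update.
    dt:                    time (or cycles) the update took; must be positive.
    focus_threshold:       STI boundary of the attentional focus.
    gain:                  made-up scaling factor from STI velocity to LTI.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    entered_focus = sti_before < focus_threshold <= sti_after
    if not entered_focus:
        return 0.0
    # Atoms that started further below the threshold travel faster to get in.
    velocity = (sti_after - sti_before) / dt
    return gain * velocity
```

An atom dragged up from far below the threshold gets a larger bonus than one that was already hovering just under it, which mirrors the "harder to recall, better remembered" effect described in the article.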
Posted by: ferrouswheel on: 2008-07-06
Well, I’d noticed it’s been a while since we’ve had a post here and thought I’d rectify it with a brief note of what I’ve been up to. Posts of more substance are on their way I promise!
Recently we’ve been trying to play with some new mechanisms for the spread of Short Term Importance in OpenCog. The initial implementation worked alright, but not sufficiently well to make the Hopfield network emulation perform admirably. Ideally we’d like to have an emulator comparable to traditional Hopfield networks that has the added benefit of effective continuous learning. Now, perhaps this focus on a toy problem is unnecessary, but if we can come up with a continuous learning Hopfield network then there’ll be a scientific paper in there somewhere, which will add to OpenCog’s credibility in the future.
Initially I implemented an importance spreading mechanism that ensured importance couldn’t spread uphill to atoms with higher importance. This performed somewhat better than the existing system (based on very cursory evaluation), but was still somewhat hacky.
Ben suggested an example analogous to diffusion, which uses a Markov matrix to represent the transition probabilities of STI moving along a Hebbian link from one atom to another. Thus I went and implemented that, which works super effectively when imprinting a single pattern - but doesn’t work so well with more than one. I’m now going to modify how the transition matrix is constructed based on what atoms are in the attentional focus, which should prevent importance diffusion being the free-for-all it is at the moment.
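To make the diffusion idea a bit more concrete, here is a small Python sketch. It builds a row-stochastic (Markov) matrix from Hebbian link weights and applies one diffusion step to a vector of STI values. The `spread` fraction, the function names and the way rows are normalised are my assumptions for illustration; the actual OpenCog agent may construct its matrix quite differently.

```python
def transition_matrix(weights, spread=0.5):
    """Build a row-stochastic matrix from Hebbian link weights.

    weights[i][j] is the (non-negative) Hebbian weight from atom i to atom j.
    Each atom keeps (1 - spread) of its STI and shares the rest among its
    neighbours in proportion to the link weights.
    """
    n = len(weights)
    matrix = []
    for i in range(n):
        out = [w if j != i else 0.0 for j, w in enumerate(weights[i])]
        total = sum(out)
        if total == 0:
            # No outgoing links: the atom keeps all of its STI.
            row = [1.0 if j == i else 0.0 for j in range(n)]
        else:
            row = [spread * w / total for w in out]
            row[i] = 1.0 - spread
        matrix.append(row)
    return matrix


def diffuse(sti, matrix):
    """One diffusion step: new_sti[j] = sum over i of sti[i] * matrix[i][j]."""
    n = len(sti)
    return [sum(sti[i] * matrix[i][j] for i in range(n)) for j in range(n)]
```

Because every row sums to one, the total amount of STI is conserved by each step; it just gets redistributed along the Hebbian links.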
And in between doing this Attention Allocation stuff, I’m trying to port an implementation of Probabilistic Logic Networks from Novamente (OpenCog’s benevolent ancestor) to OpenCog. Theoretically that shouldn’t be too hard… but it’s quite complex code in places. Smart pointers to vectors of trees of predicates, oh my!
Posted by: ferrouswheel on: 2008-06-08
As a toy problem for playing/testing/understanding attention allocation in OpenCog, I’ve been emulating the behaviour of a Hopfield network within the OpenCog AtomSpace.
For those not already aware, a Hopfield network is a kind of recurrent neural network that acts as an associative memory. It consists of a number of units linked together. These units store and display the patterns in memory. The patterns are stored in a Hopfield network by adjusting the weights of links between units, based on the association between active units in a given pattern.
A result of this is that if you give the Hopfield network a partial pattern that is slightly incorrect, it should still retrieve the complete pattern if it’s in memory. The memorised patterns are in fact minimal energy states for the network (see the wikipedia entry for details).
Using this traditional method, approximately 0.14N patterns can be stored (where N = number of units/nodes) in a fully connected network. However, trying to teach a network already loaded with patterns a new one results in aberrant patterns. There are ways of allowing continuous learning, the most well known probably being the Palimpsest scheme, where the weights between units are capped. However, this reduces the number of storable patterns down to 0.05N. We theorise that using the mechanism of attention allocation, we’ll be able to achieve comparable if not improved results.
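For reference, a traditional Hopfield network (not the OpenCog emulation) can be sketched in a few lines of Python. This is the textbook version with Hebbian weight storage and asynchronous sign updates; the function names are just illustrative.

```python
def train(patterns):
    """Store patterns (lists of +1/-1) using the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state
```

Training on a single pattern and then calling `recall` with one unit flipped should settle back on the stored pattern, which is the associative-memory behaviour described above.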
The process of emulating a Hopfield network is described on the OpenCog wiki.
In my next post, I’ll explain an example of Hopfield network emulation using attention allocation.