Wednesday, July 06, 2011

Traits, Duck Typing and Dynamic Semantic Learning

Duck typing is the ability to treat something as implementing an interface because it has the right properties, not because it declares that interface up front. In this article I'll focus on sets of triples (sets of key-value pairs) such as Maps in Java or Dictionaries in other languages - <Map-instance, key-obj, value-obj>. I will outline how the concept of traits could be used in Drools to infer semantic abstractions over sets of triples, which allows for dynamic semantic learning over time. The terms map and triple set will be used interchangeably.

Duck Typing: http://en.wikipedia.org/wiki/Duck_typing
"When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck."

Triples: http://en.wikipedia.org/wiki/Resource_Description_Framework
"the form of subject-predicate-object expressions. These expressions are known as triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue"."

Duck typing over triples is the ability to say that the instance that represents the set of triples can be treated like an instance of an interface. This allows static, type-safe access to dynamic triple structures. It also allows abstraction through a semantic representation of what that thing is; i.e. it's not just a set of arbitrary triples, it is a Student. MVEL does not yet support the "dons" (as in "wears") keyword, so please take this as illustrative. The keyword may eventually change; it was proposed by Davide Sottara, who is building a proof of concept for this idea.

I'll use MVEL-like syntax to demonstrate:
bean = [ name : "Zod", age : 150 ]

bean dons Human
assertEquals( "Zod", bean.name )
assertEquals( 150, bean.age)
Without the "dons" keyword to duck type the map to the interface, this would not compile: the compiler would report a static typing error for the unknown fields "name" and "age".
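Since "dons" does not exist yet, here is a rough approximation of the same idea in plain Java, using a `java.lang.reflect.Proxy` to let a map answer calls on an interface. This is only a sketch: the `don` helper, the `DonsSketch` class and the accessor-style `Human` interface are hypothetical names made up for illustration, not part of Drools or MVEL.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class DonsSketch {

    // Hypothetical trait interface; accessor names mirror the map keys.
    public interface Human {
        String name();
        int age();
    }

    // "Dons" the map with the given interface: each accessor call is
    // answered by looking up the method name as a key in the map.
    @SuppressWarnings("unchecked")
    public static <T> T don(Map<String, Object> map, Class<T> trait) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString/equals/hashCode are delegated to the map itself.
                return method.invoke(map, args);
            }
            if (!map.containsKey(method.getName())) {
                throw new IllegalStateException("missing key: " + method.getName());
            }
            return map.get(method.getName());
        };
        return (T) Proxy.newProxyInstance(
                trait.getClassLoader(), new Class<?>[] { trait }, handler);
    }

    public static void main(String[] args) {
        Map<String, Object> bean = new HashMap<>();
        bean.put("name", "Zod");
        bean.put("age", 150);

        // The map is now accessed through a statically typed interface.
        Human human = don(bean, Human.class);
        System.out.println(human.name() + " is " + human.age());
    }
}
```

The proxy stays a live view over the map, so later changes to the map are visible through the interface. A missing key only fails at call time here, whereas the proposed keyword is meant to be checked by the compiler.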

Now we know that duck typing can give statically type-safe access to a map. What else can we do? In a rule-based system, if we use triples to represent facts (which is what semantic ontologies do), we can't declare up front which interfaces a map wears, and those interfaces might change over time too. So we can use special rules to dynamically apply traits to a triple set.
rule Human when
$tp : Map( this contains [ "name" : String, "age" : int ] )
then
$tp dons Human // that $tp instance is now recognised by all other rules that match on Human
end

rule HelloWorld when
$s : Human() // this will actually match against the Map "donned" to Human
then
println( "hello " + $s.name );
end
The rule that applies the trait could have a first-class representation for its use case, which makes the rule's intent far more obvious and so improves the readability and maintainability of the system.
trait Human( String name, int age ) when

end
In the above, "trait" is a new keyword and Human is the trait name. We pass all the fields and their type as arguments. The triple set must contain at least those keys, but of course it may contain more. Notice we have an empty "when" block. The reason for this is we can apply different logic as to when a trait is applied, beyond just matching known keys to fields.
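The "at least those keys" matching described above can be sketched in plain Java. This is a hypothetical illustration only: `TraitMatchSketch`, `humanSignature` and `matches` are invented names, and the real trait matching would be done by the rule engine rather than by hand-written code like this.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class TraitMatchSketch {

    // A trait signature: field name -> required (boxed) value type.
    public static Map<String, Class<?>> humanSignature() {
        Map<String, Class<?>> signature = new LinkedHashMap<>();
        signature.put("name", String.class);
        signature.put("age", Integer.class);
        return signature;
    }

    // True when the triple set holds at least the required keys, each with
    // a value of the required type. Extra keys are simply ignored.
    public static boolean matches(Map<String, Object> tripleSet,
                                  Map<String, Class<?>> signature) {
        for (Map.Entry<String, Class<?>> field : signature.entrySet()) {
            Object value = tripleSet.get(field.getKey());
            // isInstance(null) is false, so a missing key fails the match.
            if (!field.getValue().isInstance(value)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> zod = new HashMap<>();
        zod.put("name", "Zod");
        zod.put("age", 150);
        zod.put("planet", "Krypton"); // an extra key does not prevent the match

        System.out.println(matches(zod, humanSignature()));
    }
}
```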

For instance, if someone is Human and is also younger than 18, we can apply a further abstraction and say they are not just Human but also a Student. We use the "dons" keyword after the arguments to list the traits the Map must already wear, i.e. abstractions we already know about the thing.
trait Student( String name, int age ) dons Human when
age( < 18; )
end
The proposed syntax would allow argument names to be used as the pattern head and the type is inferred. We could also allow operators to be used in the positional syntax. This is to give compact sugar for "Integer( this < 18 ) from age".

So now we have a system to detect and recognise sets of triples and declare what traits they have, i.e. what abstractions we infer for them. As the system learns new things, keys may be added to the map and new abstractions can be inferred by declaring more traits, which in turn allows further reasoning. Keys may also be removed, which results in traits being removed.
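To make the add-a-key, gain-a-trait and remove-a-key, lose-a-trait behaviour concrete, here is a naive Java sketch that recomputes the worn traits from scratch whenever it is called. It is an assumption-laden toy: the `Trait` class, `infer` and `exampleTraits` are hypothetical, and a real engine would update traits incrementally rather than looping to a fixpoint.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class TraitInferenceSketch {

    // A trait: its name, the traits that must already be worn ("dons"),
    // and the condition of its "when" block.
    public static final class Trait {
        final String name;
        final Set<String> requires;
        final Predicate<Map<String, Object>> when;

        public Trait(String name, Set<String> requires,
                     Predicate<Map<String, Object>> when) {
            this.name = name;
            this.requires = requires;
            this.when = when;
        }
    }

    public static List<Trait> exampleTraits() {
        Trait human = new Trait("Human", Collections.<String>emptySet(),
                m -> m.get("name") instanceof String && m.get("age") instanceof Integer);
        Trait student = new Trait("Student", Collections.singleton("Human"),
                m -> ((Integer) m.get("age")) < 18);
        // Deliberately not listed in dependency order.
        return Arrays.asList(student, human);
    }

    // Keeps applying traits until no new trait can be added. A trait's
    // condition is only evaluated once all traits it requires are worn.
    public static Set<String> infer(Map<String, Object> tripleSet, List<Trait> traits) {
        Set<String> worn = new LinkedHashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Trait t : traits) {
                if (!worn.contains(t.name)
                        && worn.containsAll(t.requires)
                        && t.when.test(tripleSet)) {
                    worn.add(t.name);
                    changed = true;
                }
            }
        }
        return worn;
    }

    public static void main(String[] args) {
        Map<String, Object> bean = new HashMap<>();
        bean.put("name", "Zod");
        bean.put("age", 16);
        System.out.println(infer(bean, exampleTraits())); // Human and Student

        bean.put("age", 150);
        System.out.println(infer(bean, exampleTraits())); // Student is lost

        bean.remove("name");
        System.out.println(infer(bean, exampleTraits())); // no traits left
    }
}
```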

One of the problems of a purely triple-based approach is performance, both in terms of execution speed and, more importantly, memory usage. If "name" and "age" both have to be represented as objects, the system is going to bloat fast. What we want is a mixed type system of static and dynamic relations. A relation is what we call each key/value pair in the triple set; e.g. a property (a bean getter and setter pair, normally on a member field) is a relation on a class.

When a normal bean is inserted we will know it dons all the interfaces it implements, and thus all the properties those interfaces declare. Those properties will be accessed via the standard getters and setters. This means properties that we know up front and that don't change can be declared using standard Java fields with getters and setters, allowing quick access and low memory utilisation. However, we will allow further relations (triples) to be associated with the instance as "dynamic" properties.

The specialised 'trait' rules will uniformly detect existing static properties or dynamically added properties. It's important to remember that a trait is an interface applied at runtime to a given instance, and just that instance. Bean instances of the same concrete type can wear different traits at any given time, except of course for the statically declared interfaces on the concrete implementation.
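One way to picture the mix of static and dynamic relations is a bean that keeps its known properties in ordinary fields and everything else in a side map, with a single lookup that hides the difference. The Java below is a hand-written sketch under that assumption; the `get`, `has` and `add` methods and the `isFussyEater` check are hypothetical and are not the TripleSet API.

```java
import java.util.HashMap;
import java.util.Map;

public class MixedPropertiesSketch {

    public static class Human {
        // Static relations: plain fields, cheap to store and to read.
        private final String name;
        private final int age;
        // Dynamic relations: only allocated entries for what is actually added.
        private final Map<String, Object> dynamic = new HashMap<>();

        public Human(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }

        public void add(String key, Object value) {
            if (key.equals("name") || key.equals("age")) {
                throw new IllegalArgumentException(key + " is a static property");
            }
            dynamic.put(key, value);
        }

        // Uniform lookup: static properties go through the getters,
        // anything else falls back to the dynamic map.
        public Object get(String key) {
            switch (key) {
                case "name": return getName();
                case "age":  return getAge();
                default:     return dynamic.get(key);
            }
        }

        public boolean has(String key) {
            return key.equals("name") || key.equals("age") || dynamic.containsKey(key);
        }
    }

    // A trait-style check that does not care where the property lives.
    public static boolean isFussyEater(Human h) {
        return h.has("dislikes") && "carrots".equals(h.get("dislikes"));
    }

    public static void main(String[] args) {
        Human zod = new Human("Zod", 16);
        System.out.println(isFussyEater(zod));
        zod.add("dislikes", "carrots");
        System.out.println(isFussyEater(zod));
    }
}
```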

Let's work through a complete example now. Human is a type declaration which is generated as an actual class from which beans can be instantiated; "name", "age" and "gender" are static relations. Young, Boy and FussyEater are all interfaces. Human extends TripleSet so that we know further dynamic relations can be added and traits applied. We detect that the bean instance's age is "< 18" and thus the trait Young is applied, and if the gender is M the trait Boy is applied. Further, if a property called "dislikes" exists, either static or dynamic (the two are seamless in the syntax), with a value of "carrots", we apply the FussyEater trait.
declare Human extends TripleSet

String name;
int age;
Gender gender; // M/F enum
end

trait Young(int age) dons Human when
age( < 18; )
end

trait Boy(Gender gender) dons Young when
gender( Gender.M; )
end

trait FussyEater(String dislikes) dons Boy when
dislikes( "carrots"; )
end
Now that we have a system that can detect and declare fussy eaters, let's use it. First declare a person who is 16; that will be an actual bean instance. Then add the dynamic property "dislikes". Finally insert a new command to give that person some ice cream.
// Let's declare a new triple for a given bean instance that we instantiated from Human
Human human = new Human( "Zod", 16, Gender.M ); // gender must be M for the Boy trait to apply
human.add( [ dislikes : "carrots" ] );
insert( human );
insert( new GiveIceCream( human ) );
We can now have a single rule that disallows fussy eaters from getting ice cream. How cool is that :)
rule "Don't give icecream to boys who are fussy eaters" when
$f : FussyEater()
$d : GiveIceCream( $f; )
then
retract( $d )
end
Because traits can be applied conditionally on facts, and facts can be logically inserted and maintained by the truth maintenance system (TMS), we can have traits whose existence is dependent on those logical insertions. When the series of premises that creates the chain of logical insertions breaks, the trait depending on it will be un-applied from the instance. See this previous blog for more details on TMS: "Drools Inference and Truth Maintenance for good rule design and maintenance".

10 comments:

  1. Will the boy lose his trait of being a FussyEater as soon as he no longer dislikes eating carrots (so does full TMS apply)?

  2. "Will the boy lose his trait of being a FussyEater as soon as he no longer dislikes eating carrots (so does full TMS apply)? "

    yes, blog updated to point that out.

  3. Given your example, you would not be able to instantiate "Human" using "new" as it (presumably) exists only as a declarative type.

    I assume if the model was a POJO you could either extend "TripleSet" from Java or as an extension in DRL? i.e. Add "declare Human extends TripleSet" similarly to the way you add "@role(event)" to existing POJO models for CEP.

    Wouldn't it make the whole thing more seamless if, as a user, I could be blind to the need to extend TripleSet in the first place?

  4. "Given your example, you would not be able to instantiate "Human" using "new" as it (presumably) exists only as a declarative type."
    You assume wrong :) a rule's consequence can instantiate Human as a bean no problem. I didn't think it was necessary to show an "init" rule.

    "I assume if the model was a POJO you could either extend "TripleSet" from Java or as an extension in DRL?"
    Yes this could work with existing pojos that extend TripleSet and a type declaration could add that behaviour to existing pojos that don't implement TripleSets. I suspect we will make TipleSet invisible in the end and it will just work automagically if you associate a trait to a pojo. That would depend if the user wants easy access to the triples from java too, in which case extending TripleSet would be easier. So we have a few ways to dice this.

  5. Thanks for the clarification, I hadn't appreciated the "Human" instantiation was in a consequence. I had also thought about the automagification so good to read. BTW, is a "TipleSet" (typo in your reply) equivalent to buying a round of drinks ;)?

  6. what is the advantage of using traits instead of let's say add a boolean fussyeater in a class Boy and check it in the rule?

  7. "what is the advantage of using traits instead of let's say add a boolean fussyeater in a class Boy and check it in the rule?"

    A trait provides a single keyword that can represent multiple field assertions. Further to that it is composable, and thus one trait can depend on another trait.

    If you had to repeat all the fields constraints that determine if a given instance is something it becomes much harder to read and maintain. Plus you give a collection of field constraints semantic meaning via encapsulation, which makes it more readable too.

  8. Interesting idea, but type or trait inference relies on us knowing that the only 'type' or 'trait' in the system that has both an age and a name is a Human. It falls over once another type has similar properties.

    For example, a bridge, a rabbit or a school. All of these have names and ages. None of them can be students. One of them loves vegetables, will probably be less than 18 years old and will get very sick if you feed it ice-cream.

  9. Well, this example is a bit oversimplified, it's just to give an idea of what we'd like to do. Indeed, if you define "Human" as equivalent to "having some name and some age" you can get odd inference results :)
    A more sound definition of "Human" could be "has 2 legs, 2 arms, 1 head, etc...". Then, you could define "Student" as "Human and goes-to-a-School and ...". We're actually working on how to import those definitions from a proper description logic.

  10. At the moment our TMS only does premises, but we will be adding contradictions soon, along the lines of the work described here:
    http://www.cis.temple.edu/~ingargio/cis587/readings/tms.html

    We are also looking into defeasible logic:
    http://defeasible.org/

    Systems like the above are designed to help address those sort of issues.
