[alicebot-archcomm] Conditional wildcards

Gary Dubuque alicebot-archcomm@list.alicebot.org
Sun, 2 Mar 2003 09:27:39 -0800


Well then the definition of wildcard is...

"Matches one to many words except if immediately followed by another
wildcard in which case it only matches one word."

Is this the simple explanation we propose for the AIML standard?  I ask for
nothing more than that clear understanding to be recorded.

Thanks,
  Gary Dubuque
  Seeker of AIML statures

-----Original Message-----
From: alicebot-archcomm-admin@list.alicebot.org
[mailto:alicebot-archcomm-admin@list.alicebot.org]On Behalf Of Dr. Rich
Wallace
Sent: Sunday, March 02, 2003 9:29 AM
To: alicebot-archcomm@list.alicebot.org
Subject: RE: [alicebot-archcomm] Conditional wildcards


I respectfully disagree.  The behavior of a sequence of multiple wildcards
is well defined by the matching algorithm.

Program M says,

--- Matchai: Find the matching category for a client input
--- by searching the Graphmaster graph.

-- The argument binding is a vector of three elements:
-- the template, an array of star bindings, and the matching path.
-- For example:
--    result := [template, stars, matchpath];
--    Matchai(ROOT, path, 1, result);
-- places the detected template, stars and matching path in the
-- 3-tuple called result.

--- The final implementation uses the filesystem to store
--- the graph.

proc Matchai(node, path, index, rw binding);
  [template, stars, matchpath] := binding;
  if index > #path then
     if fexists(node+'/template') then
        template := getfile(node+'/template');
        matchpath := node;
        template := Trimai(template);
        binding :=  [template, [], matchpath];
        return(true);
     else return(false);
     end if;
  end if;
  word := path(index);
  if fexists(node+'/under') and
      exists i in [index+1..#path+1] |
          Matchai(node+'/under', path, i, binding)
  then
     phrase := flatten(path, index, i-1);
     stars := [phrase]+binding(2);
     binding :=  [binding(1), stars, binding(3)];
     return true;
  elseif fexists(node+'/'+word) and
         Matchai(node+'/'+word, path, index+1, binding)
  then
     return true;
  elseif fexists(node+'/star') and
     exists i in [index+1..#path+1] |
         Matchai(node+'/star', path, i, binding)
  then
     phrase := flatten(path, index, i-1);
     stars := [phrase]+binding(2);
     binding :=  [binding(1), stars, binding(3)];
     return true;
  else
     return false;
  end if;
end proc;

and the key section to answer your question is the condition:

  elseif fexists(node+'/star') and
     exists i in [index+1..#path+1] |
         Matchai(node+'/star', path, i, binding)
  then

which you could read as

 else if there exists a branch with a "*" from this node and
   there exists a number i in [index+1, #path+1] such that
     a Match exists rooted at the child of this branch,
       from the ith subsequent word in the input path.

Hence, given the two patterns

HEY * and
HEY * * and the input HEY X Y

Initially index=1.  #path=3.

The word HEY finds an identical word match according to the second
matching rule,

  elseif fexists(node+'/'+word) and
         Matchai(node+'/'+word, path, index+1, binding)
  then

so the Matchai procedure is called recursively with the graph
rooted after HEY and the value index=index+1 = 2.

Given the remaining input X Y

Does there exist i in [index+1..#path+1] = [3..4] such that a match exists
in the subgraph rooted from the star branch?  It is important to note that
the algorithm looks for the i in increasing order, 3 and then 4, not the
other way around.  Thus the match involving just binding the X to
star by itself would be tried first, and the Matchai procedure should be
called recursively again, but this time on the child node of the graph,
i.e. the one rooted after the second star.
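
The order in which the star rule tries its candidate bindings can be sketched
in a few lines of Python.  This is only an illustration, not part of Program M:
the helper name star_candidates is hypothetical, and it keeps the 1-based
indices of the pseudocode so that the numbers line up with the range
[index+1..#path+1].

```python
def star_candidates(path, index):
    """Yield (i, phrase) pairs in the order the star rule would try them.

    path is the list of input words; index and i are 1-based, as in the
    pseudocode.  phrase is what the wildcard would bind if matching
    resumed at word i.
    """
    for i in range(index + 1, len(path) + 2):
        # flatten(path, index, i-1) in the pseudocode: words index..i-1
        yield i, " ".join(path[index - 1:i - 1])


for i, phrase in star_candidates(["HEY", "X", "Y"], 2):
    print(i, repr(phrase))
# prints:
# 3 'X'
# 4 'X Y'
```

The shortest binding comes first, so a longer binding is only considered if
the recursive match on the shorter one fails.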

The third and penultimate call to the matching algorithm has index=3 and
the subtree rooted in the second star of the pattern HEY * *.
Predictably the Y matches the second star.  There is one final call to the
Matchai() procedure for the case index=4 > #path, which finds the
associated template (omitting the steps for <that> and <topic> match).

The matching algorithm is an effective procedure for disambiguating
matches in AIML pattern language.  It is basically just three simple
cases, which have been written about in detail.  Though it may be
painstaking to work through the examples involving multiple wild cards by
hand, the programs do exist and they ought to all produce the same
matching result.
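
For readers who would rather run the example than trace it by hand, here is a
minimal Python sketch of the three cases.  It is a loose translation and makes
several assumptions that are not in Program M: the graph is held in memory
rather than in the filesystem, indices are 0-based, the input is assumed to be
already normalized into words that contain no wildcard characters, and the
names Node, add_pattern and match are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One graph node (a directory in the filesystem version)."""
    children: dict = field(default_factory=dict)
    template: Optional[str] = None


def add_pattern(root, pattern, template):
    """Insert a pattern such as 'HEY * *' into the graph."""
    node = root
    for token in pattern.split():
        node = node.children.setdefault(token, Node())
    node.template = template


def match(node, path, index=0):
    """Return (template, stars) for the first match found, or None."""
    if index >= len(path):
        # Input exhausted: succeed only if a pattern ends at this node.
        return (node.template, []) if node.template is not None else None

    def wildcard(key):
        # Try the shortest binding first, then progressively longer ones.
        child = node.children.get(key)
        if child is None:
            return None
        for i in range(index + 1, len(path) + 1):
            found = match(child, path, i)
            if found is not None:
                template, stars = found
                return template, [" ".join(path[index:i])] + stars
        return None

    # Case 1: "_" wildcard.  Case 2: exact word.  Case 3: "*" wildcard.
    found = wildcard("_")
    if found is None and path[index] in node.children:
        found = match(node.children[path[index]], path, index + 1)
    if found is None:
        found = wildcard("*")
    return found


root = Node()
add_pattern(root, "HEY *", "one star")
add_pattern(root, "HEY * *", "two stars")
print(match(root, "HEY X Y".split()))
# prints: ('two stars', ['X', 'Y'])
```

Under the same sketch the input HEY X matches HEY * with the star bound to X,
and HEY X Y Z matches HEY * * with the stars bound to X and Y Z: the first
star takes one word and the last star takes the remainder.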


> I propose that patterns like * * be declared undefined and unpredictable
> in the official AIML standards.  That explicitly they can behave
> differently in various engines.
>
> I do not see a difference in the results of matching "HEY *" and "HEY *
> *". I therefore propose that if the need is there, an attribute be added
> to the <star> for specifying which word the template needs to access.
> This follows the philosophy established by <srai>, that syntax is
> performed in templates and not in patterns.  To be certain, <star>
> processing is a template side function and not a pattern side artifact.
>
> Considered,
>   Gary Dubuque
>   Becoming clear on AIML
>
> -----Original Message-----
> From: alicebot-archcomm-admin@list.alicebot.org
> [mailto:alicebot-archcomm-admin@list.alicebot.org]On Behalf Of Dr. Rich
> Wallace
> Sent: Sunday, March 02, 2003 2:04 AM
> To: alicebot-archcomm@list.alicebot.org
> Subject: Re: [alicebot-archcomm] Conditional wildcards
>
>
> Well stated Jon.
>
> It is worth pointing out that, the AIML pattern language + matching
> algorithm has the very nice, simple property that there is a one-to-one
> correspondence between inputs and matching templates.  The more complex
> and ambiguous the pattern language becomes, the less clear this
> one-to-one map becomes.   Adding "sets" to the patterns raises problems
> of ambiguous matches, multiple categories matching the same input, and
> deciding the matching priority of these ambiguous unifications.
>
> When the AIML set becomes large, the one-to-many matching creates a
> scaling problem.  Every new category introduced potentially upsets the
> delicate balance of order of the existing categories.  Certain
> commercial bot systems in existence are known to have this scaling
> problem for precisely the same reason: their bot languages permit more
> general, and hence more ambiguous, matching of inputs to responses.
>
> If you keep the pattern matching language simple, as we have in AIML, the
> one-to-one matching property is not lost.  So the same scaling problems
> do not arise.  It is quite clear in AIML how every new pattern or
> category fits into the encyclopedia of knowledge already in the robot's
> brain. There is no cross-referencing or indexing necessary to determine
> when a particular category will be activated, and no new category will
> upset the order of matching among the pre-existing ones.
>
>
>
>>> This version of AIML would work if you wanted to record every number
>>> possible.  Already we have many categories for just entering
>>> someone's age.
>>> The idea is that these are classes of items to match.  Classes that
>>> don't fit into the realm of just natural language. Surely you're not
>>> saying that entering all those dates would simplify the processing of
>>> them.  I would imagine the minimalist would say no, no, no, we want it
>>> simple.  Just let us call a date a date.  Don't make us list all the
>>> possible dates there are.
>>
>> Who says you have to? You don't have to do that, just to make use of a
>> class. There are ways to find out what kind of class (if any) that a
>> word is after it has been matched to a wildcard when being processed
>> in the template.
>>
>>> So how are we going to know when someone is going to use a date as an
>>> answer?  One has to ask deliberately for a date or expect that only a
>>> date is appropriate from the context of the inputted phrase.  This
>>> leaves a lot of wiggle room.
>>
>> Only if you let there be wiggle room.
>>
>> Such things as conditional wildcards have been attempted before. There
>> was the typeof tag Jon Baer experimented with, and maybe not quite as
>> known was creating sets, and specifying a set (a restricted wildcard,
>> using @ syntax).
>>
>> There are problems with both.
>>
>> Typeof has the problems of how to expand the tag .. does it occur
>> while processing the template, or does it get expanded into multiple
>> categories at load time? Also, what happens with two or more instances
>> of the same typeof in both the template and pattern? Such as:
>>
>> <pattern><typeof_a/> * IS <typeof_a/> *</pattern>
>> <template>I didn't know <typeof_a/> <star/> is <typeof_a/> <star
>> index="2"/></template>
>>
>> And <typeof_a/> is defined to be "a" and "an". How do you know which
>> is which in the template? Do you instead treat matching to a typeof as
>> an additional wildcard, so it'd instead be:
>> <template>I don't know <star/> <star index="2"/> is <star index="3"/>
>> <star index="4"/></template>
>>
>> If you do, this would imply the category doesn't get expanded at load
>> time.
>>
>> And the @ syntax has the same problem as your conditional wildcard
>> idea. How do you decide among two plausible matches?
>>
>> What if you have @NUMBER and @PHONE? @PHONE would just be a more
>> restricted form of @NUMBER but the interpreter doesn't know that.
>>
>> I do believe that <typeof> could become a possible solution .. it
>> would just require a bit of work to establish exactly how it would
>> work. But it is still susceptible to the same problem as these others.
>>
>> So I guess the real question is: is the possibility for ambiguity such
>> as this acceptable?
>>
>> Jon =)
>>
>>
>> _______________________________________________
>> alicebot-archcomm mailing list
>> alicebot-archcomm@list.alicebot.org
>> http://list.alicebot.org/mailman/listinfo/alicebot-archcomm
>
>
> --
> Dr. Rich
> W A L L A C E
> ALICE A.I. Foundation
> drwallace@www.alicebot.org
>
>
>
>


--
Dr. Rich
W A L L A C E
ALICE A.I. Foundation
drwallace@www.alicebot.org



