[alicebot-general] J-Alice 0.5 - security risk by learning system calls

kolya kolya at schwarzsilber.de
Wed Aug 2 13:02:36 PDT 2006


I run J-Alice by Jonathan Roewen on my Windows XP desktop.
http://sourceforge.net/projects/j-alice

She connects to an IRC channel where I usually go. And when people say 
her name or /MSG her, she talks with them.
Now I wanted J-Alice to learn from what the IRC chatters tell her.
I tested the "badanswer.aiml" that's floating around the net,
but soon realised it's designed for Pandorabots only.
So I wrote my own simple learning.aiml that looks like this:

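<!-- LEARN! -->
<!-- Step 1: the user says WRONG; store their previous input as the question. -->
<category>
<pattern>WRONG</pattern>
<template>
<think>
<set name="question">
<uppercase><input index="2"/></uppercase>
</set>
</think>
What should I say?
</template>
</category>

<!-- Step 2: whatever the user says next becomes the answer to that question. -->
<!-- The answer is inserted unfiltered, so any AIML tags in it are learned too. -->
<category>
<pattern>*</pattern>
<that>WHAT SHOULD I SAY</that>
<template>
<think>
<set name="answer"><input/></set>
</think>
<learn>
<pattern><get name="question"/></pattern>
<template><get name="answer"/></template>
</learn>
Learned.
</template>
</category>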

We had a lot of fun with this on IRC until I realised (fortunately being 
the first to do so) that something like this could easily happen:

HACKER: Annoy Kolya
BOT: Blablah.
HACKER: wrong
BOT: What should I say?
HACKER: <system>explorer</system>
BOT: Learned.
HACKER: Annoy Kolya!

At that moment an Explorer window pops up on my desktop. Really.
With a similar command my hard disk could be wiped.
J-Alice has basically become a trojan horse. Of course it was my 
own fault for writing this learning.aiml, I know.

I still want her to be able to learn but of course not how to hack my 
computer... :/
So I tried "catching" answers containing "<SYSTEM>" or 
"&lt;SYSTEM&gt;" in AIML, but that didn't work because the tag is not a 
separate word: it is attached to whatever command follows it.
And the commands that someone can put between <system> 
and </system> are endless.
The same problem occurred when I tried replacing it in substitutions.xml.
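
For illustration, one way around the word-boundary problem might be to filter the answer outside AIML, in whatever code hands the chatter's text to the bot, before it ever reaches <learn>. The following is only a rough sketch in Python, not J-Alice code: the function name sanitize_answer is made up, and it assumes such a hook exists or can be added. It strips anything that looks like a tag (raw or entity-encoded) and then escapes what is left, so no tag name has to be listed in advance.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical helper, not part of J-Alice: run it on the chatter's answer
# before the text is stored as a learned template.
# Matches raw tags like <system> and entity-encoded ones like &lt;system&gt;.
TAG_RE = re.compile(r"<[^>]*>|&lt;.*?&gt;", re.IGNORECASE | re.DOTALL)

def sanitize_answer(text: str) -> str:
    """Return text that can be stored as plain template content."""
    # Drop anything that looks like a tag, whatever its name.
    stripped = TAG_RE.sub(" ", text)
    # Escape leftover &, < and > so they cannot open a new element.
    escaped = escape(stripped)
    # Collapse the whitespace left behind by removed tags.
    return " ".join(escaped.split())

if __name__ == "__main__":
    print(sanitize_answer("<system>explorer</system>"))
```

With this filter the learned answer in the IRC example would be the harmless word "explorer" rather than a <system> call. The filter is deliberately conservative: it also removes harmless tags such as <random>, and text between a stray "<" and a later ">" is lost.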

Any ideas on how to make learning more secure would be much appreciated!

Another minor problem I ran into is that the initial user input (the one 
that later gets corrected) must not contain a punctuation mark, or 
learning fails. I don't know why this is so, but it would of course help 
if people could just talk freely and then correct her, instead of making 
"preformatted" utterances for the bot.
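
A possible explanation, assuming J-Alice follows the usual AIML normalization rules (which this post does not confirm): user input is stripped of punctuation and uppercased before it is matched, so a learned pattern that still contains "!" or "?" can never match anything. If that is the cause, normalizing the stored question in the same way before it becomes a pattern might help. A minimal sketch in Python, with normalize_pattern as a made-up helper name:

```python
import re

def normalize_pattern(text: str) -> str:
    """Approximate AIML-style normalization: no punctuation, upper case."""
    # Hypothetical helper, not part of J-Alice.
    # Replace every character that is neither a word character nor
    # whitespace with a space.
    cleaned = re.sub(r"[^\w\s]", " ", text)
    # Uppercase and collapse runs of whitespace into single spaces.
    return " ".join(cleaned.upper().split())

if __name__ == "__main__":
    print(normalize_pattern("Annoy Kolya!"))
```

Under that assumption, "Annoy Kolya!" and "Annoy Kolya" would both be stored as the pattern ANNOY KOLYA.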

Kolya


