[alicebot-general] J-Alice 0.5 - security risk by learning system calls

mehri foreverlinux at yahoo.com
Wed Aug 2 22:19:34 PDT 2006


Well, there are lots of security issues when putting an AIML bot of any type out there.  For the problem you describe below, I would solve it directly in the code rather than try a workaround of any kind.
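
As a rough illustration of what fixing this in the code could mean, here is a minimal Python sketch that neutralizes markup in chatter-supplied text before it is stored as a learned category. It is not J-ALICE or Rebecca code: the function names sanitize_learned_text, contains_markup and build_learned_category are hypothetical, and it assumes the engine stores escaped entities as literal text rather than re-parsing them into tags, which has not been verified for J-ALICE.

```python
from xml.sax.saxutils import escape


def contains_markup(user_text: str) -> bool:
    """Hypothetical check: True if the text contains any angle bracket."""
    return "<" in user_text or ">" in user_text


def sanitize_learned_text(user_text: str) -> str:
    """Hypothetical helper: neutralize XML markup in chatter-supplied text.

    escape() replaces &, < and > with &amp;, &lt; and &gt;, so a string
    such as <system>explorer</system> is kept as plain text, not a tag.
    """
    return escape(user_text)


def build_learned_category(pattern: str, answer: str) -> str:
    """Assemble the category that a <learn> step would store.

    Assumption: the engine treats the escaped entities as literal text
    when it later evaluates the stored template.
    """
    safe_pattern = sanitize_learned_text(pattern.upper())
    safe_answer = sanitize_learned_text(answer)
    return (
        "<category>"
        f"<pattern>{safe_pattern}</pattern>"
        f"<template>{safe_answer}</template>"
        "</category>"
    )
```

A stricter variant would refuse to learn at all when contains_markup() returns True, which avoids relying on how the engine handles escaped entities.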


As for J-ALICE in general, it'll be a while before I write the compatibility layer for Rebecca and swap out the J-ALICE engine with Rebecca.  However, once I do that you won't have punctuation mark problems.

However, in the meantime, you could always write the layer yourself or fix the original J-ALICE baseline to fit your needs.  Both are open source to allow this, and Rebecca is heavily documented.

The truth, though, is that it's still going to be quite a while before I even begin swapping out the J-ALICE engine for Rebecca's and writing the IRC XML layer.  I still have quite a to-do list to get through first, and I only have my spare time to do it.

But it's not as if I'm doing nothing.  I'm doing quite a bit of work on Rebecca.  I'm just trying to get her to a specific point before swapping out the J-ALICE engine for her.

 ----- Original Message ----
From: kolya <kolya at schwarzsilber.de>
To: alicebot-general at list.alicebot.org
Sent: Wednesday, August 2, 2006 2:02:36 PM
Subject: [alicebot-general] J-Alice 0.5 - security risk by learning system calls

I run J-Alice by Jonathan Roewen on my Windows XP desktop.
http://sourceforge.net/projects/j-alice

She connects to an IRC channel where I usually go. And when people say 
her name or /MSG her, she talks with them.
Now I wanted J-Alice to learn through input of IRC chatters.
I tested the "badanswer.aiml" that's floating around the net.
But soon I realised it's designed for Pandorabots only.
So I wrote my own simple learning.aiml that looks like this:


<category>
<pattern>WRONG</pattern>
<template>
<think>
<set name="question">
<uppercase><input index="2"/></uppercase>
</set>
</think>
What should I say?
</template>
</category>

<category>
<pattern>*</pattern>
<that>WHAT SHOULD I SAY</that>
<template>
<think>
<set name="answer"><input/></set>
</think>
<learn>
<pattern><get name="question"/></pattern>
<template><get name="answer"/></template>
</learn>
Learned.
</template></category>

We had a lot of fun with this on IRC until I realised (fortunately being 
the first to do so) that something like this could easily happen:

HACKER: Annoy Kolya
BOT: Blablah.
HACKER: wrong
BOT: What should I say?
HACKER: <system>explorer</system>
BOT: Learned.
HACKER: Annoy Kolya!

At that moment an Explorer window pops up on my desktop. Really.
With a similar command my HD might get deleted.
J-Alice has basically become a trojan horse. Well, of course it was my 
own fault for writing this learning.aiml, I know.

I still want her to be able to learn but of course not how to hack my 
computer... :/
So I tried "catching" answers with the word "<SYSTEM>" or 
"&lt;SYSTEM&gt;" in AIML but that didn't work because the word doesn't 
stand alone.
And the possibilities of commands that someone can put between <system> 
and </system> are endless.
The same problem occurred when I tried replacing it in substitutions.xml.

Any ideas how to make learning more secure would be very appreciated!

Another minor problem I ran into is that the initial user input in 
learning may not contain a punctuation mark for some reason. I don't 
know why this is so, but it would of course help when people could just 
talk freely and then correct her instead of making "preformatted" 
utterances for the bot.

Kolya

_______________________________________________
This is the alicebot-general mailing list
Reply to alicebot-general at list.alicebot.org
Unsubscribe and change preferences at http://list.alicebot.org/mailman/listinfo/alicebot-general
Learn netiquette at http://www.dtcc.edu/cs/rfc1855.html
Learn to read at http://www.literacy.org/





