Posts Tagged ‘Python’

04 Jan 11

Chatting Alice

Happy New Year, people 🙂

Toward the very end of last year, I created a small web app, Chatting Alice: a Jabber/XMPP client that lets its users chat with the publicly available chat-bots hosted on www.pandorabots.com.

Anybody with an XMPP-enabled chat client and a Jabber/GTalk account can simply add chattingalice@appspot.com and start chatting…

Features:

  • A menu-based interface (from within the chat) that provides options to easily change bots and see the current bot.
  • Various bots, chosen on the basis of the most-popular-bot statistics on www.pandorabots.com.
  • Each user’s interaction is independent of other users’, since chat contexts are properly maintained through a separate account for each user.
  • Chat history is not recorded, for obvious reasons 😛
  • Easy to start and stop using

Development:

Last December, I happened to be looking around for a suitable way to pass the time. I looked at some chat-bots too. There were many, but there wasn’t any easily accessible interface for them, so I made one. I also wanted to work on App Engine, and hence the app was born.

Chatting Alice is hosted on Google App Engine; of the Java and Python runtime choices, it runs on the Python one.

App Engine provides an XMPP service to its apps, with facilities to send invites, send and receive chat messages, and check the availability of a user (GTalk ID). This API was used to interface with the publicly available chatbots on Pandorabots: each chat message to Chatting Alice triggers a request to Pandorabots’ server, and the bot’s reply is then forwarded back to the user.
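To give an idea of the plumbing, here is a minimal sketch of such a relay on the old Python runtime. The Pandorabots endpoint, the bot id, and the handler names below are assumptions for illustration, not Chatting Alice’s actual code:

import urllib
from xml.etree import ElementTree

from google.appengine.api import urlfetch, xmpp
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

# assumption: the old public Pandorabots XML API endpoint
PANDORA_URL = 'http://www.pandorabots.com/pandora/talk-xml'
BOT_ID = 'your-botid-here'   # a bot's id on pandorabots.com (placeholder)

def ask_bot(botid, custid, text):
    # forward the user's message to Pandorabots and pull the reply out of the XML
    payload = urllib.urlencode({'botid': botid, 'custid': custid, 'input': text})
    result = urlfetch.fetch(PANDORA_URL, payload=payload, method=urlfetch.POST)
    return ElementTree.fromstring(result.content).findtext('that')

class ChatHandler(webapp.RequestHandler):
    def post(self):
        message = xmpp.Message(self.request.POST)   # the incoming chat message
        message.reply(ask_bot(BOT_ID, message.sender, message.body))

# App Engine delivers inbound XMPP messages to this reserved URL
application = webapp.WSGIApplication([('/_ah/xmpp/message/chat/', ChatHandler)])

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()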

The most complicated part was building a proper system of separate chat accounts for different users, so that context is not lost when two users are chatting with the same bot simultaneously. This was done with datastore (App Engine’s data storage system) entities that supply the necessary details to Pandorabots with each request, keeping the chat contexts separate.
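For illustration, such an entity could look something like this (hypothetical names, using App Engine’s old db API; not the app’s actual schema):

from google.appengine.ext import db

DEFAULT_BOT = 'your-default-botid'   # placeholder

class ChatContext(db.Model):
    jid = db.StringProperty(required=True)    # the user's Jabber/GTalk ID
    botid = db.StringProperty(required=True)  # the bot this user is talking to
    custid = db.StringProperty()              # conversation id assigned by Pandorabots

def context_for(jid):
    # fetched before each request to Pandorabots, created on first contact
    ctx = ChatContext.all().filter('jid =', jid).get()
    if ctx is None:
        ctx = ChatContext(jid=jid, botid=DEFAULT_BOT)
        ctx.put()
    return ctx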

The last features to be added were multiple bots and the menu interface. These were not that time-consuming, since Python makes it easy to handle textual commands and errors.
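The menu is essentially a dictionary dispatch on the first word of the message; a hypothetical sketch (the command names and bot map are made up, and ctx is the ChatContext entity from above):

BOTS = {'alice': 'botid-for-alice', 'chomsky': 'botid-for-chomsky'}  # illustrative

def list_bots(ctx, arg):
    return 'Available bots: ' + ', '.join(sorted(BOTS))

def show_current(ctx, arg):
    return 'You are chatting with: ' + ctx.botid

def switch_bot(ctx, arg):
    if arg not in BOTS:
        return 'No such bot; try "list".'
    ctx.botid = BOTS[arg]
    ctx.custid = None        # drop the old conversation context
    ctx.put()
    return 'Switched to ' + arg

COMMANDS = {'list': list_bots, 'current': show_current, 'switch': switch_bot}

def handle_menu(ctx, text):
    parts = text.strip().split(None, 1)
    cmd = parts[0].lower() if parts else ''
    arg = parts[1] if len(parts) > 1 else ''
    if cmd in COMMANDS:
        return COMMANDS[cmd](ctx, arg)
    return None   # not a command: forward the text to the current bot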

Current status:

Since this was nothing but an impulsive experiment, development has been stopped. If further uses for this app occur to me as the backend of another app, maybe I’ll get back to it. But the app is up and running, with 10 bots as of now.

If you find any bugs or missing features, I’ll be happy to fix them (if possible), since, as it is, I’m still learning.

Please let me know of any such issues at this Facebook page.

23 Nov 10

My First Lexer (in Python)…

We have a course, Theory of Computation (actually more about formal languages), which uses the Cinderella Book as one of its recommended readings. There was an assignment for this course that involved designing a program to separate the tokens of a C++ program. After a little searching, I found out that such a program is actually a lexer (lexical analyzer), that it uses regular expressions, and that there are various automated tools to generate lexers.

But I decided to design one on my own before looking into lexer implementation strategies in detail. So, I began by reading some material from “Engineering a Compiler”, and got something running. Here is the crude working code. It’s immensely unproductive as far as applicability goes, but I was behind schedule on submissions, so I hacked my way through. As soon as possible, I’ll work on a better implementation.

Source Code:

import re
from deftok import keywords,expressions,separators
import os
    
tokens=[]

def matchNow(line,pos,id):
    """
    matches the regular expression for the given 'id' if one exists in the
    rulebase; otherwise consumes a single character
    classifies the token, appends (token, type) to the tokens list, and
    returns the position just past the token
    """
    global tokens
    ret=0
    if id in expressions:
        pattern=re.compile(expressions[id])
        found=pattern.match(line,pos)
        tok=line[pos:found.end()]
        ret=found.end()
    else:
        tok=id
        ret=pos+1
    if tok in keywords:
        b='KEYWORD'
    elif id=='i':
        b='IDENTIFIER'
    elif tok in separators:
        b='SEPARATOR'
    elif id=='n':
        b='NUMERAL'
    else:
        b='OPERATOR'
    tup=(tok,b)
    tokens.append(tup)
    return ret

def checkLine(line,pos):
    """
    skips leading whitespace, looks at the character of 'line' at 'pos',
    and dispatches to matchNow() appropriately; recurses until the line
    is consumed
    """
    global tokens
    pattern=re.compile(r'\s+')
    found=pattern.match(line,pos)
    if found:
        pos=found.end()
    if pos<len(line):
        curr=line[pos]
    else:
        return
    endofprev=pos
    if curr.isalpha() or curr=='_':
        endofprev=matchNow(line,pos,'i')
    elif curr.isdigit():
        endofprev=matchNow(line,pos,'n')
    elif curr in ['@','$']:
        endofprev=pos+1
    else:
        endofprev=matchNow(line,pos,curr)
    checkLine(line,endofprev)

def main(name):
    """
    responsible for file handling, and concatenation of lines with backslash-newline
    """
    fp=open(name,'r')
    temp=""
    for line in fp:
        if line.endswith("\\\n"):
            # line continues on the next one: strip the backslash-newline and buffer it
            temp=temp+line[:-2]
            continue
        line=temp+line
        temp=""
        checkLine(line,0)
    fp.close()
    fp=open('outputof'+name+'.txt','w')
    for token in tokens:
        token1,token2=token
        fp.write(token1+'\t'+token2+'\n')
    fp.flush()
    os.fsync(fp.fileno())
    fp.close()
    
if __name__=="__main__":
    import sys
    main(sys.argv[1])
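To try it out, assuming the script is saved as lexer.py next to the definitions file deftok.py and test.cpp is any C++ source file: running python lexer.py test.cpp writes the token/type pairs, tab-separated, to outputoftest.cpp.txt.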

Here is the definitions file. It has a list of keywords, a list of separators, and a dictionary mapping ids to regular expressions…

#list of keywords
keywords=['auto', 'const', 'double', 'float', 'int', 'short', 'struct', 'unsigned', 'break', 'continue', 'else', 'for', 'long', 'signed', 'switch', 'void', 'case', 'default', 'enum', 'goto', 'register', 'sizeof', 'typedef', 'volatile', 'char', 'do', 'extern', 'if', 'return', 'static', 'union', 'while', 'asm', 'dynamic_cast', 'namespace', 'reinterpret_cast', 'try', 'bool', 'explicit', 'new', 'static_cast', 'typeid', 'catch', 'false', 'operator', 'template', 'typename', 'class', 'friend', 'private', 'this', 'using', 'const_cast', 'inline', 'public', 'throw', 'virtual', 'delete', 'mutable', 'protected', 'true', 'wchar_t']

#list of separators
separators=['(', ')', '[', ']', '{', '}', ',', ';', ':']

#rulebase
expressions={   '+':r'\+[+=]|\+',
 '-':r'-[-=>]|-',
 '!':r'!=|!',
 '*':r'\*=|\*',
 '/':r'/=|/',
 '%':r'%=|%',
 '<':r'<<=|<[<=]|<',
 '>':r'>>=|>[>=]|>',
 '&':r'&[&=]|&',
 '^':r'\^=|\^',  # escaped: a bare '^' is the start-of-string anchor and would never match here
 '|':r'\|[|=]|\|',
 '=':'==|=',
 'i':r'[A-Za-z_]+\w*',
 'n':r'[0-9]+\.[0-9]+|[0-9]+',
 '.':r'\.[0-9]+|\.'}

Problems with this program:

  1. It is ugly.
  2. Tokens like “9xyz” are not rejected: instead of raising an error, the lexer silently splits them into a numeral and an identifier. A change in the regular expressions is required.
  3. Apart from the regular expression rulebase, not much of the code is customizable. E.g., the classification of identifiers is rigid and the output-file formatting is hard-coded; in fact, even the rulebase is forced to work around the built-in functions checkLine() and matchNow().
  4. Invalid tokens should raise an exception; that is not implemented here.
  5. Possibly, the entire lexer should be packaged in a class that is iterable over itself, so an instance could be consumed with <instancename>.next(); a rough sketch of that shape follows this list.
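To make point 5 concrete, here is a rough sketch of that shape, reusing the deftok definitions above (Python 2 style, hence next() rather than __next__(); this is a sketch, not the promised rewrite):

import re
from deftok import keywords,expressions,separators

class Lexer(object):
    """
    iterator-style lexer over a single source string (Python 2 idiom)
    """
    def __init__(self,text):
        self.text=text
        self.pos=0

    def __iter__(self):
        return self

    def next(self):   # would be __next__() in Python 3
        ws=re.compile(r'\s+').match(self.text,self.pos)
        if ws:
            self.pos=ws.end()
        if self.pos>=len(self.text):
            raise StopIteration
        curr=self.text[self.pos]
        if curr in ['@','$']:   # checkLine() silently skips these
            self.pos=self.pos+1
            return self.next()
        # pick the rulebase key the same way checkLine() does
        if curr.isalpha() or curr=='_':
            id='i'
        elif curr.isdigit():
            id='n'
        else:
            id=curr
        if id in expressions:
            found=re.compile(expressions[id]).match(self.text,self.pos)
            tok=self.text[self.pos:found.end()]
            self.pos=found.end()
        else:
            tok=curr
            self.pos=self.pos+1
        # classify exactly as matchNow() does
        if tok in keywords:
            b='KEYWORD'
        elif id=='i':
            b='IDENTIFIER'
        elif tok in separators:
            b='SEPARATOR'
        elif id=='n':
            b='NUMERAL'
        else:
            b='OPERATOR'
        return (tok,b)

#usage:
#   for tok,b in Lexer(open('test.cpp').read()):
#       print tok,'\t',b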

There might be other issues as well. Once I am done with my exams, let’s see if I can get a better version of this rudimentary lexer up and running.

Namaste.



