Weblogs: Intelligent Agents
This is a long-term interest of mine that leverages the information found on the World Wide Web with end-user customisable tools that scour the web looking for information.
Intelligent Agents links
- From Lists to Social Content Engines
- Minibrowser Thoughts
- Understanding Java Server Faces
- Java Sketchbook: The HTML Renderer Shootout, Part 1
- Hotmail using C# – A HTTPMail client under .NET
- The HTTPMail protocol
- Using CSV Files as Databases and Interacting with Them Using Java
- Python development with Eclipse and Ant
- JRex - The Java Browser Component
- Dare to script tree-based XML with Perl
- SashXB lends mini-RAD to Linux
- Include GUIs in your server programming with Perl/Tk
- PHP, XML, and Character Encodings: a tale of sadness, rage, and (data-)loss
- Writing Firefox/Thunderbird Extensions
- Choosing a Templating System
Friday, May 23, 2003
I've run into a solid brick wall in regard to proxying requests. I have a working multithreaded server where each thread processes requests sequentially. The logic is neat and tidy, except for some reason Internet Explorer (the browser I've configured to use the proxy server) is very sluggish at sending requests - or should I say there seems to be a most definite pause before my server receives the next request from MSIE.
More than anything, it's probably my lack of experience with multithreading, thread control and socket programming that's largely to blame for my inability to solve this frustrating problem. There could be a number of factors that are causing this slowness:
- Spinning off a new thread per incoming connection takes a bit of time - which is noticeable given that the listening service is one of many services.
- Maybe the listening thread isn't given enough time to actually spin off the required request threads - this is where knowledge of threads would help.
- Possibly VisualAge for Java 4.0 takes such a huge swag of resources that it's slowing my dev-box down to a crawl - slowing down both IE and my proxy.
- I just have an extremely bad implementation.
So how do I resolve this? I can probably eliminate VAJ as the cause by running my application outside of it. Although I am logging to stdout at regular events, it's possible that the output is delayed, so it only looks like the problem is in the connection listener. So it would be best to include a timestamp in each log entry; from that I can determine whether the delay is where it seems to be.
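A minimal sketch of the timestamped logging idea, assuming a current JDK (java.time is much newer than the Java of this entry). The class name TimestampLog and the line layout are made up for illustration:

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: prefixes each log line with a millisecond timestamp
// and the name of the thread that wrote it.
class TimestampLog {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm:ss.SSS");

    // Builds the line separately from printing it, so the format is easy to check.
    static String format(LocalTime when, String thread, String message) {
        return FMT.format(when) + " [" + thread + "] " + message;
    }

    static void log(String message) {
        System.out.println(format(LocalTime.now(), Thread.currentThread().getName(), message));
        System.out.flush(); // push the line out straight away
    }

    public static void main(String[] args) {
        log("listener: waiting for connection");
        log("listener: connection accepted");
    }
}
```

Comparing the timestamps on either side of the accept call should show whether the pause really sits in the listener or only in the console output.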
As for the threading, I'm going to have to find better resources. I've winged through threading without actually understanding it, and without an idea of how time-slices are managed. I'm guessing (hoping) that time is allocated evenly between threads by default, but this may not be the case.
One practical solution is to use someone else's proxy server, like Surfboard. I had a brief glance at it, and although it is written completely in Java, it uses a shell script for start-up, which means it may not be all that trivial to run under Windows, especially when it comes to the location of the default configuration file.
Or maybe Java is just too slow for what I need or expect.
Tuesday, April 22, 2003
Each topic would be a node (or a directory), preferably named something short that's memorisable. At certain stages there will probably be duplication of topic names - so a topic would exist with this name, which then lists all the possible topics that match. For example the topic "table" shows entries for "Furniture tables" and "HTML tables" - each their own independent topics. But we would need to clarify which "table" we wanted. Naturally, specifying a context would be appropriate. A context could be supplied by associating the topic name with another that relates. So "table" is matched with "HTML" for the markup variety. This association shouldn't form part of a name, but depends on the context. It should, however, be required that all topics can be uniquely named, so a representation such as "html.table" or "html/table" should be enough to distinguish it from the kitchen variety.
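The short-name versus unique-name idea could be sketched roughly as follows. TopicIndex and its methods are hypothetical names, and the "context/name" form is just one of the two spellings suggested above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index from short topic names to uniquely qualified ones.
class TopicIndex {
    private final Map<String, List<String>> byShortName = new TreeMap<>();

    // Registers a qualified name such as "html/table" under its last segment.
    void add(String qualifiedName) {
        String shortName = qualifiedName.substring(qualifiedName.lastIndexOf('/') + 1);
        byShortName.computeIfAbsent(shortName, k -> new ArrayList<>()).add(qualifiedName);
    }

    // Every topic sharing the short name; more than one means the name is ambiguous.
    List<String> lookup(String shortName) {
        return byShortName.getOrDefault(shortName, Collections.<String>emptyList());
    }

    // Narrows the candidates to the one filed under the given context.
    List<String> lookup(String shortName, String context) {
        List<String> matches = new ArrayList<>();
        for (String candidate : lookup(shortName)) {
            if (candidate.equals(context + "/" + shortName)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TopicIndex index = new TopicIndex();
        index.add("html/table");
        index.add("furniture/table");
        System.out.println(index.lookup("table"));         // both candidates
        System.out.println(index.lookup("table", "html")); // the markup variety only
    }
}
```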
What sort of modifications and updates do we need to do:
- add a topic
- modify a topic
- delete a topic
- relate a topic
- unrelate a topic
- display a topic
- add a note to a topic
- add an article to a topic
- comment on a topic
- debate/discuss a topic
- lock a topic (make it private and non-publishable)
- unlock a topic
- assign a topic to a group/user
- publish a topic
- classify a topic
Tuesday, April 22, 2003
Knowledge Representation is the age-old AI problem of mapping knowledge onto data structures. Interactive Fiction is basically a text-only adventure where you type in each instruction (or multiple instructions) that a character must perform. Every now and then I get this idea of combining a text adventure (well its interface) with a form of knowledge representation. I don't know why, and I can't place why I feel that this makes sense.
So it's really about interacting with a knowledge base as if it were a text adventure. Locations become topics, for example Bruce Willis. To get to the location about Bruce Willis, teleporting there directly is a little difficult. You'd have to start out with something broad, like "It's an actor", and gradually build up more detail, "The guy from Die Hard and Armageddon". Once you get there - to the location that is Bruce Willis - you'd be able to explore the nearby locations - his career, details about his life, his marriage, related websites, interesting information.
Thankfully the more I think about this relationship between KR and IF, the more reasonable it is to see IF as nothing more than a shell in this instance - a command line. Ahh, I've been here before - when implementing a shell "API" for my proxy. Good! I was starting to worry about my mental state :-)
Monday, March 24, 2003
I had a second go at creating an HTTP proxy compliant with the HTTP/1.1 specification this weekend. My first solution was a simple single-threaded "read request, do request, send results back, loop" structure, and it seemed to me that Internet Explorer wasn't reusing the Keep-Alive connection. It was also dire in its handling of an HTTP request, not knowing when an HTTP request was complete (from the browser's side).
The second solution was to separate the processing into two bits. One thread just read in the HTTP request from the browser and built a simple Request Object that knew when it had all the details available. When the full HTTP request was received, it dumped the request object into a Vector. A separate thread then went through the Vector, treating it as a queue, firing off each request, waiting for the response and sending that back to the browser. All seemed well on paper, but there's a nasty flaw - the second thread keeps polling the Vector with "any more?" requests, and for some odd reason it takes up to four seconds for an entry in the Vector to actually be seen by the thread. Hmmm. This is not good - although otherwise it worked quite nicely. It also demonstrated how Internet Explorer uses multiple connections. A load of msn.com's front page used up to eight connections, each with anything from two to eight requests through it. The surprising thing was that Internet Explorer was requesting resources from different domain names through the same connection - so much for the Keep-Alive. Maybe that means that the Keep-Alive is purely supposed to be between the browser and the proxy (of course, sending the HTTP header Proxy-Connection: Keep-Alive is probably an obvious indicator of that!).
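The cause of the four-second lag isn't clear from the description, but one way to sidestep polling altogether is a blocking queue, where the worker sleeps until a request is handed over. This is a sketch under that assumption, not the original design: RequestQueueSketch and its Request stand-in are hypothetical, and java.util.concurrent arrived in Java after this entry was written.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class RequestQueueSketch {
    // Stand-in for the parsed "Request Object"; it only holds the request line.
    static final class Request {
        final String requestLine;
        Request(String requestLine) { this.requestLine = requestLine; }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();

    // Called by the reader thread once a complete request has arrived.
    void submit(Request request) {
        queue.add(request); // the queue is unbounded, so add() never fails for lack of space
    }

    // Called by the worker thread; blocks (no polling) until a request is present.
    Request next() throws InterruptedException {
        return queue.take();
    }

    public static void main(String[] args) throws Exception {
        RequestQueueSketch q = new RequestQueueSketch();
        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < 2; i++) {
                    System.out.println("handling " + q.next().requestLine);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        q.submit(new Request("GET http://example.com/ HTTP/1.1"));
        q.submit(new Request("GET http://example.com/logo.gif HTTP/1.1"));
        worker.join();
    }
}
```

Requests come out in the order they went in, so the queue behaviour of the Vector is kept while the "any more?" loop disappears.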
Java Examples in a Nutshell gives an example of a simple proxy, using two anonymous threads, but it's not very bright - it doesn't look at what's coming through, so its ability to filter incoming content is severely restricted. Still, I guess I'll have to use that as a starting point and build up some HTTP intelligence. The solution is quite neat: one thread takes everything from the browser and passes it to the server, the other handles traffic from the server back to the browser. There's no room there for the problem I had with the Vector above.
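The two-thread relay can be sketched like this. It is not the book's code, just the same shape under the assumption that each direction is a plain byte copy; StreamRelay is a hypothetical name:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// One relay per direction: browser to server, and server to browser.
class StreamRelay implements Runnable {
    private final InputStream from;
    private final OutputStream to;

    StreamRelay(InputStream from, OutputStream to) {
        this.from = from;
        this.to = to;
    }

    // Copies until end of stream, flushing after each chunk so nothing sits in a buffer.
    static long copy(InputStream from, OutputStream to) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = from.read(buf)) != -1) {
            to.write(buf, 0, n);
            to.flush();
            total += n;
        }
        return total;
    }

    @Override
    public void run() {
        try {
            copy(from, to);
        } catch (IOException e) {
            // The connection was dropped; this direction simply ends.
        }
    }
}
```

With two connected sockets, here called browser and server, starting new Thread(new StreamRelay(browser.getInputStream(), server.getOutputStream())) and a second relay for the opposite direction gives the blind proxy. Any HTTP intelligence would have to sit inside the copy loop.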
The truly awkward bit is how to determine when to close the connection, since Internet Explorer doesn't explicitly ask for it to happen. A simple solution would be a time-out that starts after the last byte has been sent, and stops when a new request is received on a live connection. This gets a bit messy.
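The time-out bookkeeping might look something like the sketch below. IdleTimer is a hypothetical class, and the clock is passed in as a plain millisecond value so the logic can be followed (and tested) without real waiting:

```java
// Tracks whether a kept-alive connection has been idle for too long.
class IdleTimer {
    private final long timeoutMillis;
    private long lastActivity;
    private boolean requestInProgress;

    IdleTimer(long timeoutMillis, long now) {
        this.timeoutMillis = timeoutMillis;
        this.lastActivity = now;
    }

    // A new request arrived on the live connection: stop the countdown.
    void requestStarted() {
        requestInProgress = true;
    }

    // The last byte of the response was sent: restart the countdown from now.
    void responseFinished(long now) {
        requestInProgress = false;
        lastActivity = now;
    }

    // True once the connection has sat idle for the whole time-out period.
    boolean shouldClose(long now) {
        return !requestInProgress && now - lastActivity >= timeoutMillis;
    }
}
```

In a real server the values would come from System.currentTimeMillis(). A simpler route may be Socket.setSoTimeout, which makes a blocked read throw SocketTimeoutException after the given number of milliseconds.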
Thursday, March 13, 2003
I'm reading the W3C WAI Interest Group mailing list as part of my learning about accessibility. One topic which is closely related to my Intelligent Agent project is accessibility proxies that take inaccessible pages and try to make them accessible. The conversation started with an Open source project idea of "Edapta", so naturally Nick Kew's mod_accessibility got a mention. Charles McCathieNeville mentioned IBM's WBI.
WBI brands itself as a "programmable HTTP proxy" which implements a form of browser memory. So it watches the user as they browse, offering them a toolbar of options. This is exactly what I want my intelligent agent proxy to do. Similarly, WBI is currently implemented in Java, even offering a WBI Development Kit for Java which allows the user to build their own plugins. They've now coined the phrase "WebSphere Transcoding Publisher". There's a good intro to WBI in the article titled "The (unofficial) WBI Story".
Where my intelligent agent presumably differs is my requirement that the process works on a handheld computer like a Sharp Zaurus - though whether that will happen I can't say at the moment.
Jorn Barger is at it again, publishing his idea called InfoRaptor, which involves keeping a word index of all pages browsed, allied with master topic pages to classify the pages. The problem with Barger's solution - partly to do with Barger himself - is the complete lack of semantic structure. Jorn Barger doesn't believe in structured content on the web - it's an attitude typical amongst individuals involved in Artificial Intelligence that because structured markup can't do absolutely everything, it cannot be used in any form - so he loses out on the simple benefits of structure. Of course, I look forward with great interest to evaluating his idea when it's implemented.
Thursday, February 27, 2003
I started getting to grips with the web-based service of my app last night, since the shell service won't allow editing (yet or never is undecided at this stage). Creating a web service isn't that tricky; it's simply a matter of reading and parsing the HTTP request, then sending back an HTTP status line and Content-Type header followed by the rest of the content.
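As a rough illustration of how little is needed, here is a sketch that parses a request line and builds a complete response. TinyHttp and its method names are invented for the example, and the Content-Length and Connection headers are additions that a bare-bones server could get away without:

```java
import java.nio.charset.StandardCharsets;

class TinyHttp {
    // Splits "GET /path HTTP/1.1" into method, target and version.
    static String[] parseRequestLine(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("bad request line: " + line);
        }
        return parts;
    }

    // Builds status line, headers, a blank line and then the body.
    static byte[] response(int status, String reason, String contentType, String body) {
        byte[] payload = body.getBytes(StandardCharsets.UTF_8);
        String head = "HTTP/1.1 " + status + " " + reason + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + payload.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[headBytes.length + payload.length];
        System.arraycopy(headBytes, 0, out, 0, headBytes.length);
        System.arraycopy(payload, 0, out, headBytes.length, payload.length);
        return out;
    }
}
```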
The implementation of this webservice could be interesting. There are a few ways I could structure this. One way is to have a web service per knowledge base, all running off different ports, plus an admin one. The other is running it off one port, and just using URLs to define which knowledge base we want to work on. My initial impression was for the former, but on reflection the latter is more flexible, especially with a URL scheme that enforces the separation between the knowledgebases:
http://localhost:8056/$driveIdentifier.$kbIdentifier/$topicName/$articleName
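Pulling the four parts back out of the path of such a URL could be sketched as below. KbAddress is a hypothetical class, and the sketch assumes that none of the identifiers contain a slash and that the drive identifier contains no dot:

```java
// Holds the parts of /$driveIdentifier.$kbIdentifier/$topicName/$articleName.
class KbAddress {
    final String drive;
    final String knowledgeBase;
    final String topic;
    final String article;

    KbAddress(String drive, String knowledgeBase, String topic, String article) {
        this.drive = drive;
        this.knowledgeBase = knowledgeBase;
        this.topic = topic;
        this.article = article;
    }

    static KbAddress fromPath(String path) {
        // A leading "/" yields an empty first segment, hence four segments in total.
        String[] seg = path.split("/");
        if (seg.length != 4 || !seg[0].isEmpty()) {
            throw new IllegalArgumentException("expected /drive.kb/topic/article: " + path);
        }
        int dot = seg[1].indexOf('.');
        if (dot <= 0 || dot == seg[1].length() - 1) {
            throw new IllegalArgumentException("expected drive.kb: " + seg[1]);
        }
        return new KbAddress(seg[1].substring(0, dot), seg[1].substring(dot + 1), seg[2], seg[3]);
    }
}
```

For example, the made-up path /d.help/shell/Commands would give drive "d", knowledge base "help", topic "shell" and article "Commands".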
Since I'd need to create a "request object", it might be reasonable to use the standard HttpRequest object. Taking that idea one step further, how about the webservice class is generic and basically a servlet engine, so all the functionality is done in the HttpServlet. This would mean I could then take any generic servlet and get that running in my framework. Can following a standardised format be useful here?
I don't want my webservice doing lots of out.println() like a typical servlet. I'd much prefer a template-based approach where editable fields are labelled with identifiers. So the servlet just needs to populate the identifier Hashtable. That way the servlet can focus on the logic and not worry about the presentation and overall structural aspects. With respect to the standardised servlet above, trying to make a template part of the actual framework would negate the benefits of extending the standardised servlet and HttpRequest structures.
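A small sketch of the identifier-and-Hashtable idea. The ${identifier} placeholder syntax and the TemplateFill class are assumptions made for the example, not a format the entry specifies:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateFill {
    // Hypothetical placeholder syntax: ${identifier}
    private static final Pattern FIELD = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces each labelled field with its value from the map (a Hashtable works too).
    static String fill(String template, Map<String, String> values) {
        Matcher m = FIELD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown identifiers are left in place so missing data is easy to spot.
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The servlet would only put values into the table and hand it over. Note that the sketch does no HTML escaping, which a real page would need.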
Thursday, February 27, 2003
Got the raw-wiki parser working last night, so extracting the actual page text is a doddle. The neat thing about the wiki data structure is that it seems to be extensible without introducing incompatibilities. This featurette may be useful when I need to store article titles in a separate field. As wikians know, the title of a wiki page is the WikiWord, but experience shows that WikiWords are a tradeoff between readability and URL-rememberability, so having a separate title field will go some way to alleviating this. Although reading a wiki file is easyish, I haven't gotten around to writing out data in a wiki format.
Getting the wiki-parsing functions to work was good, but it has put a bit of a dampener on editability using a telnet shell. I don't know if telnet can do screen wide applications like a text editor, or whether it really is a line-by-line only protocol. I guess some googling is needed. I wonder how an ssh telnet session works, since it shows a bash shell, what would happen if I then invoked a screen-based application from it -- can telnet actually handle it?
Monday, February 24, 2003
Spent a bit of the weekend implementing a virtual file system for my app. It's just a thin layer above the normal file system, but with a few useful additions. Since portability is key to this application I've organised the file structure to reflect it. At the moment all my knowledge base files are under D:/pim/ so the typical default layout is:
So this is effectively one drive with two knowledge bases (default and help). All the dbf files per knowledgebase are stored in the $knowledgebase/data directory, while the $knowledgebase/file contains directories for each topic and the meat of the content. This feels right at the moment, allowing a user to just copy out the knowledgebase directory over to another machine when required.
After the first implementation and a bit of thought it quickly became obvious that the ability to map multiple drives would be useful - especially for replication. What this also allows is knowledgebases used as raw Usenet repositories, or HTML caches to be separate from the core knowledge, so a user could have a large harddrive to store these and leave the Microdrive completely free to hold cleaned or polished information.
At the moment, a canonical way of addressing knowledge bases is by $driveIdentifier.$knowledgebase:, so any article is uniquely referenced by $driveIdentifier.$knowledgebase:$topicName/$articleName . What I thought would be useful is to have a seamless integration between wiki data and the application. So a good starting point would be using raw wiki files as the content storage. These files are essentially text-delimited Perl hashtables, but about three levels deep. The text delimiter, however, is two characters long, so StringTokenizer objected to that and went a little silly. Looks like I'll have to write my own parser.
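StringTokenizer treats every character of its delimiter string as a separate delimiter, which would explain the silliness with a two-character separator. A hand-rolled split on the whole delimiter string is short. The sketch below uses a hypothetical DelimiterSplit class and takes the delimiter as a parameter, since the actual characters in the wiki files aren't given here:

```java
import java.util.ArrayList;
import java.util.List;

class DelimiterSplit {
    // Splits on the whole delimiter string, keeping empty fields.
    static List<String> split(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("empty delimiter");
        }
        List<String> fields = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = text.indexOf(delimiter, start)) != -1) {
            fields.add(text.substring(start, hit));
            start = hit + delimiter.length();
        }
        fields.add(text.substring(start)); // whatever follows the last delimiter
        return fields;
    }
}
```

Because the three levels of the hashtable presumably use different separators, the same method could be called once per level on the pieces produced by the level above.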
Wednesday, February 19, 2003
The excellent repercussion of Chandler's discussion forums is that they spark off those little electrons in the brain, and that has the habit of getting me thinking. Thinking is a dangerous thing. Thinking causes itches - the sort of itches that if scratched result in new Operating Systems - the good old fashioned developer's itch.
I took a few simple ideas and bundled them into a little pile of pieces, after adding more and more bits I got a finished jigsaw. Unfortunately I saw the entire jigsaw, and the picture it contained. At that point I realised there is no such product, and of course once you realise there is no spoon... damn, I should have taken the blue pill.
So the product is a knowledge management, content management, semantically structured, topic orientated, ontology compatible tool for capturing information from email, websites and the troll-play-pen called Usenet. It also needs to be able to take advantage of an Internet connection, using Intelligent Agents to monitor updated sites looking for information that I might be interested in (based on my interest profile, naturally). It needs to provide the services of a proxy server (so it can monitor and capture notes I make about websites and newsgroup postings), plus have some sort of scheduling ability so it can fire up Intelligent Agents (to trawl the web on my behalf). The application needs to be self contained and portable, so it can run off a 340MB Microdrive, but it also needs to be networkable, so any machine on a network can update or browse through its knowledge store.
So the design boils down to a client-server application. The client can be either a browser (so requiring a web front end), or a specialised GUI. But there's always a fallback to a "command shell". Now the good thing about it is that since the bulk of the work is done on the server portion, the GUI can be either a cross-platform application, or refined for the particular platform it is running on. So a generic client for Windows and Linux can be complemented with an optimised cut-down version for the Sharp Zaurus - after all, what's the point of knowledge management that's stuck indoors and cannot be consulted on the move (even better than that is using the Opera browser as the GUI!!)? Running the app off the Microdrive allows laptops to be used immediately by just slotting in the PCMCIA card. Since the Microdrive is a Compact Flash type device, it just plugs into the back of the Zaurus.
The server itself needs to be multithreaded, listening on multiple ports. The services need to be configured on the fly; preferably all services should be dynamically loaded at run time. Java excels in dynamic class loading and instantiation, so it's the obvious way to pull this off. Running different services is nothing more than listening on different ports, although the service configuration needs to be file driven, so that each "environment" that the knowledgebase runs on only runs the services it requires. The Zaurus for instance probably won't need the Intelligent Agent framework active, or the proxy server running, since there's no Internet connection, and we wouldn't want the batteries to depart off to the great cell in the sky.
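A sketch of file-driven, dynamically loaded services. The port=className line format, the Service interface and the class names are all assumptions made for illustration; only Class.forName and reflective construction are standard Java:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ServiceLoaderSketch {
    // Hypothetical contract every pluggable service would implement.
    interface Service {
        String name();
    }

    static class HelloService implements Service {
        public String name() { return "hello"; }
    }

    // Parses lines of the assumed form "port=className"; blanks and # comments are skipped.
    static Map<Integer, String> parseConfig(String text) {
        Map<Integer, String> services = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int eq = line.indexOf('=');
            if (eq < 0) throw new IllegalArgumentException("expected port=class: " + line);
            services.put(Integer.valueOf(line.substring(0, eq).trim()), line.substring(eq + 1).trim());
        }
        return services;
    }

    // Loads the class by name at run time and creates an instance of it.
    static Service load(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        return cls.asSubclass(Service.class).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        String config = "# services for this environment\n8023=" + HelloService.class.getName();
        for (Map.Entry<Integer, String> e : parseConfig(config).entrySet()) {
            System.out.println("port " + e.getKey() + " -> " + load(e.getValue()).name());
        }
    }
}
```

An environment like the Zaurus would simply ship a shorter config file, leaving the agent and proxy entries out.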
The tricky bit is data storage. A database is a requirement (don't know whether object databases will help here). But installing MySQL or PostgreSQL is out of the question because they can't be installed on a microdrive using one laptop and still work when the microdrive is plugged into another laptop. It has to be a box-independent, highly portable database format. Two solutions come to mind - DBF files (harking back to the dBase days, now called the xBase format), and native Java databases.
I especially like the idea of DBF databases, since you can treat them as databases in Java while they remain truly file based. From a portability sense, that is nirvana. Being a file-based database, though, file access times are a killer. So caching is probably a wise idea, with the drawback of memory usage.
The native Java database approach is tempting and would probably be preferred if the DBF solution fails. InstantDB and TinySQL both look to be good Java-based implementations. The only disadvantage is that the data isn't as portable as the DBF format, so I'll need an export program (or service) to extract the data if I need to use external tools.
I'd like multiple instances of the server running, so the client can connect to the right one, but with different locations replication of data is needed. I'm designing the file structure so that all the data plus files to do with one knowledgebase is self contained under a directory structure. That allows a user to quickly copy off one directory (and all its little children) to another device as the simplest way of cloning a knowledge base. Then somehow I need to keep track of what has changed in the two databases. A timestamped key log file sounds reasonable, then just a simple "if one changed, update the other" routine which allows the user to merge the two changes if there is a conflict.
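The "if one changed, update the other" routine could be sketched as below, assuming each replica keeps a log mapping a record key to its last-modified time and that the time of the previous sync is known. ReplicationPlan and its Action names are hypothetical, and deletions are not handled:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class ReplicationPlan {
    enum Action { COPY_A_TO_B, COPY_B_TO_A, CONFLICT }

    // Compares two timestamped key logs against the time of the last sync.
    static Map<String, Action> plan(Map<String, Long> logA, Map<String, Long> logB, long lastSync) {
        Map<String, Action> actions = new TreeMap<>();
        Set<String> keys = new TreeSet<>(logA.keySet());
        keys.addAll(logB.keySet());
        for (String key : keys) {
            // A key missing from one log counts as unchanged on that side.
            boolean aChanged = logA.getOrDefault(key, 0L) > lastSync;
            boolean bChanged = logB.getOrDefault(key, 0L) > lastSync;
            if (aChanged && bChanged) {
                actions.put(key, Action.CONFLICT); // left for the user to merge
            } else if (aChanged) {
                actions.put(key, Action.COPY_A_TO_B);
            } else if (bChanged) {
                actions.put(key, Action.COPY_B_TO_A);
            }
        }
        return actions;
    }
}
```

Keys untouched since the last sync produce no action at all, so the plan stays short when little has changed.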
Two servers should be able to replicate over a network (need two services to handle that), but for redundancy one server should be able to take two directory structures and do a Notes type replication between them. That will be cool if it works.
Acting as a proxy server seems to be the best and most transparent way to allow an Intelligent Agent framework to watch my surfing habits and build an interest representation. Plus, running a web-server off a different port offers ways of annotating webpages. The server needs to cache the HTML pages (well, maybe all text/html or text/plain pages) so that it doesn't have to retrieve the page again if I decide to annotate it. It should also keep a history of sites that I've visited, and remind me when I request a page I've already seen that hasn't been modified.
Usenet is a contentious one. I need something transparent there. The best I can come up with at the moment is just to have a web based interface that allows me to paste in message-id's that I want to archive, and then annotate them via this web-based interface. A more ambitious idea is for the server to listen on the SMTP port, thus hijacking outgoing emails. Then I could just do a "reply to" which gets "delivered" to my knowledgebase and not the actual poster. This won't affect my normal email usage now since I have all my email stored online, accessed via a web-based email reader anyway.
Wednesday, February 19, 2003
Intelligent agents need to monitor sources of news: any input stream that contains updated news. So of course RSS is the mainstay of newsworthy materials. The challenge will be how to map between the varied RSS feeds out there and my interest profile. I suppose that since my interest profile is created and updated by the knowledgebase server, we could represent that as an XML string. Perhaps it's just a topic map really, with weights and values that measure level of interest. But I probably need a method of tweaking and removing some of the more nefarious bits off it (if you take my meaning, sir).
Different knowledgebases probably need to share information and topics. So inter knowledgebase linking is needed especially between topics.
I haven't given much thought to how we get from the knowledge base to an actual publishable website. Considering the knowledgebase represents the structure of the subject, there lies the site structure. Does the application create a static HTML-based website, or does it merely publish the content to a website framework (such as a wiki or my own web-based knowledge base, the one that will be quikref.com)? The latter would keep things flexible, especially if the content is well structured. I'm close to convinced that a wiki structure is possibly the best way of storing the article content, although I probably need extra wikisms to define sections and give more flexibility to the linking syntax.
I've had this weird idea of actually making my basic shell powerful enough to run scripts, so that simple scripts could be written, stored and run on knowledge bases. What I'm proposing is for me to design an actual scripting language embedded within the server. I wonder if I should keep it simple, but if I want it to be flexible and ultrapowerful then possibly I should embed a language like Jython (an implementation of the Python object orientated scripting language done in pure Java). I would rule out Jacl purely because of my unfamiliarity with the rather simplistic but potentially powerful Tcl scripting language. It doesn't make sense to adopt a non-OO scripting language in a server written in an OO language.
Wednesday, February 19, 2003
Things have started off rather brightly: I've got a multi-service, multi-threaded server up and running. The services are very configurable; basically any class that implements either a ClientService, ServerService or ProxyService interface can be started as a service on any port. So I can define a config file of services to start. The server basically spins off a generic ServiceListener thread per ServerSocket, and each thread then initialises and sits there waiting for a connection. When a connection arrives, the ServiceListener just starts a thread with the defined class to handle that connection, and goes back to listening.
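The listener-per-port shape described here might be sketched as follows. This is not the actual server code: ConnectionHandler and EchoHandler are hypothetical stand-ins for the service interfaces, and for brevity one handler object is shared across connections rather than a new instance being created per connection:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ListenerSketch {
    // Hypothetical contract: a service only ever sees the connection's two streams.
    interface ConnectionHandler {
        void handle(InputStream in, OutputStream out) throws IOException;
    }

    // Sends every received line straight back to the client.
    static class EchoHandler implements ConnectionHandler {
        public void handle(InputStream in, OutputStream out) throws IOException {
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
            String line;
            while ((line = reader.readLine()) != null) {
                out.write((line + "\r\n").getBytes(StandardCharsets.US_ASCII));
                out.flush();
            }
        }
    }

    // One listener per ServerSocket: accept, hand off to a new thread, listen again.
    static class ServiceListener implements Runnable {
        private final ServerSocket serverSocket;
        private final ConnectionHandler handler;

        ServiceListener(ServerSocket serverSocket, ConnectionHandler handler) {
            this.serverSocket = serverSocket;
            this.handler = handler;
        }

        public void run() {
            while (!serverSocket.isClosed()) {
                try {
                    Socket client = serverSocket.accept();
                    new Thread(() -> serve(client)).start();
                } catch (IOException e) {
                    break; // the server socket was closed or failed
                }
            }
        }

        private void serve(Socket client) {
            try (Socket s = client) {
                handler.handle(s.getInputStream(), s.getOutputStream());
            } catch (IOException e) {
                // The client went away; nothing more to do for this connection.
            }
        }
    }
}
```

Starting a service is then new Thread(new ListenerSketch.ServiceListener(new ServerSocket(port), handler)).start(), once per configured port.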
At the moment I have a HelloWorldService that prints Hello World and disconnects, an EchoService that just echoes back what is typed, and a rudimentary shell (a shell is like a DOS window, but much more powerful and flexible. Us Linux geeks just love our shells) which gives command-line access to the server's internals. The shell feels like a simplified file system, with each knowledge base being represented by a "drive", and each topic merely a directory under it. What I need to add is something to handle relationships between topics - there's no hierarchy when it comes to topic maps, so trying to enforce one is folly. It needs to be very flexible, and the relationship between topics needs to be defined at some point (just for clarity). I'm almost tempted to use a topic map to define the structure of the knowledgebase. That would save me using a "database", and I'll just have one XML file per knowledgebase which I can load up on first access then just cache for the duration of the session.
- [26/09/2002] Rewriting the Content Stripper
- [20/09/2002] Comparing Trees
- [16/09/2002] Changing the Demo (a little)
- [16/09/2002] Shining and Polishing
- [16/09/2002] Parsing tag soup
- [12/09/2002] Progress and RDF Ideas
- [08/09/2002] Making Sense out of tag-soup
- [03/09/2002] Intelligent Agents