Nov 26th 2003:
The first data set is ready! Thank you for your understanding and patience.

Nov 21st 2003:
After some misunderstandings in a forum thread, we felt it important to stress that we are researching and developing theory and code for an open source search engine, and that we are not about to provide a public web service like, e.g., Google. All the data we collect now is therefore used solely to evaluate and extend our models. It will not be made public or distributed in any form, just as stated before.

Some clarifications were made to the FAQ to make this position clearer.

Nov 18th 2003:
We have started to collect a test data set to evaluate our statistical models. So if you've met an Irchiver bot recently, you can be sure that THE LOGS WILL BE HANDLED CONFIDENTIALLY. Not a bit of the logs will be made public. The code, however, will be released under an open source license.

If the models seem to work, we might want to collect another data set for a public demo. That, however, would be a completely separate undertaking.

Oct 30th 2003:
After a fruitful discussion, we decided to provide a FAQ with a humble suggestion of how we would like to proceed.

Feel free to visit the #searchengine channel to discuss this with us.

Irchiver is a harmless IRC bot. Its sole purpose is to silently collect data for research purposes. Irchiver doesn't answer any queries or send anything to channels. The data will be used by HIIT's Complex Systems Computation Group in information retrieval and language modeling research. Eventually we will publish the systems and tools used in this research under the GPL (see our search engine project), which will hopefully benefit every IRC user. Please understand our need for real-life data.

Naturally, we respect the privacy and freedom of speech of every IRC user. The data will be processed using various statistical models that look at the big picture, not at individual sayings and actions.

We hope that Irchiver can follow IRC discussions as widely as possible, so please let Irchiver stay on your channel. If you absolutely must get rid of Irchiver, feel free to /KICK it or ban it from your channel. You may also ask us to blacklist your channel by sending mail to the address below. This will ensure that Irchiver won't bother you again.

Please give us feedback or report Irchiver misbehavior by sending mail to

This page was last modified 26.11.2003. We will provide more interesting information & statistics here later on.