The Next Big Thing: Personalised Data-Mining
May 22nd, 2008
Recently, I’ve been creating a number of web crawlers that gather various things together and do nice things with the information collected, and I had one of those “aha” moments…
The “Next Big Thing” on the internet will be Personalized Data-Mining. “What’s that?” you ask. “It’s really very simple…” I tease … “Go on then” you prompt… “OK, calm down, calm down… here we go…” I witter…
Personalized Data-Mining is simply doing what I’ve been doing: creating web crawlers that get information and do nice things with it. You see, the trouble with Google is, you can’t ask it difficult questions, like “Get me every TV that is less than ten inches wide”… Now the silly part is that all the information you need is probably sitting on Amazon and John Lewis and Google Base, but frustratingly, you can’t quite get at it.
So, to pin it down, Personalized Data-Mining is about empowerment. The data is there but you can’t get at it; all Personalized Data-Mining does is turn it from something-you-can-read into something-you-can-use.
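To make the read-versus-use distinction a bit more concrete, here is a minimal sketch in Python of the TV question from above. Everything in it is an assumption made for illustration: the sample page, its markup, the "name, width N in" wording and the helper names are all invented, and a real shop page would need its own extraction rules.

```python
import re
from html.parser import HTMLParser

# Hypothetical listing page; the markup and wording are invented for this sketch.
SAMPLE_PAGE = """
<ul>
  <li class="product">Pocket TV, width 9.5 in</li>
  <li class="product">Living Room TV, width 32 in</li>
  <li class="product">Kitchen TV, width 7 in</li>
</ul>
"""

# Assumed wording of each listing: "<name>, width <number> in"
LISTING = re.compile(r"^(?P<name>.+?), width (?P<width>\d+(?:\.\d+)?) in$")


class ListingParser(HTMLParser):
    """Collects the text of every <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._buffer = None  # holds text while we are inside a product item

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("class") == "product":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._buffer is not None:
            self.listings.append("".join(self._buffer).strip())
            self._buffer = None


def extract_products(html):
    """Turn something-you-can-read (HTML) into something-you-can-use (records)."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    products = []
    for text in parser.listings:
        match = LISTING.match(text)
        if match:  # skip listings that do not fit the assumed wording
            products.append(
                {"name": match.group("name"), "width_in": float(match.group("width"))}
            )
    return products


def narrower_than(products, max_width_in):
    """Answer the question a search engine can't: every TV under a given width."""
    return [p["name"] for p in products if p["width_in"] < max_width_in]


products = extract_products(SAMPLE_PAGE)
small_tvs = narrower_than(products, 10)
print(small_tvs)
```

Once the listings are plain records, the "difficult question" is a one-line filter, and the same records could just as easily be sorted, joined with another site's data or fed into something else entirely.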
And here’s the important point: Personalized Data-Mining is NOT SEMANTIC WEB… it’s the interim (and necessary) stage that needs to happen before the semantic web can happen. You see, the problems that dog the semantic web are many (and blogged about before) but the biggest problems are…
- Not everyone will semanticize their information
- Those that do semanticize will do it badly (trust me), meaning you still can’t use it the way you want to anyway
The solution to these problems is Personalized Data-Mining, which has the benefits of…
- Being distributed
- Being small.
- Being creatively in the hands of the individual, rather than people who decide what a “shoe” or “garden plant” is made up of. Being personalized means real people get to bend it, shape it and invent new applications.
- Being able to “get at” ANY online information, whether it’s in html, images or even PDF.
Currently there are a number of sites and services that almost get there, that automate data-extraction, that help make mashups, but for me… none quite do the thing required, which is to pull the data into a context where you can manipulate your “world”… they quickly turn “data” back into “information”. I don’t want a data-mining report… I want, well, data and ways of dealing with that data.
“What the hell are you on about?” you rightly ask, sighing…. “Well… ” I dream on…
Imagine an internet where all the sites out there are in fact all working for you. You can have them take this site and mix it with that database, and they’d do it willingly. You can request features and have them implemented by this afternoon. You can ask questions and get them answered, no matter how crazy. You can create new businesses based on other businesses and not be sued whilst doing so.
Or to put it in simpler terms… you can do stuff. At times the internet seems to be all about having stuff done for you. And if you want something done well…

