Where does data mining succeed, and why?
As previously noted, I have a Computerworld column coming out next week on data mining. The heart of the column is an enumeration of markets where data mining applications are having genuine success. Before I sat down to actually write the column, my list went something like this:
- There’s a large set of “early warning” apps where text mining is being deployed. Many of those same apps are addressed by data mining of tabular data too – antifraud, to start with, and also warranty tracking and indeed most of the rest.
- Data mining has been huge in CRM.
- The use of data mining in manufacturing to do failure analysis, improve quality, etc. is really on the rise. This goes at least somewhat beyond what one could reasonably pigeonhole as “early warning.”
- Data mining plays a big role in the life sciences, and is being applied to a broad range of other sciences as well.
- Data mining is a huge part of R&D at search engine and antispam vendors.
By the time I submitted the column, the list had morphed into:
- Customer offer targeting.
- Other CRM applications, often of text mining, such as reputation management or just sentiment tracking.
- National security, antifraud, and crime prevention.
- Purer portfolio/risk management applications.
- Defect tracking.
- Health care and scientific research.
For lots of examples and explanation of the categories, please see the column when available. (Theoretically that should be on the inauspicious date of September 11. In practice, it could be any time next week. I’ll post a link here when I know of one that works.)
While the latter version of the list may be slicker and more precise, which is why I went with it in the column, I think the former is more useful for a discussion of why those particular apps are the ones that get adopted. Simply put, data mining apps are concentrated at two extremes:
- Seeking “gold nuggets” of insight.
- Continuous process improvement.
What’s more, if I had to pick just one of those categories, I’d pick the second, continuous process improvement. The annals of BI are replete with examples of insights that just leapt out of reports and danced straight to the bottom line. But those stories are generally about reports and OLAP analyses, not full-blown statistical workups. Don’t get me wrong – I’m sure there are plenty of cases of data mining producing hugely valuable sudden insights. But, uh, I can’t think of any right now, at least not in the mainstream statistical analyses we usually think of when we hear “data mining.” (Perhaps some kindly product vendors will help me out with examples. If nothing else, there should be examples in the life sciences, forensics, product quality, etc. – i.e., in applications where there only ever was one single answer to discover in the first place.)
Where data mining does succeed all the time is in areas such as marketing efficiency improvement – mailing smarter, better targeting customer offers, and of course avoiding “bad guy” customers such as fraud or default risks in the first place. Text mining is something of an exception to that rule – but then, despite its name, it’s not clear that all of text mining should be classified as data mining anyway. Some of it is just “knowledge/fact/information extraction”, which generally is used to inform analytic technologies of some sort or other. But those can be regular BI or text search or whatever, with data mining being just one of several technologies that might consume the extracted facts.
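To make “mailing smarter” a little more concrete, here is a minimal Python sketch of the usual way targeting gains get measured: rank customers by a model score, mail only the top slice, and compare that slice’s response rate to what a mass mailing would get (the “lift”). Everything in it is an assumption for illustration only: the `Customer` fields, the helper names, and the ten made-up records. None of it comes from the column or from any particular vendor’s product.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    score: float      # hypothetical model-predicted response probability
    responded: bool   # observed outcome from a past campaign (made-up data)


def response_rate(customers):
    """Fraction of the given customers who responded."""
    if not customers:
        raise ValueError("need at least one customer")
    return sum(c.responded for c in customers) / len(customers)


def top_fraction(customers, fraction):
    """Return the highest-scoring `fraction` of customers (at least one)."""
    ranked = sorted(customers, key=lambda c: c.score, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k]


def lift(customers, fraction):
    """Response rate of the targeted slice divided by the overall rate."""
    baseline = response_rate(customers)
    if baseline == 0:
        raise ValueError("no responders, so lift is undefined")
    return response_rate(top_fraction(customers, fraction)) / baseline


# Ten invented customers; three of them responded to the last mailing.
SAMPLE = [
    Customer("c01", 0.90, True),
    Customer("c02", 0.80, True),
    Customer("c03", 0.70, False),
    Customer("c04", 0.60, True),
    Customer("c05", 0.50, False),
    Customer("c06", 0.40, False),
    Customer("c07", 0.30, False),
    Customer("c08", 0.20, False),
    Customer("c09", 0.10, False),
    Customer("c10", 0.05, False),
]

if __name__ == "__main__":
    for fraction in (0.2, 0.4, 1.0):
        print(f"mail top {fraction:.0%}: lift = {lift(SAMPLE, fraction):.2f}")
```

The point of the sketch is the process-improvement flavor of the exercise: nothing here is a sudden insight, but re-scoring and re-measuring lift campaign after campaign is how the incremental gains accumulate.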
Comments
8 Responses to “Where does data mining succeed, and why?”
[…] In a couple of recent posts about data mining, I referenced a Computerworld column due to run September 11. Wonder of wonders, they got it posted on the very first day. Here’s a link. […]
James Taylor attempted to post this comment but for some reason failed. So I’m doing it for him.
I’m doing this from vacation on Grand Cayman, after a hard day of snorkeling (revelation of the day — “beautiful squid” is NOT an oxymoron), so please forgive my lack of effort to fix word wrap and any other formatting issues. (But at least I fixed the typo in one of the URLs …) CAM
———————————————————-
Curt
Think you are completely correct on this one – the ongoing management and improvement of decisions using data mining is where the money is. One of the challenges this raises is how to “operationalize” the insight that comes from data mining. One way is to mine the data for business rules and operationalize those and another is to mine the data so as to produce executable predictive analytic models.
I have written about a one-time immediate improvement ( http://www.edmblog.com/weblog/2005/10/customer_segmen.html ) but more cases have the kind of ongoing success you discuss. There’s a lot of confusion around data mining, analytics, predictive analytics and so on, so it comes up a lot on my blog at http://www.edmblog.com/weblog/data_mining/index.html.
One last thing – this poll at KDnuggets was fun http://www.edmblog.com/weblog/2005/08/kdnuggets_succe.html.
[…] Finally, there is hardcore data crunching. Data mining fits that bill, but so does heavy SQL-only data exploration (aka “The Query That Ate Pittsburgh”). This is where a small number of expert users extract value from massive data stores. Scheduled reporting can also fit into this category at aggressive enterprises. Here is where the high-end data warehouse vendors – e.g., Teradata, IBM (mainframe DB2), and the data warehouse appliance startups – really shine. At smaller enterprises, other kinds of data stores also suffice. I have a careful list (two versions of the same list, actually) of data mining app categories over on the Monash Report. It’s a good start on a list of apps for this whole category. […]
[…] Nor does this change when the warnings are the product of text or data mining. For example, despite a very interesting approach to generating alerts, at this point in its development Verix delivers them in uninspired ways. […]
[…] Data mining and predictive analytics are mainly information access plays. Yes, the information being accessed is calculated rather than raw. Yes, I believe that the heart of the data mining market is continuous process improvement. Even so, what users buy from the vendors is usually little more than information toolkits. […]
[…] 2006 I rattled off a long list of early-warning uses for text analytics. The same year I discussed application areas for data mining and came up with a list much like the one in this post — lots of early-warning or other […]
[…] that one might think of as having to do with artificial intelligence – e.g. expert systems, predictive analytics and text analytics – have wound up with applications being concentrated in the same few […]
[…] wrote up a different list of analytic use cases back in […]