Table Of Content
DATA
MINING
TECHNIQUES
Arun K Pujari
Universities Press
Universities Press (India) Private Limited
Registered Office
Himayatnagar
Hyderabad 500 029 (A.P.), India
© Universities Press (India) Private Limited 2001
First published 2001
es ines 2008
ISBS-L5 97H RI TS71 ARC
SSeS.) RUPE Od
pared
Hysbrshea 1039
Pete tot se
Graphs Pritts
Hyleasad $00 013
Published by
Universities Press (India) Pvt Limited
Himayatnagar
Hyderabad 500 029 (A.P.), India
To
my mother
Late Sd. Tatas Des
= aver:
fragile and weak, but she was the pillar of strength for all her children
and her husband
Contents at a Glance
Introduction
Data Warehousing
Data Mining
Association Rules
Clustering Techniques
Decision Tree Techniques
Other Techniques
Web Mining
Temporal and Spatial Data Mining
List of Contents
Foreword
Prologue
Preface
Acknowledgement
1 Introduction
1.1 Introduction
1.2 Data Mining as a Subject
1.3 Guide to this Book
2 Data Warehousing
2.1 Introduction
2.2 What is a Data Warehouse?
2.3 Definition
2.4 Multidimensional Data Model
2.5 OLAP Operations
2.6 Warehouse Schema
2.7 Data Warehousing Architecture
2.8 Warehouse Server
2.9 Metadata
2.10 OLAP Engine
2.11 Data Warehouse Backend Process
2.12 Other Features
2.13 Summary
Exercises
Bibliography
3 Data Mining
3.1 Introduction
3.2 What is Data Mining?
3.3 Data Mining: Definitions
3.4 KDD vs. Data Mining
3.5 DBMS vs. DM
3.6 Other Related Areas
3.7 DM Techniques
3.8 Other Mining Problems
3.9 Issues and Challenges in DM
3.10 DM Application Areas
3.11 DM Applications: Case Studies
3.12 Conclusion
Further Reading
Exercises
Bibliography
4 Association Rules
4.1 Introduction
4.2 What is an Association Rule?
4.3 Methods to Discover Association Rules
4.4 A Priori Algorithm
4.5 Partition Algorithm
4.6 Pincer-Search Algorithm
4.7 Dynamic Itemset Counting Algorithm
4.8 FP-tree Growth Algorithm
4.9 Discussion on Different Algorithms
4.10 Incremental Algorithm
4.11 Border Algorithm
4.12 Generalized Association Rule
4.13 Association Rules with Item Constraints
4.14 Summary
Further Reading
Exercises
Bibliography
5 Clustering Techniques
5.1 Introduction
5.2 Clustering Paradigms
5.3 Partitioning Algorithms
5.4 k-Medoid Algorithms
5.5 CLARA
5.6 CLARANS
5.7 Hierarchical Clustering
5.8 DBSCAN
5.9 BIRCH
5.10 CURE
5.11 Categorical Clustering Algorithms
5.12 STIRR
5.13 ROCK
5.14 CACTUS
5.15 Conclusion
Further Reading
Exercises
Bibliography
6 Decision Trees
6.1 Introduction
6.2 What is a Decision Tree?
6.3 Tree Construction Principle
6.4 Best Split
6.5 Splitting Indices
6.6 Splitting Criteria
6.7 Decision Tree Construction Algorithms
6.8 CART
6.13 Decision Tree Construction with Presorting
6.14 RainForest
6.15 Approximate Methods
6.16 CLOUDS
6.17 BOAT
6.18 Pruning Technique
6.19 Integration of Pruning and Construction
6.20 Summary: An Ideal Algorithm
6.21 Other Topics
6.22 Conclusion
Further Reading
Exercises
Bibliography
7 Other Techniques
7.1 Introduction
7.2 What is a Neural Network?
7.3 Learning in NN
7.4 Unsupervised Learning
7.5 Data Mining using NN: A Case Study
7.6 Genetic Algorithm
7.7 Rough Sets
7.8 Support Vector Machines
7.9 Conclusion
Further Reading
Exercises
Bibliography
8 Web Mining
8.1 Introduction
8.2 Web Mining
8.3 Web Content Mining
8.4 Web Structure Mining
8.5 Web Usage Mining
8.6 Text Mining
8.7 Unstructured Text
8.8 Episode Rule Discovery for Texts
8.9 Hierarchy of Categories
8.10 Text Clustering
8.11 Conclusion
Further Reading
Bibliography
Addendum
9 Temporal and Spatial Data Mining
9.1 Introduction
9.2 What is Temporal Data Mining?
9.3 Temporal Association Rules
9.4 Sequence Mining
9.5 The GSP Algorithm
9.6 SPADE
9.7 SPIRIT
9.8 WUM
9.9 Episode Discovery
9.10 Event Prediction Problem
9.11 Time-series Analysis
9.12 Spatial Mining
9.13 Spatial Mining Tasks
9.14 Spatial Clustering
9.15 Spatial Trends
9.16 Conclusion
Further Reading
Exercises
Bibliography
Index
FOREWORD
Vast amounts of operational data are routinely collected and stored away in the archives of
many organizations. To take a simple example, the railway reservation system has been
operational for over a decade and a vast amount of data is generated each day on train
bookings. Much of this data is probably archived for audit purposes. This archived
operational data can be effectively used for tactical and strategic management of the railways.
For instance, by analyzing the reservation data it would be possible to find out the traffic
pattern in various sectors and use it to add or remove bogies in certain trains, to decide on
the mix of various classes of accommodation, etc. This, however, was not feasible until
recently due to limitations in both hardware and software. In the last five years there has
been a tremendous improvement in hardware: 512 MB of main memory, a 40 GB disk and a
CPU clock of 1.6 GHz are now available on a PC. Thus, computer programs which sift
massive amounts of operational data, recognize patterns and provide hints to formulate
hypotheses for tactical and strategic decision-making can now be executed in reasonable
time. This has opened up a fertile area of research to formulate new algorithms for
mining archival data to formulate and test hypotheses. An array of ideas from computer
science, statistics and management science are being applied in this area. This subject is
currently an active research area.
I am extremely happy that Professor Arun K. Pujari has written this book on Data
Mining Techniques. It is a timely addition to the existing computer science literature and
will be a boon to students. Professor Pujari has been an active researcher in this area. He
has also introduced a course on Data Mining at the University of Hyderabad and has been
teaching it. His rich experience in teaching and research is evident in the selection of
topics and their treatment in this book. The book starts with a brief introduction followed
by a detailed chapter on data warehousing. It has chapters which discuss a large number of
diverse algorithms using association rules, clustering techniques, decision tree techniques,
neural networks, genetic algorithms, rough sets, and support vector machines. A novel aspect of this
book is its discussion of the emerging area of web mining of textual data. A large number
of references are cited and the treatment is lucid and up to date. I recommend this book to
students who want to explore this important subject.
V. Rajaraman
Honorary Professor
Supercomputer Education & Research Centre
Indian Institute of Science, Bangalore