Jacek Koronacki, Zbigniew W. Raś, Sławomir T. Wierzchoń, and Janusz Kacprzyk (Eds.)
Advances in Machine Learning I
Studies in Computational Intelligence, Volume 262

Editor-in-Chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: [email protected]
Further volumes of this series can be found on our homepage: springer.com

Vol. 242. Carlos Artemio Coello Coello, Satchidananda Dehuri, and Susmita Ghosh (Eds.)
Swarm Intelligence for Multi-objective Problems in Data Mining, 2009
ISBN 978-3-642-03624-8

Vol. 243. Imre J. Rudas, János Fodor, and Janusz Kacprzyk (Eds.)
Towards Intelligent Engineering and Information Technology, 2009
ISBN 978-3-642-03736-8

Vol. 244. Ngoc Thanh Nguyen, Radosław Piotr Katarzyniak, and Adam Janiak (Eds.)
New Challenges in Computational Collective Intelligence, 2009
ISBN 978-3-642-03957-7

Vol. 245. Oleg Okun and Giorgio Valentini (Eds.)
Applications of Supervised and Unsupervised Ensemble Methods, 2009
ISBN 978-3-642-03998-0

Vol. 246. Thanasis Daradoumis, Santi Caballé, Joan Manuel Marquès, and Fatos Xhafa (Eds.)
Intelligent Collaborative e-Learning Systems and Applications, 2009
ISBN 978-3-642-04000-9

Vol. 247. Monica Bianchini, Marco Maggini, Franco Scarselli, and Lakhmi C. Jain (Eds.)
Innovations in Neural Information Paradigms and Applications, 2009
ISBN 978-3-642-04002-3

Vol. 248. Chee Peng Lim, Lakhmi C. Jain, and Satchidananda Dehuri (Eds.)
Innovations in Swarm Intelligence, 2009
ISBN 978-3-642-04224-9

Vol. 249. Wesam Ashour Barbakh, Ying Wu, and Colin Fyfe
Non-Standard Parameter Adaptation for Exploratory Data Analysis, 2009
ISBN 978-3-642-04004-7

Vol. 250. Raymond Chiong and Sandeep Dhakal (Eds.)
Natural Intelligence for Scheduling, Planning and Packing Problems, 2009
ISBN 978-3-642-04038-2

Vol. 251. Zbigniew W. Ras and William Ribarsky (Eds.)
Advances in Information and Intelligent Systems, 2009
ISBN 978-3-642-04140-2

Vol. 252. Ngoc Thanh Nguyen and Edward Szczerbicki (Eds.)
Intelligent Systems for Knowledge Management, 2009
ISBN 978-3-642-04169-3

Vol. 253. Roger Lee and Naohiro Ishii (Eds.)
Software Engineering Research, Management and Applications 2009, 2009
ISBN 978-3-642-05440-2

Vol. 254. Kyandoghere Kyamakya, Wolfgang A. Halang, Herwig Unger, Jean Chamberlain Chedjou, Nikolai F. Rulkov, and Zhong Li (Eds.)
Recent Advances in Nonlinear Dynamics and Synchronization, 2009
ISBN 978-3-642-04226-3

Vol. 255. Catarina Silva and Bernardete Ribeiro
Inductive Inference for Large Scale Text Classification, 2009
ISBN 978-3-642-04532-5

Vol. 256. Patricia Melin, Janusz Kacprzyk, and Witold Pedrycz (Eds.)
Bio-inspired Hybrid Intelligent Systems for Image Analysis and Pattern Recognition, 2009
ISBN 978-3-642-04515-8

Vol. 257. Oscar Castillo, Witold Pedrycz, and Janusz Kacprzyk (Eds.)
Evolutionary Design of Intelligent Systems in Modeling, Simulation and Control, 2009
ISBN 978-3-642-04513-4

Vol. 258. Leonardo Franco, David A. Elizondo, and José M. Jerez (Eds.)
Constructive Neural Networks, 2009
ISBN 978-3-642-04511-0

Vol. 259. Kasthurirangan Gopalakrishnan, Halil Ceylan, and Nii O. Attoh-Okine (Eds.)
Intelligent and Soft Computing in Infrastructure Systems Engineering, 2009
ISBN 978-3-642-04585-1

Vol. 260. Edward Szczerbicki and Ngoc Thanh Nguyen (Eds.)
Smart Information and Knowledge Management, 2009
ISBN 978-3-642-04583-7

Vol. 261. Nadia Nedjah, Leandro dos Santos Coelho, and Luiza de Macedo de Mourelle (Eds.)
Multi-Objective Swarm Intelligent Systems, 2009
ISBN 978-3-642-05164-7

Vol. 262. Jacek Koronacki, Zbigniew W. Raś, Sławomir T. Wierzchoń, and Janusz Kacprzyk (Eds.)
Advances in Machine Learning I, 2010
ISBN 978-3-642-05176-0
Jacek Koronacki, Zbigniew W. Raś, Sławomir T. Wierzchoń, and Janusz Kacprzyk (Eds.)

Advances in Machine Learning I

Dedicated to the Memory of Professor Ryszard S. Michalski
Jacek Koronacki
Institute of Computer Science
Polish Academy of Sciences
ul. Ordona 21
01-237 Warsaw
Poland
E-mail: [email protected]

Sławomir T. Wierzchoń
Institute of Computer Science
Polish Academy of Sciences
ul. Ordona 21
01-237 Warsaw
Poland
E-mail: [email protected]

Zbigniew W. Raś
Woodward Hall 430C
University of North Carolina
9201 University City Blvd.
Charlotte, N.C. 28223
USA
E-mail: [email protected]@pjwstk.edu.pl

Professor Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: [email protected]
ISBN 978-3-642-05176-0
e-ISBN 978-3-642-05177-7
DOI 10.1007/978-3-642-05177-7
Studies in Computational Intelligence ISSN 1860-949X
Library of Congress Control Number: 2009940321
© 2010 Springer-Verlag Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
Printed on acid-free paper
9 8 7 6 5 4 3 2 1
springer.com
Foreword
Professor Richard S. Michalski passed away on September 20, 2007. Once we learned about his untimely death we immediately realized that we would no longer have with us a truly exceptional scholar and researcher who for several decades had been influencing the work of numerous scientists all over the world, not only in his area of expertise, notably machine learning, but also in the broadly understood areas of data analysis, data mining, knowledge discovery and many others. In fact, his influence was much broader still, owing to his creative vision, integrity, scientific excellence and exceptionally wide intellectual horizons, which extended to history, political science and the arts.
Professor Michalski’s death was a particularly deep loss to the whole Polish scientific community and the Polish Academy of Sciences in particular. After graduation, he began his research career at the Institute of Automatic Control, Polish Academy of Sciences in Warsaw. In 1970 he left his native country and held various prestigious positions at top US universities. His research gained impetus and he soon established himself as a world authority in his areas of interest – notably, he was widely considered a father of machine learning.
His contacts with the Polish scientific community were very close over all the
years; in the last couple of years he was an affiliate scientist at the Institute of Com-
puter Science, Polish Academy of Sciences, Warsaw. This relation culminated some
years ago with his election to the rank of Foreign Member of the Polish Academy of
Sciences, a distinction granted to only a small number of the world’s best scientists, including numerous winners of the Nobel Prize and other prestigious awards.
Professor Michalski was one of those active members of the Polish Academy of Sciences who were always interested in solving whatever problems we had, always ready to help us shape the research policy of the Academy and to discuss with us all the difficult issues that are these days unavoidable in any large and prestigious research organization with so many strong links to science worldwide. He was always ready to offer us his deep understanding and scholarly vision of the future of the human scientific endeavor. As President of the Polish Academy of Sciences, I personally sense an enormous loss at no longer being able to ask for his opinion and advice.
I wish to congratulate the editors of these scholarly volumes, Professors Jacek Koronacki, Zbigniew Raś, Sławomir T. Wierzchoń and Janusz Kacprzyk, for their initiative to pay tribute to the memory of Professor Michalski. Having known him for many years, they realized that the best way to honor his life achievements would be to prepare a collection of high quality papers on topics broadly perceived as Professor Michalski’s main interests, and to present these in memoriam volumes of contributions written by those who were lucky enough to be his friends or, at least, to have met him on various occasions. I am truly impressed that so many prominent authors have accepted the invitation, and I thank all of them most deeply.
I believe the memory of Professor Richard S. Michalski should remain with us forever. Hopefully, these volumes will contribute to reaching this objective in the most appropriate and substantial way.
Professor Michał Kleiber
President
Polish Academy of Sciences
Preface
This is the first volume of a large two-volume editorial project we wish to dedicate to the memory of the late Professor Ryszard S. Michalski, who passed away in 2007. He was one of the fathers of machine learning, an exciting area of modern computer science and information technology that is relevant from both the practical and theoretical points of view. His research career started in the mid-1960s in Poland, at the Institute of Automation, Polish Academy of Sciences in Warsaw. He left for the USA in 1970, and since then had worked at various universities, notably at the University of Illinois at Urbana-Champaign and finally, until his untimely death, at George Mason University. We, the editors, had been lucky to be able to meet and collaborate with Ryszard for years; indeed, some of us knew him when he was still in Poland. After he started working in the USA, he was a frequent visitor to Poland, taking part in many conferences until his death. We had also witnessed with great personal pleasure the honors and awards he received over the years, notably when some years ago he was elected Foreign Member of the Polish Academy of Sciences among some top scientists and scholars from all over the world, including Nobel Prize winners.
Professor Michalski’s research results strongly influenced the development of machine learning, data mining, and related areas. He also inspired many established and younger scholars and scientists all over the world.
We feel very happy that so many top scientists from all over the world agreed to pay a last tribute to Professor Michalski by writing papers in their areas of research. These papers will constitute the most appropriate tribute to Professor Michalski, a devoted scholar and researcher. Moreover, we believe that they will inspire many newcomers and younger researchers in the areas of broadly perceived machine learning, data analysis and data mining.
The papers included in the two volumes, Machine Learning I and Machine Learning II, cover diverse topics and various aspects of the fields involved. For the convenience of potential readers, we now briefly summarize the contents of the particular chapters.
Part I, “Introductory Chapters”, contains, first of all, a more general chapter which presents the most notable research interests and accomplishments of Professor Richard S. Michalski, and his inspiration and impact on the fields of machine learning, data mining, and related areas. The other chapters may be viewed as being directly influenced by Professor Michalski’s ideas, and employ tools and techniques that can be seen as straightforward extensions and generalizations of his works.
• Janusz Wojtusiak and Kenneth A. Kaufman (“Ryszard S. Michalski: The
Vision and Evolution of Machine Learning”) discuss some basic elements and
aspects of the vision and contributions of Ryszard S. Michalski, who pioneered so many areas and methods of machine learning. The authors offer a
brief summary of what they believe are the most important aspects of Professor
Michalski’s research, and present the vision of machine learning that he com-
municated to them personally on multiple occasions. The most important topics
mentioned in this chapter are: natural induction, knowledge mining, AQ learn-
ing, conceptual clustering, VL1 and attributional calculus, constructive induc-
tion, the learnable evolution model, inductive databases, methods of plausible
reasoning, and the inferential theory of learning.
• Marcus A. Maloof (“The AQ Methods for Concept Drift”) deals with concept drift, which occurs when the target concept that a learner must acquire
changes over time. It is present in applications involving user preferences (e.g.,
calendar scheduling) and adversaries (e.g., spam detection). The author first
mentions his earlier works, based on Michalski’s AQ algorithm, and more re-
cent ones based on ensemble methods, as well as some implementations of sev-
eral methods that other researchers have proposed. In the chapter, the author
provides a survey of his results obtained since the mid-1990s using the Stagger
concepts and learning methods for concept drift. The author’s methods based
on the AQ algorithm and the ensemble methods, and the methods of other re-
searchers are also examined. It is shown that, for the Stagger concepts, the dy-
namic weighted majority with an incremental algorithm for producing decision
trees as the base learner and the systems based on the AQ11 algorithm (notably
an AQ11 system with partial instance memory and Widmer and Kubat’s win-
dow adjustment heuristic) achieve the best performance.
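The ensemble approach summarized above can be illustrated in miniature. The following is a hypothetical sketch in the spirit of dynamic weighted majority, not the chapter's actual code: the base learner here is a trivial majority-class predictor standing in for the incremental decision trees mentioned above (so the feature vector `x` is ignored), and the `beta`/`theta` values are illustrative defaults.

```python
# Sketch of a dynamic-weighted-majority-style ensemble for concept drift:
# each expert has a weight that is multiplied by beta on a mistake; a fresh
# expert is added when the whole ensemble errs; decayed experts are pruned.

class MajorityLearner:
    """Predicts the most frequent label it has seen so far (toy base learner)."""
    def __init__(self):
        self.counts = {}

    def learn(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None


class DriftEnsemble:
    def __init__(self, beta=0.5, theta=0.01):
        self.beta = beta      # multiplicative penalty for a mistaken expert
        self.theta = theta    # prune experts whose weight drops below this
        self.experts = [[MajorityLearner(), 1.0]]

    def predict(self, x):
        votes = {}
        for learner, weight in self.experts:
            label = learner.predict(x)
            if label is not None:
                votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get) if votes else None

    def learn(self, x, y):
        ensemble_pred = self.predict(x)
        for entry in self.experts:
            learner, weight = entry
            if learner.predict(x) not in (None, y):
                entry[1] = weight * self.beta   # penalize the mistake
            learner.learn(x, y)
        # Drop experts whose weight has decayed away (keep at least one).
        alive = [e for e in self.experts if e[1] >= self.theta]
        self.experts = alive or self.experts[:1]
        # On an ensemble-level mistake, add a fresh expert for the new concept.
        if ensemble_pred is not None and ensemble_pred != y:
            fresh = MajorityLearner()
            fresh.learn(x, y)
            self.experts.append([fresh, 1.0])
```

On a stream whose label flips partway through, the old expert's weight decays until it is pruned, while the freshly added expert tracks the new concept; this is the mechanism that lets such ensembles recover from drift.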
• Krzysztof J. Cios and Łukasz A. Kurgan (“Machine Learning Algorithms Inspired by the Work of Richard Spencer Michalski”) first define the field of inductive machine learning and then describe Michalski’s basic AQ algorithm. Next, the authors’ two machine learning algorithms are discussed: CLIP4, a hybrid of rule and decision tree algorithms, and DataSqueezer, a rule algorithm. The development of the latter two algorithms was inspired to a large degree by Michalski’s seminal 1969 paper on inductive machine learning, which was one of the first attempts to devise inductive machine learning algorithms that generate rules.
• Janusz Kacprzyk and Grażyna Szkatuła (“Inductive Learning: A Combinatorial Optimization Approach”) propose an improved inductive learning method to derive classification rules that correctly describe (at least) most of the positive examples and do not correctly describe (at least) most of the negative examples. First, a pre-analysis of data is performed to assign higher weights to those values of attributes which occur more often in the positive than in the negative examples. The inductive learning problem is represented as a modification of the set covering problem, which is solved by an integer programming based algorithm using elements of a greedy algorithm or a genetic algorithm for efficiency. The results are very encouraging and are illustrated on a thyroid cancer data set.
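The set-covering view of rule induction mentioned in this summary can be sketched on a toy example. What follows is an illustrative greedy sketch only, not the chapter's integer-programming formulation: the attribute names and data are hypothetical, and each candidate rule is a single (attribute, value) condition that matches no negative example.

```python
# Greedy set cover for rule induction: positive examples are the elements to
# cover; each pure condition (attribute, value) covers the positives it
# matches; greedily pick conditions until all positives are covered.

def greedy_rule_cover(positives, negatives):
    """positives/negatives: lists of dicts mapping attribute -> value.
    Returns (chosen conditions, indices of positives left uncovered)."""
    # Candidate single-condition rules that match no negative example.
    candidates = {(a, v)
                  for ex in positives for a, v in ex.items()
                  if not any(neg.get(a) == v for neg in negatives)}
    uncovered = list(range(len(positives)))
    chosen = []
    while uncovered and candidates:
        # Pick the condition covering the most still-uncovered positives.
        best = max(candidates,
                   key=lambda c: sum(positives[i].get(c[0]) == c[1]
                                     for i in uncovered))
        covered = [i for i in uncovered
                   if positives[i].get(best[0]) == best[1]]
        if not covered:
            break  # remaining positives cannot be covered by pure conditions
        chosen.append(best)
        candidates.discard(best)
        uncovered = [i for i in uncovered if i not in covered]
    return chosen, uncovered
```

The greedy choice is the standard approximation for set covering; the chapter's method additionally weights attribute values by how much more often they occur in positive than negative examples, which this sketch omits.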
Part II, “General Issues”, contains contributions which deal with more general issues related to machine learning, data mining, knowledge discovery, and other topics relevant to the volumes and addressed in the contributions that constitute them.
• Pinar Donmez and Jaime G. Carbonell (“From Active to Proactive Learning
Methods”) consider the situation, which is common in machine learning, that
unlabeled data abounds, but expert-generated labels are scarce. Therefore, the
question is how to select the most informative instances to label, as some la-
beled examples can be much more useful than others in training classifiers to
minimize errors. Active learning is the process of selecting the best examples to
label in order to improve classifier performance, where "best" is typically de-
fined as minimizing the loss function of the classifier or otherwise directly
minimizing its estimated classification error. Alternatives to active learning at-
tempt to use unlabeled data directly in helping to train classifiers, without ob-
taining additional labels. These efforts range from transductive support vector
machines to co-training and other semi-supervised methods, but they cannot in
general rival active learning. Combining ideas from both semi-supervised and
active learning remains a largely unexplored area for future research.
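The core selection step of active learning described above (query the instance the classifier is least certain about) fits in a few lines. This is a hypothetical minimal sketch of pool-based uncertainty sampling; the one-dimensional logistic score is a stand-in for any probabilistic classifier.

```python
# Pool-based active learning by uncertainty sampling: from a pool of
# unlabeled points, query the one whose predicted probability is closest
# to 0.5, i.e., the point on which the current model is least certain.

import math

def predict_proba(w, b, x):
    """Toy 1-D logistic model: P(y = 1 | x) = sigmoid(w*x + b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def most_uncertain(pool, w, b):
    """Return the index of the pool point with prediction closest to 0.5."""
    return min(range(len(pool)),
               key=lambda i: abs(predict_proba(w, b, pool[i]) - 0.5))
```

Minimizing |P(y=1|x) - 0.5| is only one loss-reduction heuristic; the chapter's point is precisely that richer criteria (and proactive extensions) can do better.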
• Nada Lavrač, Johannes Fürnkranz, and Dragan Gamberger (“Explicit Feature Construction and Manipulation for Covering Rule Learning Algorithms”) deal with features as the main rule building blocks of rule learning algorithms. In
deal with features as the main rule building blocks of rule learning algorithms. In
contrast to a common practice in classification rule learning, it is argued in the
chapter that the separation of feature construction and rule construction processes
has a theoretical and practical justification. An explicit usage of features enables
a unifying framework of both propositional and relational rule learning and pro-
cedures for feature construction in both types of domains are presented and ana-
lyzed. It is demonstrated that the presented procedure for constructing a set of
simple features has the property that the resulting feature set enables the construc-
tion of complete and consistent rules whenever possible, and that the set does not
include obviously irrelevant features. It is also shown that feature relevancy may
improve the effectiveness of rule learning. The concept of relevancy in the cover-
age space is illustrated, and it is shown that the transformation from the attribute
to the feature space enables a novel, theoretically justified way of handling un-
known attribute values. The same approach makes it possible that the estimated
imprecision of continuous attributes can be taken into account, resulting in the
construction of features that are robust to attribute imprecision.
• Lisa Torrey, Jude Shavlik, Trevor Walker, and Richard Maclin (“Transfer
Learning via Advice Taking”) describe a transfer method in which a reinforce-
ment learner analyzes its experience in the source task and learns rules to use as advice in the target task. The rules, which are learned via inductive logic
programming, describe the conditions under which an action is successful in
the source task. The advice taking algorithm used in the target task allows a re-
inforcement learner to benefit from rules even if they are imperfect. A mapping, provided by a human, describes the alignment between the source and target tasks, and may also include advice about the differences
between them. Using three tasks in the RoboCup simulated soccer domain, the
authors demonstrate that this transfer method can speed up reinforcement learn-
ing substantially.
Part III, “Classification and Beyond”, deals with many aspects, methods, tools and techniques related to broadly perceived classification, which is a key issue in many areas, notably those related to the topics of the present volume.
• Pavel Brazdil and Rui Leite (“Determining the Best Classification Algorithm with Recourse to Sampling and Metalearning”) discuss how to determine which classification algorithm is best for a given task. A method is
described which relies on relatively fast pairwise comparisons involving two al-
gorithms. This method is based on a previous work and exploits sampling land-
marks, that is, information about learning curves besides classical data character-
istics. One key feature of this method is an iterative procedure for extending the
series of experiments used to gather new information in the form of sampling
landmarks. Metalearning also plays a vital role. The comparisons between vari-
ous pairs of algorithms are repeated and the result is represented in the form of a
partially ordered ranking. Evaluation of the approach is done by comparing the
partial order of algorithms that has been predicted to the partial order representing
the supposedly correct result. The results of the evaluation show that the method
has good performance and could be of help in practical applications.
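The last step of this summary (turning pairwise comparisons into a partially ordered ranking) can be sketched as follows. This is a hypothetical illustration, not the chapter's procedure: an edge a → b records that algorithm a was predicted to beat algorithm b, and the partial order is taken as the transitive closure of those wins; the algorithm names are made up.

```python
# Build a partial order over algorithms from pairwise "a beats b" outcomes
# by computing the transitive closure of the win relation.

def partial_order(wins):
    """wins: iterable of (winner, loser) pairs.
    Returns: dict mapping each algorithm to the set it (transitively) beats."""
    algos = {a for pair in wins for a in pair}
    beats = {a: set() for a in algos}
    for winner, loser in wins:
        beats[winner].add(loser)
    # Iterate until no new dominated algorithm is discovered (closure).
    changed = True
    while changed:
        changed = False
        for a in algos:
            for b in list(beats[a]):
                new = beats[b] - beats[a]
                if new:
                    beats[a] |= new
                    changed = True
    return beats
```

Algorithms that do not dominate one another remain incomparable, which is exactly what distinguishes a partial order from a forced total ranking.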
• Michelangelo Ceci, Annalisa Appice, and Donato Malerba (“Transductive
Learning for Spatial Data Classification”) are concerned with learning classifiers
to classify spatial data. Accordingly, they address such issues as: heterogeneity of
spatial objects, an implicit definition of spatial relationships among objects, spa-
tial autocorrelation and the abundance of unlabelled data which potentially con-
vey a large amount of information. The first three issues are due to the inherent
structure of spatial units of analysis which can be easily accommodated if a
(multi-)relational data mining approach is considered. The fourth issue demands
the adoption of a transductive setting which aims at making predictions for a
given set of unlabelled data. Transduction is also motivated by the contiguity of
the concept of positive autocorrelation which typically affects spatial phenomena,
with the smoothness assumption which characterizes the transductive setting. The
authors investigate a relational approach to spatial classification in a transductive setting. Computational solutions to the main difficulties met in this ap-
proach are presented. In particular, a relational upgrade of the naïve Bayes classi-
fier is proposed as a discriminative model, an iterative algorithm is designed for
the transductive classification of unlabelled data, and a distance measure between
relational descriptions of spatial objects is defined in order to determine the k-
nearest neighbors of each example in the dataset. Computational solutions have
been tested on two real-world spatial datasets. The transformation of spatial data
into a multi-relational representation, and experimental results, are reported and commented upon.
• Krzysztof Dembczyński, Wojciech Kotłowski, and Roman Słowiński (“Beyond Sequential Covering – Boosted Decision Rules”) consider boosting, or
forward stagewise additive modeling, a general induction procedure which ap-
pears to be particularly efficient in binary classification and regression tasks.
When boosting is applied to the induction of decision rules, it can be treated as a generalization of sequential covering because it approximates the solution of
the prediction task by sequentially adding new rules to the ensemble without