Clustering of Biomedical Data Using the Greedy Clustering Algorithm Based on Interval Pattern Concepts
Interval pattern concepts are a particular case of pattern structures. They can be used to cluster the rows of a numerical formal context (data matrix): two rows are close to each other if their entries at corresponding positions fall within a given interval.

The problem of mining interval pattern concepts has much in common with a known problem in computational geometry: given a finite set of points in Euclidean space, position a box of a given size so that it encloses as many points as possible. This problem and its variations have been thoroughly studied in the planar case; however, the authors are not aware of any algorithm that produces an exact solution in reasonable time in a space of arbitrary dimension.

There exists an approximate greedy algorithm for solving this problem. It produces a solution in time that is linear in the number of points and polynomial in the dimension. We apply a clustering approach based on that algorithm to the gene expression table from the dataset “The Cancer Cell Line Encyclopedia”. The resulting partition agrees well with a priori known biological factors.
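The closeness criterion and the idea of greedily filling a box of fixed size can be illustrated with a minimal sketch. This is not the algorithm applied in the paper, whose details are not given here; it is a simple single-pass heuristic written under stated assumptions. The names `fits`, `greedy_box_cluster`, `greedy_partition`, and the parameter `width` (the maximum allowed spread per coordinate) are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence


def fits(lo: Sequence[float], hi: Sequence[float],
         row: Sequence[float], width: float) -> bool:
    """Return True if adding `row` keeps every coordinate range within `width`."""
    return all(max(h, x) - min(l, x) <= width for l, h, x in zip(lo, hi, row))


def greedy_box_cluster(rows: Sequence[Sequence[float]], width: float,
                       seed: int = 0) -> List[int]:
    """Grow a box from rows[seed] in one pass over the data.

    A row is accepted if the bounding box of the cluster plus that row
    still has side length <= width in every dimension (assumed criterion).
    Returns the indices of the accepted rows.
    """
    lo = list(rows[seed])  # per-coordinate minimum of the current cluster
    hi = list(rows[seed])  # per-coordinate maximum of the current cluster
    members = [seed]
    for i, row in enumerate(rows):
        if i == seed:
            continue
        if fits(lo, hi, row, width):
            members.append(i)
            lo = [min(l, x) for l, x in zip(lo, row)]
            hi = [max(h, x) for h, x in zip(hi, row)]
    return members


def greedy_partition(rows: Sequence[Sequence[float]],
                     width: float) -> List[List[int]]:
    """Partition all rows by repeatedly extracting one greedy box cluster."""
    remaining = list(range(len(rows)))
    clusters: List[List[int]] = []
    while remaining:
        sub = [rows[i] for i in remaining]
        picked = greedy_box_cluster(sub, width)
        cluster = [remaining[j] for j in picked]  # map back to original indices
        clusters.append(cluster)
        chosen = set(cluster)
        remaining = [i for i in remaining if i not in chosen]
    return clusters


if __name__ == "__main__":
    # Toy data matrix: rows are objects, columns are numerical attributes.
    data = [[0.0, 0.0], [0.5, 0.4], [5.0, 5.0], [5.2, 4.9], [0.9, 1.0]]
    print(greedy_partition(data, width=1.0))
```

In this sketch, extracting a single cluster takes one pass over the rows with work proportional to the dimension per row. The result depends on the seed and on the order of the rows, so it is only an approximation to the box that encloses the most points.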