Overview of cluster analysis and perceptual mapping and their use in the pharmaceutical industry: 12104 sejal solanki
Cluster analysis is a collection of statistical methods that identify groups of samples that behave similarly or show similar characteristics. In common parlance, these are also called look-alike groups. The simplest mechanism is to partition the samples using measurements that capture similarity or distance between samples. In this sense, "clusters" and "groups" are interchangeable words.
Typically in clustering methods, all the samples within a cluster are considered to belong equally to that cluster (as opposed to belonging with a certain probability). If each observation has its own probability of belonging to a group (cluster), and the application is more interested in these probabilities, then we have to use binomial or multinomial models.
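As a rough illustration of partitioning by distance, the sketch below assigns each sample to the nearest of two centers and then recomputes the centers as the means of their members. This is a k-means style procedure chosen here only as an example; the text does not name a specific algorithm, and the sample values, starting centers, and function names are all assumptions.

```python
import math

def assign_to_nearest(samples, centers):
    """Return, for each sample, the index of the closest center (Euclidean)."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(s, centers[k]))
        for s in samples
    ]

def kmeans(samples, centers, n_iter=10):
    """Alternate between nearest-center assignment and recomputing means."""
    centers = [list(c) for c in centers]
    labels = assign_to_nearest(samples, centers)
    for _ in range(n_iter):
        for k in range(len(centers)):
            members = [s for s, lab in zip(samples, labels) if lab == k]
            if members:  # keep the old center if a cluster ends up empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign_to_nearest(samples, centers)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers

# Hypothetical two-dimensional measurements forming two look-alike groups.
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centers = kmeans(samples, centers=[samples[0], samples[3]])
print(labels)
```

The starting centers are fixed rather than random so that the result is repeatable; in practice the choice of starting points and of the number of groups affects the partition.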
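The difference between equal (hard) membership and probability-based membership can be sketched as follows. The hard version returns a single cluster index. The soft version turns distances into probabilities with a softmax-style weighting, which has the same functional form as a multinomial logit but is not a fitted multinomial model. The centers, the `sharpness` parameter, and the function names are assumptions made for illustration.

```python
import math

def hard_membership(sample, centers):
    """Index of the nearest center: the sample belongs fully to one cluster."""
    dists = [math.dist(sample, c) for c in centers]
    return dists.index(min(dists))

def soft_membership(sample, centers, sharpness=1.0):
    """Probability of belonging to each cluster; closer centers get more weight."""
    dists = [math.dist(sample, c) for c in centers]
    weights = [math.exp(-sharpness * d) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

centers = [(1.0, 1.0), (5.0, 5.0)]  # hypothetical cluster centers
sample = (2.0, 2.0)

print(hard_membership(sample, centers))
print(soft_membership(sample, centers))
```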
Difference between cluster analysis and segmentation techniques:
Segmentation can be interpreted as a collection of methods that identify groups of entities or statistical samples that share common characteristics such as attitudes, purchase propensities, media habits, and lifestyle. The entities are typically consumers or customers, markets, or organizations, for which there is generally no good application definition that would classify them into one of many groups without significant errors. The sample characteristics are used to group the samples. Groups can be arrived at by partitioning the samples either hierarchically or non-hierarchically. Thus, segmentation methods include both probability-based grouping of observations and cluster-based grouping of observations, and they cover hierarchical (tree-based, whether divisive or agglomerative) and non-hierarchical methods. Segmentation is therefore a very general category of methodology that also includes clustering methods.
Clustering algorithms are broadly classified into two types, namely hierarchical and non-hierarchical algorithms.
In the hierarchical approach, there is a concept of ordering. The ordering is driven by how many observations can be combined at a time, or by what determines that the distance between two observations or two clusters is not statistically different from 0. The clusters can be arrived at either by weeding out dissimilar observations (divisive method) or by joining together similar observations (agglomerative method).
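The agglomerative method can be sketched as a loop that repeatedly joins the two closest clusters and records the order of the merges. The sketch below uses single linkage (the gap between two clusters is the distance between their closest members) and a plain numeric `threshold` as a stand-in for the statistical test on distance mentioned above. Both choices, along with the sample points, are assumptions.

```python
import math

def single_linkage(samples, threshold):
    """Merge the two closest clusters until the smallest gap exceeds threshold."""
    clusters = [[i] for i in range(len(samples))]  # start: one cluster per sample
    merges = []  # (members_a, members_b, distance), in the order merges happened

    def gap(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(samples[i], samples[j]) for i in a for j in b)

    while len(clusters) > 1:
        pairs = [
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        d, i, j = min(pairs)
        if d > threshold:  # remaining clusters are too far apart to join
            break
        merges.append((clusters[i], clusters[j], d))
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges

# Hypothetical observations: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
clusters, merges = single_linkage(points, threshold=2.0)
print(sorted(clusters))
```

The `merges` list is the ordering referred to in the text: reading it from first to last reproduces the tree from the bottom up, while a divisive method would build the same kind of tree from the top down.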
Perceptual mapping is a marketing research technique in which consumers' views about a product are traced or plotted (mapped) on a chart. Respondents are asked questions about their experience with the product in terms of its performance, packaging, price, size, etc. These qualitative answers are transferred to a chart (called a perceptual map) using a suitable scale (such as the Likert scale), and the results are employed in improving the product or in developing a new one. The map allows senior marketing planners to take a broad view of the strengths and weaknesses of their product or service offerings relative to those of their competition. It also allows the marketing planner to view the customer and the competitor simultaneously in the same realm.
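One simple way to turn Likert-scale answers into positions on a perceptual map is to average each product's scores on two chosen attributes and use the two averages as chart coordinates. The sketch below assumes this averaging approach; the brands, attributes, and scores are invented for illustration.

```python
from statistics import mean

# Hypothetical survey responses: (product, attribute, Likert score from 1 to 5).
responses = [
    ("Brand A", "price", 4), ("Brand A", "price", 5),
    ("Brand A", "performance", 2), ("Brand A", "performance", 3),
    ("Brand B", "price", 2), ("Brand B", "price", 1),
    ("Brand B", "performance", 5), ("Brand B", "performance", 4),
]

def perceptual_coordinates(responses, x_attr, y_attr):
    """Map each product to (mean score on x_attr, mean score on y_attr)."""
    scores = {}
    for product, attr, score in responses:
        scores.setdefault(product, {}).setdefault(attr, []).append(score)
    # Every product is assumed to have at least one score for both attributes.
    return {p: (mean(a[x_attr]), mean(a[y_attr])) for p, a in scores.items()}

coords = perceptual_coordinates(responses, "price", "performance")
for product, (x, y) in coords.items():
    print(product, x, y)
```

Plotting these coordinates for a company's own product and for competing products on the same axes gives the side-by-side view of customer perception and competition described above.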
Application of Cluster Analyses in Pharmaceutical Lead Discovery
High throughput screening (HTS) programs based on diverse collections of compounds can rapidly identify leads for potential drug candidates. In cases where the compound collection is truly diverse, one may only identify a few compounds of interest. However, where a large number of hits are identified, it becomes necessary to examine the structures to determine the true number of compound classes involved so that follow-up studies may be conducted as efficiently as possible. In this case, cluster analysis is applied to determine the structural relationship among HTS hits. To efficiently expand around the region of the hit (or a class of hits) in chemical space, we have applied nearest neighbors analysis to select additional compounds from collections of a large number of commercial vendors, achieving an average hit rate in excess of 15%. Applying these techniques in a number of different cases, we obtained results that are useful for subsequent investigations of hits from HTS and other relevant molecular structures from the literature.
(David T. Stanton, Timothy W. Morris, Siddhartha Roychoudhury, and Christian N. Parker)
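The idea of expanding around a hit by nearest neighbors analysis can be sketched as ranking vendor compounds by structural similarity to the hit. The passage does not say which similarity measure or descriptors the authors used. The sketch below assumes Tanimoto similarity on sets of fingerprint bits, which is a common choice, and the compound names and bit sets are invented.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def nearest_neighbors(hit, library, k=2, min_similarity=0.0):
    """Return up to k (similarity, name) pairs, most similar to the hit first."""
    scored = [(tanimoto(hit, fp), name) for name, fp in library.items()]
    scored = [s for s in scored if s[0] >= min_similarity]
    scored.sort(key=lambda s: (-s[0], s[1]))  # ties broken by name
    return scored[:k]

# Hypothetical fingerprints: each set holds the indices of structural features present.
hit = {1, 4, 7, 9}
library = {
    "vendor_cpd_1": {1, 4, 7, 8},      # shares 3 of 5 distinct features with the hit
    "vendor_cpd_2": {2, 3, 5},         # shares none
    "vendor_cpd_3": {1, 4, 7, 9, 12},  # shares 4 of 5 distinct features with the hit
}

neighbors = nearest_neighbors(hit, library, k=2)
print(neighbors)
```

The same similarity function could also feed a clustering step to group the HTS hits into compound classes before choosing which hits to expand around.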