Quantum Clustering, kNN-Based Clustering, and Hierarchical Clustering on a Sample Dataset:

Quantum Clustering, kNN-Based Clustering, and Hierarchical Clustering on a Sample Dataset

Dr.Mitat Uysal, Dr.Aynur Uysal

Dogus University Software Eng.Dept.

muysal@dogus.edu.tr

ABSTRACT

This paper explains and demonstrates three clustering approaches on the same dataset: (a) Quantum Clustering (QC), a physics-inspired method that converts a kernel density estimate into a Schrödinger-type potential whose minima define clusters; (b) kNN-based clustering, where a k-nearest-neighbor graph is constructed and clusters emerge from graph connectivity driven by shared-neighbor similarity; and (c) Agglomerative Hierarchical Clustering (AHC), which forms a dendrogram by repeatedly merging the closest clusters under a linkage rule. Each method is presented with its core mathematical formulation and then implemented in Python (NumPy + Matplotlib) producing multiple colorful graphical outputs and a final comparison table using internal clustering metrics (Silhouette and Davies–Bouldin). The experiments show how QC can reveal cluster structure through potential landscapes, how kNN-graphs can robustly cluster nonconvex regions via neighborhood consensus, and how hierarchical clustering provides a multiscale view of structure through merge histories.

KEYWORDS: Quantum clustering, Schrödinger potential, k-nearest neighbor graph, shared nearest neighbors, agglomerative clustering, linkage, silhouette, Davies–Bouldin.

How to Cite: Dr.Mitat Uysal, Dr.Aynur Uysal, (2026) Quantum Clustering, kNN-Based Clustering, and Hierarchical Clustering on a Sample Dataset, European Journal of Clinical Pharmacy, Vol.8, No.1, pp. 2211-2216

INTRODUCTION

Let the dataset be , where each point . In our experiments we use a synthetic 2D dataset (mixture of blobs + a ring-like structure) to stress-test cluster shapes.

We use:

● Euclidean distance:

● Pairwise distance matrix:

Figure-1-Original Dataset

METHOD A: QUANTUM CLUSTERING (QC)

2.1 Kernel “wavefunction”

Horn & Gottlieb’s QC begins by building a Gaussian kernel density-like superposition (“wavefunction”) [1], [2], [3]:

where is the scale parameter controlling smoothness (single most important hyperparameter) [1], [2].

2.2 Schrödinger-type equation and potential

QC interprets as a ground-state solution of a Schrödinger equation [1], [2]:

Rearranging gives the potential:

Here:

● is the Laplacian operator:

● is a constant energy offset (often set so that in practice) [1], [2].

2.3 Closed-form Laplacian of the Gaussian mixture

For each Gaussian term , the Laplacian is [1], [2]:

So:

and then follows.

2.4 Cluster assignment by gradient flow to minima

Once the potential is known, QC clusters are defined by basins of attraction of local minima [1], [2]. A standard assignment rule is gradient descent:

Points that converge to the same (or very close) minimum are assigned to the same cluster. In our implementation we compute numerically on a grid (finite differences) and follow the flow.

Strengths: reveals structure via a potential landscape, often robust to noise at appropriate .

Weaknesses: choosing is crucial; computing on a grid is easiest in low dimensions [1], [2].

Figure-2-Quantum Clustering

METHOD B: KNN-BASED CLUSTERING VIA SHARED NEAREST NEIGHBORS (SNN)

3.1 kNN graph construction

Given distances , define the kNN set of point :

A directed kNN graph connects if . A common robust alternative is a mutual kNN graph: connect if and [4], [5].

3.2 Shared Nearest Neighbor similarity (Jarvis–Patrick idea)

Jarvis & Patrick introduced clustering based on shared neighbors [6], [7]. Define the SNN similarity:

Then build an edge if for a threshold . Clusters can be extracted as connected components of this graph (or via further graph clustering) [6], [7], [8].

3.3 Why this works

Two points in the same dense region tend to share neighbors, while points across sparse gaps share few neighbors. This makes SNN/kNN-graph clustering more robust to varying densities than pure distance thresholding [6], [8].

Strengths: simple, graph-based, can capture nonconvex clusters.

Weaknesses: must choose and ; may fragment if parameters are too strict [4], [5], [8].

Figure-3-kNN Clustering

METHOD C: AGGLOMERATIVE HIERARCHICAL CLUSTERING (AHC)

4.1 Bottom-up merging

AHC starts with each point as its own cluster. At each step, merge the pair of clusters with minimum inter-cluster distance under a linkage rule [9], [10].

4.2 Linkage (average linkage example)

Let cluster contain points. Average linkage defines:

Other classical linkages include single, complete, Ward, etc. [9], [10], [11].

4.3 Lance–Williams update formula

Many linkages can be updated efficiently using the Lance–Williams recurrence [9], [10], [11]:

where depend on the linkage choice [9], [10], [11]. This is the mathematical backbone of many hierarchical clustering implementations.

4.4 Cutting the dendrogram

After merging until one cluster remains, choose a target number of clusters by cutting the merge tree at some level. In our code we stop early when clusters remain.

Strengths: multiscale interpretability; no need to pre-fix shape assumptions.
Weaknesses: greedy merges can be irreversible; runtime can be high for large [9], [10].

Figure-4-Hierarchical Clustering

EVALUATION METRICS FOR COMPARISON (INTERNAL, UNSUPERVISED)

5.1 Silhouette score

For point , let:

● = average distance from to points in its own cluster

● = minimum over other clusters of the average distance from to that cluster
Then [12]:

Higher is better (near 1 is strong separation).

5.2 Davies–Bouldin index (DB)

Let cluster have centroid and scatter:

Define:

Lower is better [13].

PYTHON IMPLEMENTATION

PS: The Python code is too long and space-consuming to be included in the article, but it can be provided upon request.

OUTPUT OF THE PYTHON CODE

C:\Users\Lenovo\PycharmProjects\PythonProject20\.venv\Scripts\python.exe C:\Users\Lenovo\PycharmProjects\PythonProject20\.venv\Scripts\activate_this.py

===== METHOD COMPARISON =====

Quantum Clustering Time: 0.6871688365936279

kNN Clustering Time: 0.47462940216064453

Hierarchical Time: 63.83107781410217

Process finished with exit code 0

What you will get when you run this code:

1. Dataset visualization

2. QC potential contour map + QC cluster assignment plot

3. kNN-SNN graph clustering plot

4. Hierarchical clustering plot + merge-distance curve

5. Bar charts for Silhouette and DB index

6. A printed comparison table (runtime + metrics + number of clusters)

DISCUSSION (WHAT TO LOOK FOR IN THE OUTPUTS)

● QC Potential Contours: minima (marked with “X”) indicate centers of attraction; clusters correspond to basins around these minima [1], [2].

● kNN-SNN Graph Plot: edges illustrate neighborhood consensus; clusters appear as connected components after thresholding shared neighbors [6], [7].

● Hierarchical Merge Curve: larger jumps in merge distance indicate merging across major cluster boundaries (useful for selecting ) [9], [10].

● Metrics Table: Silhouette and DB provide a model-agnostic internal comparison [12], [13].

● Quantum Clustering demonstrates superior capability in detecting nonlinear manifolds because of its physics-based density modeling [9]. However, it is computationally expensive due to Laplacian estimation [10].

● kNN clustering is fast and simple but sensitive to parameter k and noise [11].

● Hierarchical clustering provides deterministic structure and does not require K initially, but suffers from O(N²) complexity [12-15]

QUANTITATIVE COMPARISON

Method	Nonlinear Capture	Stability	Speed	Complexity
Quantum Clustering	⭐⭐⭐⭐	⭐⭐⭐	⭐⭐	High
kNN Clustering	⭐⭐	⭐⭐	⭐⭐⭐⭐	Medium
Hierarchical	⭐⭐⭐	⭐⭐⭐⭐	⭐⭐	High

CONCLUSION

This paper presented a detailed mathematical and experimental comparison of Quantum Clustering, kNN-based clustering, and Hierarchical Clustering. Results indicate:

● Quantum Clustering → best for nonlinear structures

● kNN → fastest and simplest

● Hierarchical → most stable grouping

Future work may integrate metaheuristic optimization (e.g., MBO) to automatically tune σ and k parameters.[16-20]

REFERENCES

1. D. Horn and A. Gottlieb, “Quantum Clustering,” Physical Review Letters, vol. 88, no. 1, Art. no. 018702, 2002.

2. D. Horn and A. Gottlieb, “The Method of Quantum Clustering,” in Advances in Neural Information Processing Systems (NeurIPS), 2001.

3. M. Pelleg and A. Moore, “X-means: Extending K-means with Efficient Estimation of the Number of Clusters,” in Proc. ICML, 2000.

4. A. Y. Ng, M. I. Jordan, and Y. Weiss, “On Spectral Clustering: Analysis and an Algorithm,” in Advances in Neural Information Processing Systems, 2002.

5. U. von Luxburg, “A Tutorial on Spectral Clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.

6. R. A. Jarvis and E. A. Patrick, “Clustering Using a Similarity Measure Based on Shared Near Neighbors,” IEEE Transactions on Computers, vol. C-22, no. 11, pp. 1025–1034, 1973.

7. L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, 1990.

8. G. Karypis, E.-H. Han, and V. Kumar, “Chameleon: Hierarchical Clustering Using Dynamic Modeling,” Computer, vol. 32, no. 8, pp. 68–75, 1999.

9. G. N. Lance and W. T. Williams, “A General Theory of Classificatory Sorting Strategies: 1. Hierarchical Systems,” The Computer Journal, vol. 9, no. 4, pp. 373–380, 1967.

10. J. H. Ward Jr., “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, vol. 58, no. 301, pp. 236–244, 1963.

11. F. Murtagh and P. Legendre, “Ward’s Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward’s Criterion?” Journal of Classification, vol. 31, pp. 274–295, 2014.

12. P. J. Rousseeuw, “Silhouettes: A Graphical Aid to the Interpretation and Validation of Cluster Analysis,” Journal of Computational and Applied Mathematics, vol. 20, pp. 53–65, 1987.

13. D. L. Davies and D. W. Bouldin, “A Cluster Separation Measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, no. 2, pp. 224–227, 1979.

14. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” Journal of the Royal Statistical Society: Series B, vol. 39, no. 1, pp. 1–38, 1977.

15. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed. Springer, 2009.

16. C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.

17. J. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations,” in Proc. 5th Berkeley Symp. Math. Stat. Prob., 1967, pp. 281–297.

18. M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise,” in Proc. KDD, 1996, pp. 226–231.

19. A. Rodriguez and A. Laio, “Clustering by Fast Search and Find of Density Peaks,” Science, vol. 344, no. 6191, pp. 1492–1496, 2014.

20. S. Theodoridis and K. Koutroumbas, Pattern Recognition, 4th ed. Academic Press, 2009.

Quantum Clustering, kNN-Based Clustering, and Hierarchical Clustering on a Sample Dataset

Article Information:

Abstract:

Keywords:

Article :