Graph Analytics for Community Detection in Social Media Data

Authors

  • Dr SP Singh, Ex-Dean, Gurukul Kangri Vishwavidyalaya, Haridwar, Uttarakhand 249404, India (spsingh.gkv@gmail.com)

Keywords:

graph analytics; community detection; social media; modularity; Leiden; Louvain; Infomap; spectral clustering; label propagation; node2vec; LFR benchmark; conductance; NMI; dynamic networks; GNNs

Abstract

Social media platforms generate massive, complex networks in which users, posts, and interactions form densely connected substructures commonly called communities. Detecting these communities enables tasks such as rumor tracking, influencer mapping, interest-based recommendation, and coordinated-behavior analysis. This manuscript presents a comprehensive, practical study of graph analytics for community detection in social media data. We synthesize foundations (graph models, modularity, conductance, and information-theoretic criteria), classical algorithms (Louvain, Leiden, Infomap, spectral clustering, label propagation), and modern embedding/GNN-based approaches. To ground the discussion, we design a simulation research protocol using the LFR benchmark to emulate social graphs with power-law degree distributions, variable community sizes, overlapping memberships, noise, and temporal drift. We also outline preprocessing steps for real platforms (retweet/reply/mention graphs; interaction weighting; bot/noise mitigation; attribute integration) that make methods reliable at scale.
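For orientation, two of the quality criteria named above can be written in their commonly used textbook forms. This is a sketch of the standard definitions for an undirected, unweighted graph, not necessarily the exact variants used in the manuscript (which may, for example, include edge weights or a resolution parameter). The symbols are assumptions introduced here: A_ij is the adjacency matrix, k_i the degree of node i, m the number of edges, c_i the community of node i, and S a candidate community within the node set V.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),
\qquad
\phi(S) = \frac{\bigl|\{(i,j) \in E : i \in S,\ j \notin S\}\bigr|}
               {\min\bigl(\operatorname{vol}(S),\ \operatorname{vol}(V \setminus S)\bigr)},
\quad
\operatorname{vol}(S) = \sum_{i \in S} k_i .
```

Higher Q indicates more intra-community edges than expected under a degree-preserving random model, while lower conductance indicates a community that is better separated from the rest of the graph.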

Our methodology compares six approaches (Louvain, Leiden, Infomap, spectral clustering, label propagation, and node2vec+k-means) under controlled scenarios. Evaluation uses modularity (Q), Normalized Mutual Information (NMI) against ground truth (for simulations), and cut quality (conductance), alongside runtime. Statistical analysis over 10 randomized runs shows that Leiden improves modularity by ~5.4% and NMI by ~6.2% over Louvain with a small runtime overhead. Infomap yields the best conductance but is slower; label propagation remains the fastest but is unstable; spectral clustering achieves strong quality but scales poorly; and embedding-based clustering is competitive and flexible, especially when attributes are informative. We discuss limitations (resolution limits, sensitivity to parameter choices, sampling bias, and temporal dynamics) and offer design guidelines for production pipelines, covering graph construction, algorithm selection, quality assurance, and ethical use. The study concludes with actionable recommendations for deploying community detection in real social media analytics.
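To make the three quality measures used in the evaluation concrete, the following is a minimal pure-Python sketch that computes modularity, conductance, and NMI on a toy graph. It is an illustration under stated assumptions, not the authors' evaluation code: the graph is undirected and unweighted, the function names (`modularity`, `conductance`, `nmi`) and the toy data are hypothetical, and NMI is normalized by the arithmetic mean of the two entropies, which is one common convention among several.

```python
import math
from collections import Counter


def modularity(edges, labels):
    """Modularity Q of a partition: sum over communities of
    (internal edges / m) - (community degree sum / 2m)^2."""
    m = len(edges)
    degree = Counter()
    internal = Counter()  # edges whose endpoints share a community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    volume = Counter()  # total degree per community
    for node, d in degree.items():
        volume[labels[node]] += d
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


def conductance(edges, members):
    """Cut edges divided by the smaller of the two side volumes."""
    members = set(members)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in != v_in:
            cut += 1
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


def nmi(labels_a, labels_b):
    """Normalized mutual information, 2*I(A;B) / (H(A) + H(B))."""
    nodes = list(labels_a)
    n = len(nodes)
    count_a = Counter(labels_a[x] for x in nodes)
    count_b = Counter(labels_b[x] for x in nodes)
    joint = Counter((labels_a[x], labels_b[x]) for x in nodes)

    def entropy(counts):
        return -sum(c / n * math.log(c / n) for c in counts.values())

    mutual = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    total = entropy(count_a) + entropy(count_b)
    return 1.0 if total == 0 else 2 * mutual / total


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
truth = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
pred = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}  # node 2 misassigned

q_truth = modularity(edges, truth)
phi_left = conductance(edges, {0, 1, 2})
nmi_pred = nmi(truth, pred)
```

On this toy graph the ground-truth partition has Q = 5/14 and the left triangle has conductance 1/7, while the partition with one misassigned node scores an NMI below 1. In a benchmark setting, the same functions would be applied to each algorithm's output on every randomized run and the results averaged.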

Published

2026-02-03

How to Cite

Singh, Dr SP. “Graph Analytics for Community Detection in Social Media Data”. International Journal of Advanced Research in Computer Science and Engineering (IJARCSE) 2, no. 1 (February 3, 2026): Feb (1–11). Accessed February 5, 2026. https://ijarcse.org/index.php/ijarcse/article/view/106.
