Graph Analytics for Community Detection in Social Media Data

Dr SP Singh

Authors

Dr SP Singh, Ex-Dean, Gurukul Kangri Vishwavidyalaya, Haridwar, Uttarakhand 249404, India (spsingh.gkv@gmail.com)

Keywords:

graph analytics; community detection; social media; modularity; Leiden; Louvain; Infomap; spectral clustering; label propagation; node2vec; LFR benchmark; conductance; NMI; dynamic networks; GNNs

Abstract

Social media platforms generate massive, complex networks in which users, posts, and interactions form densely connected substructures commonly called communities. Detecting these communities enables tasks such as rumor tracking, influencer mapping, interest-based recommendation, and coordinated-behavior analysis. This manuscript presents a comprehensive, practical study of graph analytics for community detection in social media data. We synthesize foundations (graph models, modularity, conductance, and information-theoretic criteria), classical algorithms (Louvain, Leiden, Infomap, spectral clustering, label propagation), and modern embedding/GNN-based approaches. To ground the discussion, we design a simulation research protocol using the LFR benchmark to emulate social graphs with power-law degree distributions, variable community sizes, overlapping memberships, noise, and temporal drift. We also outline preprocessing steps for real platforms (retweet/reply/mention graphs; interaction weighting; bot/noise mitigation; attribute integration) that make methods reliable at scale.
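As a rough illustration of the interaction-weighting step mentioned above, the following minimal sketch aggregates raw interaction events into a weighted, undirected user graph. The function name `build_interaction_graph`, the event tuple layout, and the weight values are all assumptions made for illustration; the manuscript does not specify them.

```python
from collections import defaultdict

# Hypothetical per-interaction weights (assumed for illustration only;
# the manuscript does not state concrete values).
INTERACTION_WEIGHTS = {"retweet": 1.0, "reply": 2.0, "mention": 0.5}


def build_interaction_graph(events, weights=INTERACTION_WEIGHTS):
    """Aggregate (source, target, kind) events into undirected edge weights.

    Returns a dict mapping a sorted (user_a, user_b) pair to the summed weight.
    """
    graph = defaultdict(float)
    for source, target, kind in events:
        # Drop self-loops and interaction types without a configured weight.
        if source == target or kind not in weights:
            continue
        edge = tuple(sorted((source, target)))
        graph[edge] += weights[kind]
    return dict(graph)


# Toy event log (made-up users, not real data).
events = [
    ("alice", "bob", "retweet"),
    ("bob", "alice", "reply"),
    ("carol", "alice", "mention"),
    ("carol", "carol", "reply"),  # self-loop, ignored
    ("dave", "bob", "like"),      # unweighted type, ignored
]
graph = build_interaction_graph(events)
```

Collapsing direction and summing weights is only one possible design; a directed graph or a time-decayed weight may suit some platforms better.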

Our methodology compares six approaches (Louvain, Leiden, Infomap, spectral clustering, label propagation, and node2vec+k-means) under controlled scenarios. Evaluation uses modularity (Q), Normalized Mutual Information (NMI) against ground truth (for simulations), and cut quality (conductance), alongside runtime. Statistical analysis over 10 randomized runs shows that Leiden improves modularity by ~5.4% and NMI by ~6.2% over Louvain with a small runtime overhead; Infomap yields the best conductance but is slower; label propagation remains the fastest but is unstable; spectral clustering performs strongly in quality but scales poorly; and embedding-based clustering is competitive and flexible, especially when attributes are informative. We discuss limitations (resolution limits, sensitivity to parameter choices, sampling bias, and temporal dynamics) and offer design guidelines for production pipelines covering graph construction, algorithm selection, quality assurance, and ethical use. The study concludes with actionable recommendations for deploying community detection in real social media analytics.
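The three quality measures named above can be sketched in a few lines of plain Python. This is a minimal sketch for unweighted, undirected graphs, not the evaluation code used in the study: the function names, the toy graph (two triangles joined by one bridge edge), and the choice to normalize NMI by the arithmetic mean of the two entropies are assumptions.

```python
from collections import Counter
from math import log


def modularity(edges, labels):
    """Newman modularity Q = sum_c [L_c / m - (d_c / 2m)^2]."""
    m = len(edges)
    degree = Counter()
    internal = Counter()  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    volume = Counter()  # total degree per community
    for node, d in degree.items():
        volume[labels[node]] += d
    return sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in volume)


def conductance(edges, members):
    """Cut edges divided by the smaller of the two side volumes."""
    members = set(members)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if node in members:
                vol_in += 1
            else:
                vol_out += 1
        if (u in members) != (v in members):
            cut += 1
    return cut / min(vol_in, vol_out)


def nmi(labels_a, labels_b):
    """NMI = 2 * I(A; B) / (H(A) + H(B)) over the nodes of labels_a."""
    nodes = list(labels_a)
    n = len(nodes)
    count_a = Counter(labels_a[x] for x in nodes)
    count_b = Counter(labels_b[x] for x in nodes)
    joint = Counter((labels_a[x], labels_b[x]) for x in nodes)
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0  # both partitions are a single community
    return 2 * mi / (h_a + h_b)


# Toy graph: two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
truth = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "all" for node in truth}  # everything in one community

q_truth = modularity(edges, truth)            # 5/14 for this toy graph
phi_left = conductance(edges, {0, 1, 2})      # 1/7: one cut edge, volume 7
nmi_same = nmi(truth, truth)                  # identical partitions
nmi_merged = nmi(truth, merged)               # uninformative partition
```

On this toy graph the two-triangle partition scores higher modularity than the single merged community, and NMI drops from 1 to 0 when the detected partition carries no information about the ground truth.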
