Real-Time Stock Price Forecasting Using Big Data Pipelines
Keywords:
real-time forecasting, big data pipelines, limit order book, streaming analytics, LSTM, Transformer, concept drift, quantile regression, feature store, market microstructure
Abstract
Real-time stock price forecasting is no longer just a modeling problem—it is a systems problem. Predictive performance depends as much on a low-latency, fault-tolerant data pipeline as on model choice. This manuscript presents an end-to-end approach for forecasting next-interval prices (and uncertainty bands) using a streaming big-data architecture that ingests tick-level market data and exogenous signals, engineers microstructure-aware features on the fly, and serves probabilistic deep learning forecasts with millisecond latency. We unify three strands: (i) robust ingestion/processing with distributed logs and stream processors, (ii) online learning with drift-aware model updates, and (iii) risk-aware evaluation that ties forecast quality to trading utility under realistic constraints. The literature review traces the evolution from ARIMA/GARCH to LSTM/Transformer families and highlights how scalable stream processing (e.g., Kafka-like logs, Spark/Flink operators) made “always-learning” models viable. Our methodology deploys a dual-path feature stack—ultra-low-latency order-flow features and slightly slower enriched signals (options-implied volatility, news/sentiment)—merged by a temporal attention forecaster trained with quantile loss.
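To make the training objective concrete, the following minimal Python sketch illustrates the quantile (pinball) loss mentioned above; it assumes PyTorch tensors, and the model and feature names in the commented usage note are hypothetical placeholders rather than the manuscript's actual interfaces.

    import torch

    def quantile_loss(y_pred, y_true, quantiles):
        # y_pred: (batch, num_quantiles) predicted quantiles; y_true: (batch,) realized targets
        losses = []
        for i, q in enumerate(quantiles):
            err = y_true - y_pred[:, i]                        # positive when the model under-predicts
            losses.append(torch.max(q * err, (q - 1.0) * err))  # asymmetric pinball penalty for quantile q
        return torch.mean(torch.stack(losses))

    # Usage (hypothetical model with three quantile heads for 10%/50%/90% bands):
    #   preds = model(features)                                # shape (batch, 3)
    #   loss = quantile_loss(preds, targets, [0.1, 0.5, 0.9])

Minimizing this loss produces lower, median, and upper forecasts of the kind used for the uncertainty bands described above.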
A walk-forward protocol with rolling re-calibration and change-point monitoring combats concept drift. The simulation study replays historical limit-order-book (LOB) streams in real time, benchmarking classical baselines (ARIMA, GBM), machine learning (XGBoost), and deep learning (LSTM, Transformer with temporal fusion). The statistical analysis shows that the proposed pipeline improves RMSE/MAE by 8–15% over strong baselines while keeping p99 end-to-end latency under 80 ms on commodity cloud instances. Results illustrate that (a) microstructure features dominate at sub-minute horizons, (b) probabilistic forecasts enable superior drawdown control, and (c) lightweight online fine-tuning maintains the forecasting edge across volatility regimes. We conclude with deployment guidance, limitations (microstructure regime shifts, data quality, and tail events), and directions for future research in adaptive uncertainty calibration and multi-asset transfer learning.
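The evaluation protocol can be sketched as a simple walk-forward loop. The Python outline below uses NumPy only; fit_fn, forecast_fn, and drift_fn are hypothetical interfaces standing in for the manuscript's models and change-point monitor, not its actual code.

    import numpy as np

    def walk_forward(series, train_window, horizon, fit_fn, forecast_fn, drift_fn=None):
        # fit_fn(history) -> model; forecast_fn(model, history, horizon) -> next `horizon` forecasts
        # drift_fn(residuals) -> True when a change-point is flagged (re-calibration trigger)
        preds, actuals = [], []
        model = fit_fn(series[:train_window])
        t = train_window
        while t + horizon <= len(series):
            history = series[t - train_window:t]
            f = np.asarray(forecast_fn(model, history, horizon))  # forecast the next block
            a = series[t:t + horizon]
            preds.append(f)
            actuals.append(a)
            if drift_fn is None or drift_fn(a - f):               # re-fit on the latest window when drift is flagged
                model = fit_fn(series[t + horizon - train_window:t + horizon])
            t += horizon
        preds, actuals = np.concatenate(preds), np.concatenate(actuals)
        rmse = float(np.sqrt(np.mean((preds - actuals) ** 2)))    # headline error metrics reported above
        mae = float(np.mean(np.abs(preds - actuals)))
        return preds, actuals, rmse, mae

The same harness applies whether fit_fn wraps an ARIMA baseline, a gradient-boosted model, or a temporal attention forecaster; only the fitting and forecasting callables change.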
License
Copyright (c) 2026. The journal retains copyright of all published articles, ensuring that authors have control over their work while allowing wide dissemination.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Articles are published under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which allows others to distribute, remix, adapt, and build upon the work for non-commercial purposes while crediting the original author.
