Real-Time Stock Price Forecasting Using Big Data Pipelines

Niharika Singh

Authors

Niharika Singh, ABES Engineering College, Crossings Republik, Ghaziabad, Uttar Pradesh 201009, niharika250104@gmail.com

Keywords:

real-time forecasting, big data pipelines, limit order book, streaming analytics, LSTM, Transformer, concept drift, quantile regression, feature store, market microstructure

Abstract

Real-time stock price forecasting is no longer just a modeling problem; it is a systems problem. Predictive performance depends as much on a low-latency, fault-tolerant data pipeline as on model choice. This manuscript presents an end-to-end approach for forecasting next-interval prices (and uncertainty bands) using a streaming big-data architecture that ingests tick-level market data and exogenous signals, engineers microstructure-aware features on the fly, and serves probabilistic deep learning forecasts with millisecond-scale latency. We unify three strands: (i) robust ingestion and processing with distributed logs and stream processors, (ii) online learning with drift-aware model updates, and (iii) risk-aware evaluation that ties forecast quality to trading utility under realistic constraints. The literature review traces the evolution from ARIMA/GARCH to the LSTM and Transformer families and highlights how scalable stream processing (e.g., Kafka-like logs, Spark/Flink operators) made "always-learning" models viable. Our methodology deploys a dual-path feature stack that combines ultra-low-latency order-flow features with slightly slower enriched signals (options-implied volatility, news/sentiment); the two paths are merged by a temporal attention forecaster trained with quantile loss.
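To make the quantile-loss objective concrete, the following is a minimal sketch of the standard pinball loss, which is one common way to train a forecaster to emit uncertainty bands. The function names `pinball_loss` and `mean_quantile_loss` and the example quantile levels are illustrative assumptions; the abstract does not specify the exact formulation used.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Pinball (quantile) loss for one observation at quantile level q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by q,
    # over-prediction (diff < 0) is weighted by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def mean_quantile_loss(
    y_true: Sequence[float],
    preds: Sequence[Sequence[float]],
    quantiles: Sequence[float],
) -> float:
    """Average pinball loss over all observations and quantile levels.

    preds[i][k] is the forecast for observation i at level quantiles[k].
    """
    if len(y_true) != len(preds):
        raise ValueError("y_true and preds must have the same length")
    total = 0.0
    for y, row in zip(y_true, preds):
        if len(row) != len(quantiles):
            raise ValueError("each forecast row needs one value per quantile")
        for q, p in zip(quantiles, row):
            total += pinball_loss(y, p, q)
    return total / (len(y_true) * len(quantiles))


# Hypothetical example: two price observations, three quantile forecasts each.
example_loss = mean_quantile_loss(
    y_true=[100.0, 101.0],
    preds=[[99.0, 100.0, 101.0], [100.5, 101.5, 102.5]],
    quantiles=[0.1, 0.5, 0.9],
)
```

The asymmetric weighting is what lets a single network output several quantiles at once: a high level such as 0.9 penalizes under-prediction more heavily, which pushes that output toward the upper edge of the band.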

A walk-forward protocol with rolling re-calibration and change-point monitoring combats concept drift. The simulation study replays historical limit-order-book (LOB) streams at real-time speed, benchmarking classical baselines (ARIMA, GBM), machine learning (XGBoost), and deep learning (LSTM, Transformer with temporal fusion). The statistical analysis shows that the proposed pipeline reduces RMSE/MAE by 8–15% relative to strong baselines while keeping p99 end-to-end latency under 80 ms on commodity cloud instances. The results indicate that (a) microstructure features dominate at sub-minute horizons, (b) probabilistic forecasts enable better drawdown control, and (c) lightweight online fine-tuning preserves the model's edge during volatile regimes. We conclude with deployment guidance, limitations (microstructure regime shifts, data quality, and tail events), and directions for future research in adaptive uncertainty calibration and multi-asset transfer learning.
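The walk-forward protocol and change-point monitoring described above can be sketched as follows. This is an illustration under stated assumptions, not the manuscript's implementation: the rolling-window splitter `walk_forward_splits` is a hypothetical helper, and the Page-Hinkley test is swapped in as one well-known change-point detector because the abstract does not name the monitor it uses. The `delta` and `threshold` values are arbitrary.

```python
from typing import Iterator, Optional, Tuple


def walk_forward_splits(
    n_samples: int, train_size: int, test_size: int, step: Optional[int] = None
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for rolling walk-forward evaluation.

    Each test window starts right after its training window, so the model
    never sees observations from the future.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    step = test_size if step is None else step
    if step <= 0:
        raise ValueError("step must be positive")
    start = 0
    while start + train_size + test_size <= n_samples:
        split = start + train_size
        yield range(start, split), range(split, split + test_size)
        start += step


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream.

    Assumed usage: feed it per-step forecast errors and trigger
    re-calibration when update() returns True.
    """

    def __init__(self, delta: float = 0.005, threshold: float = 50.0) -> None:
        self.delta = delta          # tolerated drift magnitude
        self.threshold = threshold  # alarm level for the test statistic
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def update(self, x: float) -> bool:
        self.n += 1
        self.mean += (x - self.mean) / self.n   # running mean
        self.cum += x - self.mean - self.delta  # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


# Hypothetical usage: 10 samples, train on 4, test on the next 2, roll by 2.
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
```

In a streaming setting the two pieces would typically be combined: the splitter defines the scheduled re-calibration points, while the monitor can force an earlier refit when forecast errors shift abruptly.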
