AI-Assisted Data Modeling: Intelligent Star, Snowflake, and Hybrid Schema Generation for Large-Scale Warehouses

Authors

  • Pramod Raja Konda

Abstract

Designing efficient data warehouse schemas, such as star, snowflake, and hybrid models, has traditionally been a manual, expertise-driven process that requires deep knowledge of business processes, data dependencies, and query performance optimization. With the exponential growth of data sources and the increasing shift toward real-time analytics, traditional modeling methodologies face limitations in scalability, accuracy, and development speed. This research introduces an AI-Assisted Data Modeling Framework that automates the design of warehouse schemas using clustering algorithms, NLP-assisted semantic understanding, machine-learning-based pattern identification, and rule-based optimizations. The proposed system extracts metadata, identifies facts and dimensions, detects hierarchies, and selects a schema type based on analytical workloads, data cardinality, and normalization requirements. A detailed case study on a retail enterprise demonstrates improvements in schema-design time, structural accuracy, dimensional hierarchy detection, and performance prediction.
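
As an illustration only, the rule-based part of such a pipeline (labelling tables as facts or dimensions from metadata, then choosing a schema type from dimension cardinality) might be sketched as follows. This is not the framework's actual implementation: the `TableMeta` record, the classification rule, the `large_dim_rows` threshold, and the sample retail tables are all hypothetical assumptions chosen for the example, and the clustering, NLP, and machine-learning components named in the abstract are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableMeta:
    """Hypothetical metadata record for one source table (an assumption)."""
    name: str
    row_count: int
    numeric_columns: int                                    # candidate measures
    foreign_keys: List[str] = field(default_factory=list)   # referenced tables


def classify_tables(tables: List[TableMeta]) -> Dict[str, str]:
    """Assumed rule: a fact references at least two other tables and
    carries at least one numeric measure; everything else is a dimension."""
    roles = {}
    for t in tables:
        is_fact = len(t.foreign_keys) >= 2 and t.numeric_columns >= 1
        roles[t.name] = "fact" if is_fact else "dimension"
    return roles


def choose_schema(tables: List[TableMeta], roles: Dict[str, str],
                  large_dim_rows: int = 1_000_000) -> str:
    """Pick 'star', 'snowflake', or 'hybrid' using an illustrative threshold.

    A dimension that references another dimension is a hierarchy candidate.
    Candidates with at least `large_dim_rows` rows are kept normalized;
    smaller ones are assumed to be flattened into their parent dimension.
    """
    hierarchical = [
        t for t in tables
        if roles[t.name] == "dimension"
        and any(roles.get(fk) == "dimension" for fk in t.foreign_keys)
    ]
    normalized = [t for t in hierarchical if t.row_count >= large_dim_rows]
    if not normalized:
        return "star"        # no hierarchy is worth normalizing
    if len(normalized) == len(hierarchical):
        return "snowflake"   # every hierarchy stays normalized
    return "hybrid"          # some normalized, some flattened


# Hypothetical retail metadata, invented for this example.
tables = [
    TableMeta("sales", 50_000_000, 3, ["product", "store", "date"]),
    TableMeta("product", 2_000_000, 1, ["category"]),
    TableMeta("category", 500, 0),
    TableMeta("store", 800, 0, ["region"]),
    TableMeta("region", 20, 0),
    TableMeta("date", 3_650, 0),
]

roles = classify_tables(tables)
schema = choose_schema(tables, roles)
print(roles["sales"], schema)
```

With this sample metadata, `sales` is labelled a fact, the large `product` dimension keeps its `category` hierarchy normalized, and the small `store` dimension is flattened, so the sketch selects a hybrid schema. A real system would presumably replace the fixed threshold with workload-aware scoring.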

Published

2023-03-22

Issue

Section

Articles

How to Cite

Konda, P. R. (2023). AI-Assisted Data Modeling: Intelligent Star, Snowflake, and Hybrid Schema Generation for Large-Scale Warehouses. International Journal of Machine Learning and Artificial Intelligence, 4(4). https://jmlai.in/index.php/ijmlai/article/view/90