ESP Journal of Engineering & Technology Advancements
© 2022 by ESP JETA
Volume 2 Issue 4
Year of Publication : 2022
Authors : Nishanth Reddy Mandala
Nishanth Reddy Mandala, 2022. "Data Integration in Heterogeneous Systems", ESP Journal of Engineering & Technology Advancements, 2(4): 148-155.
Integrating data from heterogeneous systems is a critical challenge in modern data management. The increasing diversity of data sources, such as relational databases, NoSQL databases, cloud storage, and legacy systems, complicates the process of unifying data for analytics, decision-making, and machine learning. This paper reviews key challenges in heterogeneous data integration and explores traditional and modern integration techniques, including ETL, data federation, and data virtualization. We also provide a comparative analysis of these approaches and propose potential solutions to address scalability, real-time access, and schema integration. We present case studies and a performance evaluation that highlight real-world applications in healthcare and finance.
Keywords : Data Integration, Heterogeneous Systems, ETL, Data Virtualization, Cloud Computing, Data Federation, Schema Matching.