TARI WORLD Big Data
Last updated
TARI WORLD’s big data platform is built on the Hadoop framework and supports low-cost, fast analysis of large data sets. Hadoop, which serves as the de facto standard framework for big data processing and analysis, is divided into separate components by function. Among these, the Hadoop Distributed File System (HDFS) and the MapReduce framework are the core components. A MapReduce job is divided into a Map phase and a Reduce phase, through which the data to be queried and its results are collected and organized. First, during the Map phase, raw data is grouped into key-value pairs. These key-value pairs are then analyzed to extract the order and values of the desired data.
In the Reduce phase, the data is filtered and sorted by the keys produced in the Map phase. By implementing these two functions, the behavior of a MapReduce job can be controlled and managed. HDFS resembles earlier distributed processing systems in that multiple machines process the data, but it improves on them by tolerating individual machine failures, which in older systems disrupted the interface and made programming difficult.
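The Map and Reduce phases described above can be sketched with the classic word-count example. This is a minimal local simulation of the MapReduce model, not actual Hadoop code; the function names `map_phase` and `reduce_phase` are illustrative, and the input data is invented for the example.

```python
from collections import defaultdict

def map_phase(raw_lines):
    # Map: group raw data into (key, value) pairs.
    for line in raw_lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Collect values by key (the shuffle/sort step), then
    # Reduce: aggregate the values for each key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

data = ["big data big analysis", "data analysis"]
counts = reduce_phase(map_phase(data))
# counts == {"big": 2, "data": 2, "analysis": 2}
```

In a real Hadoop cluster the Map tasks run in parallel on the machines holding the data blocks, and the framework performs the shuffle/sort between the two phases automatically.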
HDFS, one of the components of TARI’s big data platform, distributes identical data across machines and stores three copies of each block. Its structure is similar to load balancing, which is a solution for data bottlenecks. It is the framework that provides the common features and structure needed to link the TARI reference architecture system development guideline with other systems; the logic tier is composed as follows.
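The three-copy behavior mentioned above corresponds to HDFS’s block replication factor, which defaults to 3 and is set via the standard `dfs.replication` property in `hdfs-site.xml`. A minimal configuration sketch:

```xml
<configuration>
  <!-- Number of copies HDFS keeps of each data block; 3 is the default. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

Keeping three replicas lets the cluster survive the loss of any single machine (or even two) holding a given block, which is how HDFS addresses the machine-failure problem of earlier distributed systems.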