Zhang Hao (4Paradigm Technology) – Parallel Computing in Machine Learning Database: A Case Study of OpenMLDB
Abstract
To boost performance, parallel computing is widely used in real systems, e.g., databases, data processing systems, and machine learning systems. In this talk, we will look at how parallel computing is applied in our machine learning database, OpenMLDB (https://openmldb.ai/). OpenMLDB consists mainly of two components, the offline feature engine for training and the online feature engine for inference, which achieve high parallelism at the level of both the distributed cluster and the multi-core architecture. In the first part of the lecture, we will discuss how the offline engine parallelizes computation using a shared-nothing architecture over a cluster of servers. In the second part, the online engine is examined to show how to design a shared architecture that exploits multi-core parallelism. Finally, we summarize the common techniques used to parallelize the system.
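
To give a flavour of the multi-core, shared-memory style of parallelism discussed in the second part, below is a minimal sketch in C++ (the language OpenMLDB itself is written in) of the general pattern: partition the data, let each core process its own chunk, then merge the partial results. This is an illustration of the technique only, not OpenMLDB engine code; the function name ParallelSum and all details are hypothetical.

    // Illustrative only: a generic shared-memory, data-parallel aggregation,
    // not taken from OpenMLDB.
    #include <algorithm>
    #include <cstdint>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    // Sum a large vector by splitting it across hardware threads; each thread
    // accumulates into its own slot, then the partial sums are combined.
    int64_t ParallelSum(const std::vector<int64_t>& data) {
        const unsigned n_threads =
            std::max(1u, std::thread::hardware_concurrency());
        std::vector<int64_t> partial(n_threads, 0);
        std::vector<std::thread> workers;
        const size_t chunk = (data.size() + n_threads - 1) / n_threads;
        for (unsigned t = 0; t < n_threads; ++t) {
            workers.emplace_back([&, t] {
                const size_t begin = t * chunk;
                const size_t end = std::min(data.size(), begin + chunk);
                for (size_t i = begin; i < end; ++i) partial[t] += data[i];
            });
        }
        for (auto& w : workers) w.join();
        return std::accumulate(partial.begin(), partial.end(), int64_t{0});
    }

    int main() {
        std::vector<int64_t> data(1'000'000, 1);
        std::cout << ParallelSum(data) << std::endl;  // prints 1000000
        return 0;
    }

The same partition-and-merge idea carries over to the shared-nothing setting covered in the first part, except that the partitions live on different servers and the merge happens over the network rather than in shared memory.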
Speaker’s Profile
Dr. Zhang Hao
Senior System Architect Scientist
4Paradigm Technology
Zhang Hao is a Senior System Architect Scientist at 4Paradigm Technology. He obtained his Ph.D. in Computer Science from the School of Computing (SoC), NUS, in 2017, and his B.Sc. in Computer Science from the Harbin Institute of Technology in 2012. His research focuses on in-memory database systems, distributed systems, feature stores, emerging hardware, etc. He has published in top conferences and journals such as ICDE, VLDB, and TKDE. He is currently working on the open-source machine learning database OpenMLDB.