Today, we live in a data-driven society where decisions are increasingly guided by insights drawn from data analytics; data is often called the new oil. This transformation has been accelerated in large part by technological breakthroughs in machine learning (ML).
The database community developed the relational database technologies that revolutionized SQL-based data analytics, and it is now playing a central role in democratizing ML-based data analytics by applying its expertise in all aspects of data management, such as data preparation and query optimization. Conversely, state-of-the-art ML technologies can be leveraged to advance solutions to challenging data management problems such as data cleaning and data integration.
My research explores various synergies between the broad fields of databases (DB) and machine learning (ML). Specifically, I focus on (1) “ML for DB”, i.e., how to leverage advanced theoretical and algorithmic ML techniques to solve hard and practical DB problems, such as large-scale data cleaning and integration; and (2) “DB for ML”, i.e., how to build data management systems to tackle the common pain points in practical ML, such as the lack of high-quality labeled data.