Cong picked up his first programming language, Java, in high school, tried to code the games he was playing at the time, and fell a bit short.
Then, in his college years, he started to work on a variety of projects, but without a solid understanding of the machine underneath them, he felt that the essence of computer science was missing.
Luckily, he was admitted to one of the best engineering schools in the States and, luckier still, he survived and walked away with a master's degree.
With an entrepreneurial spirit, he dared himself to venture out into the wild, tumbling through start-up journeys.
Years have passed, and Cong is now Senior Machine Learning Engineer and Tech Lead at Smule, a company specialising in social music-making applications such as "Sing!", "Magic Piano", "Guitar!" and many more. At Smule, Cong leads number crunching with machine learning on top of Hadoop/Spark.
Methodologies on Building Fault-Resilient Machine Learning Pipelines
by Senior Machine Learning Engineer & Tech Lead, Smule
Most machine learning literature today focuses on novel algorithm design; far less of it discusses data quality control in practical environments. In my opinion, a successful machine learning application in industry rests on two major pillars: data extraction/transformation and model fitting. The process flows from left to right, forming a conceptual assembly line, and each stage requires quality controls to guarantee that the next stage's inputs are well-defined. An inadequate data quality control step can lead to no conclusion or, even worse, to a faulty conclusion about the models, which may eventually result in misleading business decisions. This talk introduces some good practices we've adopted that help us be more confident about the final outcomes.
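To make the assembly-line idea concrete, here is a minimal sketch of a quality gate placed between stages. It is an illustration only, not the pipeline used at Smule: the names `check_rows`, `QualityCheckError`, `extract`, `transform` and `run_pipeline`, the field names, and the sample data are all hypothetical, and plain Python lists stand in for whatever Hadoop/Spark datasets a real pipeline would use.

```python
class QualityCheckError(ValueError):
    """Raised when a stage's output fails its quality gate (hypothetical name)."""


def check_rows(rows, required_fields, min_rows=1):
    """Validate a stage's output before handing it to the next stage.

    Fails loudly if there are too few rows or if any required field is
    missing or None, so that bad data never reaches model fitting silently.
    """
    rows = list(rows)
    if len(rows) < min_rows:
        raise QualityCheckError(
            f"expected at least {min_rows} rows, got {len(rows)}"
        )
    for i, row in enumerate(rows):
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            raise QualityCheckError(f"row {i} is missing fields: {missing}")
    return rows


def extract():
    # Stand-in for the extraction stage; the data here is made up.
    return [{"user_id": 1, "plays": 12}, {"user_id": 2, "plays": 7}]


def transform(rows):
    # Stand-in for the transformation stage: derive one feature per row.
    return [{**r, "plays_per_day": r["plays"] / 7} for r in rows]


def run_pipeline():
    # Each stage's output passes a gate before the next stage consumes it.
    raw = check_rows(extract(), ["user_id", "plays"])
    features = check_rows(transform(raw), ["user_id", "plays_per_day"])
    return features
```

The design choice in this sketch is to stop the line at the first failed check rather than let a questionable dataset flow downstream, on the assumption that no conclusion is cheaper than a faulty one.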