Previous projects

Named entity recognition for Fintech
This paper is the final project for ShanghaiTech's CS181 Artificial Intelligence course, Spring 2024. We apply the Hidden Markov Model (HMM) covered in the course to train an NER model for Fintech that extracts named entities from financial news. However, an HMM assumes that each observed word depends only on its own hidden tag and is conditionally independent of the other words in the sentence, which is a poor fit for named entity recognition. We therefore also try a Conditional Random Field (CRF), which can exploit the dependencies between words in financial news.
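To make the HMM assumption concrete, here is a minimal sketch of first-order HMM decoding with the Viterbi algorithm. It is not the project's actual code: the tag set, the probability tables, and the `unk` smoothing constant are toy values invented for illustration. Note that each word's emission score looks only at that word's own tag, which is exactly the limitation described above.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely tag sequence for `words` under a first-order HMM.

    `unk` is an assumed fallback emission probability for unseen words.
    """
    # score[i][t]: best log-probability of any tag path ending in tag t at word i
    score = [{t: math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk))
              for t in tags}]
    back = []  # back[i][t]: best previous tag when word i+1 is tagged t
    for w in words[1:]:
        prev, cur, ptr = score[-1], {}, {}
        for t in tags:
            best = max(tags, key=lambda p: prev[p] + math.log(trans_p[p][t]))
            # The emission term depends only on (t, w): the HMM independence assumption.
            cur[t] = (prev[best] + math.log(trans_p[best][t])
                      + math.log(emit_p[t].get(w, unk)))
            ptr[t] = best
        score.append(cur)
        back.append(ptr)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy parameters (hypothetical, not estimated from any corpus).
tags = ["O", "ORG"]
start_p = {"O": 0.8, "ORG": 0.2}
trans_p = {"O": {"O": 0.7, "ORG": 0.3}, "ORG": {"O": 0.6, "ORG": 0.4}}
emit_p = {"O": {"shares": 0.4, "of": 0.4, "rose": 0.2},
          "ORG": {"Alibaba": 0.9, "shares": 0.1}}

result = viterbi(["shares", "of", "Alibaba", "rose"], tags, start_p, trans_p, emit_p)
print(result)  # ['O', 'O', 'ORG', 'O']
```

A linear-chain CRF keeps the same dynamic-programming decoder but replaces the per-word emission probability with feature scores that may inspect neighboring words.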

Predicting the interactions of Weibo
This research presents a comprehensive analysis and predictive modeling framework for social media engagement on Sina Weibo, China’s largest microblogging platform. The study focuses on developing an accurate prediction model for post engagement metrics, including forwards, comments, and likes, within 24 hours of publication. We propose a systematic approach to feature engineering that encompasses three key dimensions: content characteristics, temporal patterns, and user behavior profiles.
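As a rough illustration of the three feature dimensions, the sketch below builds one feature dictionary per post. Every field name (`text`, `published_at`, `followers`, `past_likes`) and every individual feature is a hypothetical stand-in, not the feature set used in the study.

```python
from datetime import datetime

def extract_features(post, user):
    """Build a flat feature dict from a post and its author (hypothetical schema)."""
    text = post["text"]
    published = datetime.fromisoformat(post["published_at"])
    past_likes = user["past_likes"]
    return {
        # Content characteristics
        "text_length": len(text),
        "has_hashtag": "#" in text,
        "num_mentions": text.count("@"),
        # Temporal patterns
        "hour": published.hour,
        "is_weekend": published.weekday() >= 5,  # Saturday=5, Sunday=6
        # User behavior profile
        "followers": user["followers"],
        "avg_past_likes": sum(past_likes) / len(past_likes) if past_likes else 0.0,
    }

features = extract_features(
    {"text": "#AI# hello @bob", "published_at": "2024-05-04T21:30:00"},
    {"followers": 1200, "past_likes": [10, 30, 20]},
)
print(features)
```

Dictionaries of this shape could then be passed to any standard regressor to predict forwards, comments, and likes.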

Fine-grained Classification of A Million Life Trajectories from Wikipedia by Fusing LLM-Refined Syntactic Graphs
Life trajectories of notable people carry essential information for the broad body of work on human dynamics. These trajectories consist of (person, time, location, activity type) tuples, and may record when and where a person was born, passed away, went to school, started a job, got married, won an election, made a scientific discovery, finished a masterpiece, came up with an invention, became a champion, fought in a war, or signed a treaty. Adopting a tool that extracts (person, time, location) triples from Wikipedia, we formulate the problem of classifying these triples into 24 carefully defined types, given each triple's textual context as complementary information.