Research

My current research aims to improve user experience and protect user privacy. I build systems to detect, analyze, and measure security and privacy issues at scale. I also conduct user studies to understand users' privacy perceptions and preferences. I am fortunate to have the opportunity to work with great collaborators at other institutions, including CMU, Princeton, UC Berkeley, and UCLA.

Selected Publications

A complete list can be found on my Google Scholar profile.


* indicates equal contribution.

arXiv SkillBot: Identifying Risky Content for Children in Alexa Skills
Tu Le, Danny Yuxing Huang, Noah Apthorpe, Yuan Tian
Preprint on arXiv, 2021.

Abstract: Many households include children who use voice personal assistants (VPA) such as Amazon Alexa. Children benefit from the rich functionalities of VPAs and third-party apps but are also exposed to new risks in the VPA ecosystem (e.g., inappropriate content or information collection). To study the risks VPAs pose to children, we build a Natural Language Processing (NLP)-based system to automatically interact with VPA apps and analyze the resulting conversations to identify contents risky to children. We identify 28 child-directed apps with risky contents and maintain a growing dataset of 31,966 non-overlapping app behaviors collected from 3,434 Alexa apps. Our findings suggest that although voice apps designed for children are subject to more policy requirements and intensive vetting, children are still vulnerable to risky content. We then conduct a user study showing that parents are more concerned about VPA apps with inappropriate content than those that ask for personal information, but many parents are not aware that risky apps of either type exist. Finally, we identify a new threat to users of VPA apps: confounding utterances, or voice commands shared by multiple apps that may cause a user to invoke or interact with a different app than intended. We identify 4,487 confounding utterances, including 581 shared by child-directed and non-child-directed apps.


arXiv Intent Classification and Slot Filling for Privacy Policies
Wasi Uddin Ahmad*, Jianfeng Chi*, Tu Le, Thomas Norton, Yuan Tian, Kai-Wei Chang
Preprint on arXiv, 2021.

Abstract: Understanding privacy policies is crucial for users as it empowers them to learn about the information that matters to them. Sentences written in a privacy policy document explain privacy practices, and the constituent text spans convey further specific information about that practice. We refer to predicting the privacy practice explained in a sentence as intent classification and identifying the text spans sharing specific information as slot filling. In this work, we propose PolicyIE, a corpus consisting of 5,250 intent and 11,788 slot annotations spanning 31 privacy policies of websites and mobile applications. The PolicyIE corpus is a challenging benchmark with limited labeled examples reflecting the cost of collecting large-scale annotations. We present two alternative neural approaches as baselines: (1) formulating intent classification and slot filling as a joint sequence tagging task and (2) modeling them as a sequence-to-sequence (Seq2Seq) learning task. Experiment results show that both approaches perform comparably in intent classification, while the Seq2Seq method outperforms the sequence tagging approach in slot filling by a large margin. Error analysis reveals the deficiency of the baseline approaches, suggesting room for improvement in future works. We hope the PolicyIE corpus will stimulate future research in this domain.


VEHITS Evaluating the Dedicated Short-range Communication for Connected Vehicles against Network Security Attacks
Tu Le, Ingy Elsayed-Aly, Weizhao Jin, Seunghan Ryu, Guy Verrier, Tamjid Al Rahat, B Brian Park, Yuan Tian
In Proceedings of the 6th International Conference on Vehicle Technology and Intelligent Transport Systems - Volume 1: VEHITS, 37-44, 2020.

Abstract: According to the National Highway Traffic Safety Administration, there are more than 5 million road crashes every year in the U.S. More than 90 people die in car crashes every day. Even though the number of people surviving crashes has increased significantly thanks to safety features, such as airbags and anti-lock brakes, many people experience permanent injuries. The U.S. Department of Transportation introduced connected vehicle technologies, which enable vehicles to “talk” to each other and exchange important data on the roads, with the goal of preventing crashes from happening in the first place. With the rapid development of autonomous driving technology, vehicles in the near future will be able to operate completely without human drivers, increasing the need for reliable connected vehicle technologies. Due to the safety-critical characteristics of autonomous vehicles, it is important to evaluate the technologies extensively prior to deployment to ensure the safety of drivers, passengers, and pedestrians. In this paper, we evaluate the safety of Dedicated Short-Range Communication (DSRC), which is a popular low-latency wireless communication technology specifically designed for connected vehicles. We present three real-world network security attacks and conduct experiments on real DSRC-supported modules. Our results show that DSRC is vulnerable to these dangerous attacks and such attacks can be easily implemented by adversaries without significant resources. Based on our evaluation, we also discuss potential countermeasures to better improve the security and safety of DSRC and connected vehicles.


arXiv Hardware/Software Security Patches for Internet of Trillions of Things
John A. Stankovic, Tu Le, Abdeltawab Hendawi, Yuan Tian
Preprint on arXiv, 2019.

Abstract: With the rapid development of the Internet of Things, there are many interacting devices and applications. One crucial challenge is how to provide security. Our proposal for a new direction is to create "smart buttons" and collections of them called "smart blankets" as hardware/software security patches rather than software-only patches.