Linyi Li (李林翼)

Assistant Professor, School of Computing Science, Simon Fraser University
Director, SFU TAI Lab
[firstnamelowercase]_[lastnamelowercase]@sfu.ca
Burnaby, British Columbia, Canada (Metro Vancouver)
Curriculum Vitae (English)

I am Linyi Li (李林翼), an Assistant Professor in the School of Computing Science at Simon Fraser University, where I direct the TAI Lab.

My research focuses on trustworthy deep learning, particularly certifiably trustworthy deep learning and trustworthy foundation models. My work spans machine learning and computer security. Specifically, I am interested in:

  • Achieving provable and verifiable trustworthiness guarantees, such as robustness, fairness, and numerical reliability, for large-scale deep learning systems;
  • Understanding and analyzing the mechanisms of deep learning and foundation models, especially identifying the root causes of trustworthiness issues;
  • Evaluating foundation models scientifically and comprehensively.

I have published over 30 papers at top machine learning and computer security venues such as ICML, NeurIPS, ICLR, IEEE S&P, and ACM CCS. I have received the Rising Star in Data Science award, the AdvML Rising Star Award, and the Wing Kai Cheng Fellowship. I co-led Team \(\alpha,\beta\)-CROWN, the winner of the 4th International Verification of Neural Networks Competition (VNN-COMP’23) in 2023. I was also a finalist for the 2022 Qualcomm Innovation Fellowship and the Two Sigma PhD Fellowship.

I received my Ph.D. in Computer Science from the University of Illinois Urbana-Champaign in 2023, advised by the wonderful Prof. Bo Li (李博) and Prof. Tao Xie (謝濤). In 2018, I received my bachelor's degree from the Department of Computer Science and Technology at Tsinghua University, where I worked on automated testing of web APIs under Prof. Xiaoying Bai (白曉穎). From 2023 to 2024, I worked at ByteDance as a principal research scientist. I have also interned at Microsoft in 2022 and 2019 (with Adam Kalai and Neel Sundaresan), at Fujitsu Research of America in 2021 (with Mukul Prasad), and at Carnegie Mellon University in 2017 (with Matt Fredrikson).

More about my research and teaching. I am looking for students to join us!

Selected Publications

The full publication list is available at TAI Lab - Publication and Google Scholar.

(* denotes equal contribution)

  1. Linyi Li, Shijie Geng, Zhenwen Li, Yibo He, Hao Yu, Ziyue Hua, Guanghan Ning, Siwei Wang, Tao Xie, Hongxia Yang
    InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
    38th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (NeurIPS 2024 D&B)
    [Full Version]   [Conference Version]   [Code]   [Project Website]   [Slides]  
    @inproceedings{
    li2024infibench,
    title={InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models},
    author={Linyi Li and Shijie Geng and Zhenwen Li and Yibo He and Hao Yu and Ziyue Hua and Guanghan Ning and Siwei Wang and Tao Xie and Hongxia Yang},
    booktitle={The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
    year={2024},
    }

    Keywords: LLM, benchmark, code

    Summary A comprehensive benchmark for code large language models (LLMs), evaluating models' ability to answer free-form, real-world questions in the code domain. From the evaluation of over 100 models, we summarize empirical trends and scaling laws for existing open-source code LLMs.

  2. Linyi Li
    Certifiably Trustworthy Deep Learning Systems at Scale
    Doctoral Thesis
    [Full Version]   [Official Version]  
    @phdthesis{li2023thesis,
    title = {Certifiably Trustworthy Deep Learning Systems at Scale},
    author = {Linyi Li},
    year = 2023,
    month = {Oct},
    school = {University of Illinois Urbana-Champaign},
    type = {PhD thesis}
    }

    Keywords: certified ML

    Summary My PhD thesis, which systematically surveys the current research landscape of certified trustworthiness for deep learning. Compared to the SoK paper, the thesis extends beyond just robustness and covers the technical details of representative methods.

  3. Linyi Li, Tao Xie, Bo Li
    SoK: Certified Robustness for Deep Neural Networks
    44th IEEE Symposium on Security and Privacy (SP 2023)
    [Full Version]   [Conference Version]   [Slides]   [Code]   [Leaderboard]  
    @inproceedings{li2023sok,
    author={Linyi Li and Tao Xie and Bo Li},
    title = {SoK: Certified Robustness for Deep Neural Networks},
    booktitle = {44th {IEEE} Symposium on Security and Privacy, {SP} 2023, San Francisco, CA, USA, 22-26 May 2023},
    publisher = {{IEEE}},
    year = {2023},
    }

    Keywords: certified ML

    Summary A comprehensive systematization of knowledge on DNN certified robustness, including discussion of practical and theoretical implications, findings, main challenges, and future directions, accompanied by an open-source unified platform evaluating 20+ representative approaches.

  4. Linyi Li, Yuhao Zhang, Luyao Ren, Yingfei Xiong, Tao Xie
    Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects
    45th IEEE/ACM International Conference on Software Engineering (ICSE 2023)
    [Full Version]   [Conference Version]   [Slides]   [Code]  
    @inproceedings{li2023reliability,
    author={Linyi Li and Yuhao Zhang and Luyao Ren and Yingfei Xiong and Tao Xie},
    title = {Reliability Assurance for Deep Neural Network Architectures Against Numerical Defects},
    booktitle = {45th International Conference on Software Engineering, {ICSE} 2023, Melbourne, Australia, 14-20 May 2023},
    publisher = {{IEEE/ACM}},
    year = {2023},
    }

    Keywords: certified ML, numerical reliability

    Summary An effective and efficient white-box framework for generic DNN architectures, named RANUM, for certifying numerical reliability (e.g., never outputting NaN or INF), generating failure-exhibiting system tests, and suggesting fixes; RANUM is the first automated framework for the last two tasks.
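    To illustrate the class of numerical defects involved here (this is a generic, hypothetical example, not code from the RANUM paper), consider a softmax operator: naive exponentiation overflows for large logits, while the standard max-subtraction rewrite stays finite.

```python
import math

def softmax_naive(logits):
    # Direct exponentiation: math.exp overflows for inputs above ~709,
    # the same kind of defect that surfaces as INF/NaN in DNN graphs.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_stable(logits):
    # Subtracting the max keeps every exponent <= 0, so no overflow occurs
    # and the result is mathematically identical.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

    For example, `softmax_naive([1000.0, 1000.0])` overflows, while `softmax_stable([1000.0, 1000.0])` returns `[0.5, 0.5]`.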

  5. Mintong Kang*, Linyi Li*, Maurice Weber, Yang Liu, Ce Zhang, Bo Li
    Certifying Some Distributional Fairness with Subpopulation Decomposition
    Advances in Neural Information Processing Systems (NeurIPS) 2022
    [Full Version]   [Conference Version]   [Code]   [Poster]  
    @inproceedings{kang2022certifying,
    title = {Certifying Some Distributional Fairness with Subpopulation Decomposition},
    author = {Mintong Kang and Linyi Li and Maurice Weber and Yang Liu and Ce Zhang and Bo Li},
    booktitle = {Advances in Neural Information Processing Systems 35 (NeurIPS 2022)},
    year = {2022}
    }

    Keywords: certified ML, fairness

    Summary A practical and scalable certification approach that provides a fairness bound for a given model when the distribution shifts from the training distribution, based on subpopulation decomposition.

  6. Linyi Li, Jiawei Zhang, Tao Xie, Bo Li
    Double Sampling Randomized Smoothing
    39th International Conference on Machine Learning (ICML 2022)
    [Conference Version]   [Full Version]   [Code]  
    @inproceedings{
    li2022double,
    title={Double Sampling Randomized Smoothing},
    author={Linyi Li and Jiawei Zhang and Tao Xie and Bo Li},
    booktitle={39th International Conference on Machine Learning (ICML 2022)},
    year={2022},
    }

    Keywords: certified ML

    Summary A tighter certification approach for randomized smoothing that, for the first time, circumvents the well-known curse of dimensionality under mild conditions by leveraging statistics from two strategically chosen distributions.
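    For context, the single-distribution baseline that this work tightens is the standard Gaussian randomized smoothing certificate (Cohen et al., 2019), whose L2 radius is R = σ · Φ⁻¹(p̲), given a lower confidence bound p̲ on the top-class probability under Gaussian noise. A minimal illustrative sketch (not the double-sampling method itself):

```python
import statistics

def certified_radius(p_lower, sigma):
    """Standard randomized smoothing L2 radius: sigma * Phi^{-1}(p_lower),
    where p_lower is a lower confidence bound on the probability that the
    base classifier predicts the top class under N(0, sigma^2 I) noise."""
    if p_lower <= 0.5:
        return 0.0  # majority not established; nothing can be certified
    return sigma * statistics.NormalDist().inv_cdf(p_lower)
```

    For instance, with p_lower = 0.9 and σ = 0.5 the certified radius is about 0.64; the curse of dimensionality arises because, for high-dimensional inputs, p_lower must approach 1 to certify large radii.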

  7. Fan Wu*, Linyi Li*, Chejian Xu, Huan Zhang, Bhavya Kailkhura, Krishnaram Kenthapadi, Ding Zhao, Bo Li
    COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks
    10th International Conference on Learning Representations (ICLR 2022)
    [Conference Version]   [Full Version]   [Leaderboard]   [Code]  
    @inproceedings{
    wu2022copa,
    title={{COPA}: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks},
    author={Fan Wu and Linyi Li and Chejian Xu and Huan Zhang and Bhavya Kailkhura and Krishnaram Kenthapadi and Ding Zhao and Bo Li},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=psh0oeMSBiF}
    }

    Keywords: certified ML, deep reinforcement learning

    Summary The first approach for certifying deep RL robustness against offline training dataset perturbations (i.e., poisoning attacks), by aggregating over policies trained on partitioned datasets and over policies across multiple time steps.

  8. Zhuolin Yang*, Linyi Li*, Xiaojun Xu, Bhavya Kailkhura, Tao Xie, Bo Li
    On the Certified Robustness for Ensemble Models and Beyond
    10th International Conference on Learning Representations (ICLR 2022)
    [Conference Version]   [Full Version]   [Code]  
    @inproceedings{
    yang2022on,
    title={On the Certified Robustness for Ensemble Models and Beyond},
    author={Zhuolin Yang and Linyi Li and Xiaojun Xu and Bhavya Kailkhura and Tao Xie and Bo Li},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=tUa4REjGjTf}
    }

    Keywords: certified ML

    Summary Based on a curvature bound for randomized-smoothing-based classifiers, we prove that a large confidence margin and gradient diversity are sufficient and necessary conditions for certifiably robust ensembles. By regularizing these two factors, we achieve SOTA L2 certified robustness.

  9. Zhuolin Yang*, Linyi Li*, Xiaojun Xu*, Shiliang Zuo, Qian Chen, Pan Zhou, Benjamin I. P. Rubinstein, Ce Zhang, Bo Li
    TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness
    Advances in Neural Information Processing Systems (NeurIPS) 2021
    [Conference Version]   [Full Version]   [Code]  
    @inproceedings{yangli2021trs,
    title = {TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness},
    author = {Zhuolin Yang and Linyi Li and Xiaojun Xu and Shiliang Zuo and Qian Chen and Pan Zhou and Benjamin I. P. Rubinstein and Ce Zhang and Bo Li},
    booktitle = {Advances in Neural Information Processing Systems 34 (NeurIPS 2021)},
    year = {2021}
    }

    Keywords: robust ML

    Summary We prove a guaranteed correlation between model diversity and adversarial transferability given bounded model smoothness, which leads to a strong regularizer that achieves SOTA ensemble robustness against existing strong attacks.

  10. Jiawei Zhang*, Linyi Li*, Huichen Li, Xiaolu Zhang, Shuang Yang, Bo Li
    Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation
    International Conference on Machine Learning (ICML) 2021
    [Conference Version]   [Full Version]   [Code]   [Slides]  
    @inproceedings{zhangli2021progressive,
    title = {Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation},
    author = {Zhang, Jiawei and Li, Linyi and Li, Huichen and Zhang, Xiaolu and Yang, Shuang and Li, Bo},
    booktitle = {Proceedings of the 38th International Conference on Machine Learning (ICML 2021)},
    pages = {12479--12490},
    year = {2021},
    editor = {Meila, Marina and Zhang, Tong},
    volume = {139},
    series = {Proceedings of Machine Learning Research},
    month = {18--24 Jul},
    publisher = {PMLR},
    }

    Keywords: attacks for ML

    Summary We systematically analyze the gradient estimator that guides black-box attacks on DNNs, revealing several key factors that lead to more accurate gradient estimation with fewer queries. One way to realize these factors is to conduct the attack with gradient estimation on a particularly scaled version of the image, which yields the PSBA black-box attack with SOTA query efficiency.

  11. Linyi Li*, Maurice Weber*, Xiaojun Xu, Luka Rimanic, Bhavya Kailkhura, Tao Xie, Ce Zhang, Bo Li
    TSS: Transformation-Specific Smoothing for Robustness Certification
    ACM Conference on Computer and Communications Security (CCS) 2021
    [Conference Version]   [Full Version]   [Code]   [Slides]  
    @inproceedings{li2021tss,
    title={TSS: Transformation-Specific Smoothing for Robustness Certification},
    author={Linyi Li and Maurice Weber and Xiaojun Xu and Luka Rimanic and Bhavya Kailkhura and Tao Xie and Ce Zhang and Bo Li},
    year={2021},
    booktitle={ACM Conference on Computer and Communications Security (CCS 2021)}
    }

    Keywords: certified ML

    Summary Natural transformations such as rotation and scaling are common in the physical world. We propose the first scalable certification approach against natural transformations, based on randomized smoothing, rigorous Lipschitz analysis, and stratified sampling. For the first time, we certify non-trivial robustness (>30% certified robust accuracy) on the large-scale ImageNet dataset.

  12. Huichen Li*, Linyi Li*, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li
    Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks
    International Conference on Artificial Intelligence and Statistics (AISTATS) 2021
    [Conference Version]   [Full Version]   [Code]  
    @inproceedings{li2020nolinear,
    title={Nonlinear Gradient Estimation for Query Efficient Blackbox Attack},
    author={Huichen Li and Linyi Li and Xiaojun Xu and Xiaolu Zhang and Shuang Yang and Bo Li},
    year={2021},
    booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS 2021)},
    series = {Proceedings of Machine Learning Research},
    month = {13--15 Apr},
    publisher = {PMLR},
    }

    Keywords: attacks for ML

    Summary We analyze the effect of using nonlinear projections in black-box gradient-estimation-based attacks, showing that proper nonlinear projections can help improve attack efficiency.

  13. Linyi Li, Zhenwen Li, Weijie Zhang, Jun Zhou, Pengcheng Wang, Jing Wu, Guanghua He, Xia Zeng, Yuetang Deng, Tao Xie
    Clustering Test Steps in Natural Language toward Automating Test Automation
    ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) 2020, Industry Track
    [Paper]   [Video]  
    @inproceedings{li2020clustep,
    title = {Clustering Test Steps in Natural Language toward Automating Test Automation},
    author = {Li, Linyi and Li, Zhenwen and Zhang, Weijie and Zhou, Jun and Wang, Pengcheng and Wu, Jing and He, Guanghua and Zeng, Xia and Deng, Yuetang and Xie, Tao},
    booktitle = {Proceedings of the 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering {(ESEC/FSE 2020)}},
    year = {2020},
    doi = {10.1145/3368089.3417067},
    url = {https://doi.org/10.1145/3368089.3417067}
    }

    Keywords: ML for software testing

    Summary We provide an effective pipeline that clusters test steps written in natural language and then synthesizes executable test cases, deployed for WeChat testing.

  14. Linyi Li*, Zexuan Zhong*, Bo Li, Tao Xie
    Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space
    International Joint Conference on Artificial Intelligence (IJCAI) 2019
    [Paper]   [Code]  
    @inproceedings{li2019robustra,
    title = {Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space},
    author = {Li, Linyi and Zhong, Zexuan and Li, Bo and Xie, Tao},
    booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI 2019)},
    publisher = {International Joint Conferences on Artificial Intelligence Organization},
    pages = {4711--4717},
    year = {2019},
    month = {7},
    doi = {10.24963/ijcai.2019/654},
    url = {https://doi.org/10.24963/ijcai.2019/654}
    }

    Keywords: certified ML

    Summary We propose a training method for achieving certified robustness that regularizes only within the reference adversarial space of a jointly trained model, alleviating the optimization hardness and achieving higher certified robustness.

Miscellaneous

  • I love traveling, geography, and languages, especially Chinese phonology. I admire Yuen Ren Chao (趙元任).

  • I occasionally take part in programming contests for fun.

  • I really love very, very spicy 🌶 food :)

  • I was born and spent my childhood in Zhangjiajie, China. Before college, I lived in Changsha, China.

  • I am of the Northern Tujia ethnicity; in the Tujia language, it is Ngaf Bifzivkar.

  • I was on the faculty job market in the 2022-2023 cycle. You can find my research, teaching, and diversity statements here.

Last updated: March 14, 2025