LLMs cannot reliably identify and reason about security vulnerabilities (yet?): a comprehensive evaluation, framework, and benchmarks

Citation
S. Ullah, M. Han, S. Pujar, H. Pearce, A. Coskun, G. Stringhini. 2024. "LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks." 2024 IEEE Symposium on Security and Privacy (SP), pp. 862-880. https://doi.org/10.1109/sp54263.2024.00210