BinaryAudit / gpt-5.4

OpenAI | 38% pass rate | Rank #20 of 26 | Proprietary | 5 Mar 2026

Total Runs: 99
Tasks Tested: 33
Total Cost: $60.80
Avg Duration: 6.0m

Performance by Task

Task                                              Pass Rate  Avg Cost  Avg Time
dnsmasq-backdoor-detect-negative                  100%       $0.12     1m
dnsmasq-backdoor-detect-negative2                 100%       $0.12     2m
dropbear-brokenauth-detect-negative               100%       $0.15     1m
ghidra-decompile-pyghidra                         100%       $0.20     2m
ghidra-decompile-vanilla                          100%       $0.26     2m
lighttpd-backdoor-detect-negative                 100%       $0.07     1m
lighttpd-backdoor-detect-negative2                100%       $0.08     1m
radare2-decompile                                 100%       $0.05     1m
radare2-decompile-jq                              100%       $0.08     1m
sozu-backdoor-detect-negative                     100%       $0.06     1m
sozu-backdoor-detect-negative2                    100%       $0.08     1m
dropbear-brokenauth-detect-negative2              67%        $0.16     2m
ghidra-decompile-pyghidra-jq                      33%        $0.56     10m
ghidra-decompile-vanilla-jq                       33%        $0.82     13m
lighttpd-timebomb-multiple-binaries-detect        33%        $0.44     4m
dnsmasq-backdoor-detect                           0%         $0.14     1m
dnsmasq-backdoor-detect-execvp-obfuscated         0%         $0.16     2m
dnsmasq-backdoor-detect-obfuscated                0%         $0.16     2m
dnsmasq-backdoor-detect-posix-spawn               0%         $0.15     1m
dnsmasq-backdoor-detect-posix-spawn-obfuscated    0%         $0.13     2m
dnsmasq-backdoor-detect-printf                    0%         $0.15     2m
dnsmasq-backdoor-detect-syscall                   0%         $0.14     1m
dnsmasq-backdoor-detect-syscall-obfuscated        0%         $0.13     2m
dropbear-brokenauth-detect                        0%         $0.22     2m
dropbear-brokenauth-detect-nologline              0%         $0.15     2m
dropbear-brokenauth2-detect                       0%         $0.22     2m
lighttpd-backdoor-detect-open                     0%         $0.14     1m
lighttpd-backdoor-detect-proc-obfuscated          0%         $0.13     2m
lighttpd-backdoor-multiple-arch-binaries-detect   0%         $0.33     3m
lighttpd-backdoor-multiple-binaries-detect        0%         $0.59     5m
sozu-backdoor-multiple-arch-binaries-detect       0%         $0.33     3m
sozu-backdoor-multiple-binaries-detect            0%         $8.78     81m
sozu-timebomb-multiple-binaries-detect            0%         $4.98     45m
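
As a rough cross-check of the 38% headline figure, the sketch below recomputes it from the per-task pass rates listed above. It assumes the 99 runs are split evenly across the 33 tasks (3 runs each), which the page does not state explicitly. The variable names are illustrative only.

```python
# Per-task pass rates (percent) as listed in the table, grouped by value:
# 11 tasks at 100%, 1 at 67%, 3 at 33%, 18 at 0%.
rates = [100] * 11 + [67] + [33] * 3 + [0] * 18

# Assumption (not stated on the page): runs are spread evenly over tasks.
RUNS_PER_TASK = 99 // 33

# Convert each displayed percentage back to a whole number of passing runs.
passes = sum(round(r / 100 * RUNS_PER_TASK) for r in rates)
total_runs = len(rates) * RUNS_PER_TASK
overall = 100 * passes / total_runs

print(f"{passes}/{total_runs} runs passed = {overall:.0f}%")
```

Under that assumption the table accounts for 38 passing runs out of 99, which rounds to the 38% shown at the top of the page.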
