Hunting through raw Zeek logs just got a massive upgrade.
If you’ve spent years in the SOC, you’ve likely built up a library of complex awk chains and grep commands to parse Zeek data. It works, but it’s brittle and hard to read. I recently used DuckDB and the zeek-duckdb extension to analyze a malware sample, and the difference is night and day. Instead of wrestling with syntax, I was able to run blazing-fast SQL queries against raw logs directly in my terminal.
The Power of the Join
The real magic happens when you treat your logs like relational data. By joining conn.log and http.log on the shared connection uid, you instantly combine application-layer context with network-layer ground truth.
In this hunt, I was looking for data exfiltration patterns. While a standard web log might show you a successful GET request, joining it with the connection log allows you to see exactly how many bytes left the network during that specific session.
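The join itself can be tried without any Zeek tooling. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for DuckDB and two hand-built tables in place of conn.log and http.log. The table layouts are simplified assumptions, and the row values are copied from the query output shown in this post.

```python
import sqlite3

# sqlite3 stands in for DuckDB here only so the sketch runs with the
# standard library; the tables are simplified stand-ins for the Zeek logs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE conn (uid TEXT, orig_bytes INTEGER, resp_bytes INTEGER)")
db.execute("CREATE TABLE http (uid TEXT, host TEXT, user_agent TEXT)")

# Network-layer facts, keyed by the connection uid.
db.executemany("INSERT INTO conn VALUES (?, ?, ?)", [
    ("CwBv002K9i2nSr9PW1", 216, 1267),
    ("CeRSW815NsWYrVvGod", 63, 584),
])
# Application-layer context for the same connections.
db.executemany("INSERT INTO http VALUES (?, ?, ?)", [
    ("CwBv002K9i2nSr9PW1", "crl.microsoft.com", "Microsoft-CryptoAPI/10.0"),
    ("CeRSW815NsWYrVvGod", "icanhazip.com", None),
])

# Join on the shared uid and put the largest senders first.
joined = db.execute("""
    SELECT h.host, h.user_agent, c.orig_bytes, c.resp_bytes
    FROM conn AS c
    JOIN http AS h USING (uid)
    ORDER BY c.orig_bytes DESC
""").fetchall()

for row in joined:
    print(row)
```

Each output row carries the HTTP host and User-Agent next to the byte counts for the connection that carried the request, which is the pairing the hunt relies on.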
The “Aha!” Moment
I sorted the joined rows by orig_bytes DESC to surface the largest outbound transfers. While this specific sample didn’t show massive exfiltration, stacking the data this way made an anomaly glaringly obvious: a request to icanhazip.com with a NULL User-Agent.
Legitimate clients usually send a User-Agent header. A request without one to an external IP-discovery service is a classic sign of automated malware: the script is most likely looking up the victim's public IP address before beaconing out to C2. I found this in seconds with zero infrastructure setup.
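Once the anomaly is known, it can be hunted for directly rather than spotted by eye. The sketch below again assumes sqlite3 as a stand-in for DuckDB and a simplified http table with values taken from the output in this post. The one detail worth noting is that a missing User-Agent has to be matched with IS NULL, because an equality comparison against NULL never matches.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Simplified stand-in for http.log; only the columns needed for the filter.
db.execute("CREATE TABLE http (uid TEXT, host TEXT, user_agent TEXT)")
db.executemany("INSERT INTO http VALUES (?, ?, ?)", [
    ("Cgbswa2jFzUocQtifg", "ocsp.digicert.com", "Microsoft-CryptoAPI/10.0"),
    ("CJNCiO1k6X4OK5cB6g", "www.microsoft.com", "Microsoft-CryptoAPI/10.0"),
    ("CeRSW815NsWYrVvGod", "icanhazip.com", None),
])

# Correct test for a missing header: IS NULL.
no_agent = db.execute(
    "SELECT uid, host FROM http WHERE user_agent IS NULL"
).fetchall()

# Common mistake: '= NULL' evaluates to unknown, so no row is returned.
equals_null = db.execute(
    "SELECT uid, host FROM http WHERE user_agent = NULL"
).fetchall()

print(no_agent)
print(equals_null)
```

Only the icanhazip.com request survives the IS NULL filter, while the equality version returns nothing.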
keith.jones@Keiths-MacBook-Pro duckdb % duckdb
DuckDB v1.5.1 (Variegata)
Enter ".help" for usage hints.
memory D load zeek;
memory D SELECT
    c.ts,
    c.uid,
    c.id_orig_h AS source_ip,
    c.id_resp_h AS dest_ip,
    h.method,
    h."host", -- Using quotes because LinkedIn will auto hyperlink this
    h.uri,
    h.user_agent, -- Context: What tool is making the request?
    c.orig_bytes, -- Network: How much data did the client send? (Hunting Exfil)
    c.resp_bytes, -- Network: How much data did the server return?
    h.status_code,
    c.conn_state -- Network: Did the connection finish normally or get abruptly reset?
  FROM read_zeek('conn.log') AS c
  JOIN read_zeek('http.log') AS h USING (uid)
  ORDER BY c.orig_bytes DESC; -- Sorting by data sent out to look for exfil
┌───────────────────────────────┬────────────────────┬────────────────┬────────────────┬─────────┬───────────────────────┬───────────────────────────────────┬──────────────────────────┬────────────┬────────────┬─────────────┬────────────┐
│ ts │ uid │ source_ip │ dest_ip │ method │ host │ uri │ user_agent │ orig_bytes │ resp_bytes │ status_code │ conn_state │
│ timestamp with time zone │ varchar │ inet │ inet │ varchar │ varchar │ varchar │ varchar │ uint64 │ uint64 │ uint64 │ varchar │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:46.86766-04 │ C9I1o32GZkDMCaqjU1 │ 192.168.100.15 │ 88.221.169.152 │ GET │ www.microsoft.com │ /pkiops/crl/Microsoft ECC Product │ Microsoft-CryptoAPI/10.0 │ 484 │ 1917 │ 200 │ RSTO │
│ │ │ │ │ │ │ Root Certificate Authority 2018. │ │ │ │ │ │
│ │ │ │ │ │ │ crl │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:46.86766-04 │ C9I1o32GZkDMCaqjU1 │ 192.168.100.15 │ 88.221.169.152 │ GET │ www.microsoft.com │ /pkiops/crl/Microsoft ECC Update │ Microsoft-CryptoAPI/10.0 │ 484 │ 1917 │ 200 │ RSTO │
│ │ │ │ │ │ │ Secure Server CA 2.1.crl │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:17.160462-04 │ CERXXs2SXczhdsjGd2 │ 192.168.100.15 │ 204.79.197.203 │ GET │ oneocsp.microsoft.com │ /ocsp/MFQwUjBQME4wTDAJBgUrDgMCGgU │ Microsoft-CryptoAPI/10.0 │ 253 │ 1444 │ 200 │ S1 │
│ │ │ │ │ │ │ ABBQ3L3//a6ADK8NraY2GXzVaYrHG4AQU │ │ │ │ │ │
│ │ │ │ │ │ │ b6t+2v+XQ3LsO2d33oJhNYhHQoUCEzMAA │ │ │ │ │ │
│ │ │ │ │ │ │ AAGb6JMMcOVb6sAAAAAAAY= │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:38.089325-04 │ Cgbswa2jFzUocQtifg │ 192.168.100.15 │ 23.11.41.157 │ GET │ ocsp.digicert.com │ /MFEwTzBNMEswSTAJBgUrDgMCGgUABBQ5 │ Microsoft-CryptoAPI/10.0 │ 240 │ 641 │ 200 │ S1 │
│ │ │ │ │ │ │ 0otx/h0Ztl+z8SiPI7wEWVxDlQQUTiJUI │ │ │ │ │ │
│ │ │ │ │ │ │ BiV5uNu5g/6+rkS7QYXjzkCEAz1vQYrVg │ │ │ │ │ │
│ │ │ │ │ │ │ L0erhQLCPM8GY= │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:17.095191-04 │ CqWV6W1GPWojiojYde │ 192.168.100.15 │ 23.11.41.157 │ GET │ ocsp.digicert.com │ /MFEwTzBNMEswSTAJBgUrDgMCGgUABBTr │ Microsoft-CryptoAPI/10.0 │ 236 │ 485 │ 200 │ S1 │
│ │ │ │ │ │ │ jrydRyt+ApF3GSPypfHBxR5XtQQUs9tIp │ │ │ │ │ │
│ │ │ │ │ │ │ PmhxdiuNkHMEWNpYim8S8YCEAjTxtAB8m │ │ │ │ │ │
│ │ │ │ │ │ │ y1oj8MfWpz/7Y= │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:26.479552-04 │ CwBv002K9i2nSr9PW1 │ 192.168.100.15 │ 23.216.77.30 │ GET │ crl.microsoft.com │ /pki/crl/products/MicRooCerAut201 │ Microsoft-CryptoAPI/10.0 │ 216 │ 1267 │ 200 │ S1 │
│ │ │ │ │ │ │ 1_2011_03_22.crl │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:23:26.599969-04 │ CJNCiO1k6X4OK5cB6g │ 192.168.100.15 │ 23.59.18.102 │ GET │ www.microsoft.com │ /pkiops/crl/MicSecSerCA2011_2011- │ Microsoft-CryptoAPI/10.0 │ 209 │ 1359 │ 200 │ S1 │
│ │ │ │ │ │ │ 10-18.crl │ │ │ │ │ │
├───────────────────────────────┼────────────────────┼────────────────┼────────────────┼─────────┼───────────────────────┼───────────────────────────────────┼──────────────────────────┼────────────┼────────────┼─────────────┼────────────┤
│ 2026-04-14 09:24:04.219286-04 │ CeRSW815NsWYrVvGod │ 192.168.100.15 │ 104.16.184.241 │ GET │ icanhazip.com │ / │ NULL │ 63 │ 584 │ 200 │ S1 │
└───────────────────────────────┴────────────────────┴────────────────┴────────────────┴─────────┴───────────────────────┴───────────────────────────────────┴──────────────────────────┴────────────┴────────────┴─────────────┴────────────┘
memory D
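One thing to watch in the output above: the first two rows share the uid C9I1o32GZkDMCaqjU1 and the same byte counts, because two HTTP requests rode on one connection and the join repeats the connection-level totals for each request. Summing orig_bytes over the joined rows would therefore double count. A minimal sketch of one way around this, assuming sqlite3 as a stand-in for DuckDB and simplified tables, is to collapse http.log to one row per uid before joining:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE conn (uid TEXT, orig_bytes INTEGER, resp_bytes INTEGER)")
db.execute("CREATE TABLE http (uid TEXT, host TEXT)")

# Byte counts are recorded once per connection.
db.executemany("INSERT INTO conn VALUES (?, ?, ?)", [
    ("C9I1o32GZkDMCaqjU1", 484, 1917),
    ("CERXXs2SXczhdsjGd2", 253, 1444),
])
# Two requests on the first connection, one on the second.
db.executemany("INSERT INTO http VALUES (?, ?)", [
    ("C9I1o32GZkDMCaqjU1", "www.microsoft.com"),
    ("C9I1o32GZkDMCaqjU1", "www.microsoft.com"),
    ("CERXXs2SXczhdsjGd2", "oneocsp.microsoft.com"),
])

# Naive sum over the row-per-request join counts 484 twice.
naive_total = db.execute(
    "SELECT SUM(c.orig_bytes) FROM conn AS c JOIN http AS h USING (uid)"
).fetchone()[0]

# Aggregate requests per uid first, so each connection appears once.
per_conn = db.execute("""
    SELECT c.uid, c.orig_bytes, r.requests
    FROM conn AS c
    JOIN (SELECT uid, COUNT(*) AS requests FROM http GROUP BY uid) AS r
      USING (uid)
    ORDER BY c.orig_bytes DESC
""").fetchall()
dedup_total = sum(row[1] for row in per_conn)

print(naive_total, dedup_total)
print(per_conn)
```

For a sort-and-eyeball hunt like the one in this post the repeated rows are harmless, but any total or average built on top of the join should use the one-row-per-connection form.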
Conclusion
By moving from text-parsing to SQL-querying, we stop fighting the logs and start asking better questions. Whether you are doing local IR or proactive threat hunting, the combination of Zeek’s visibility and DuckDB’s speed is a formidable addition to any toolkit.
Resources
- Malware Sample: ANY.RUN Task
- The Extension: zeek-duckdb on GitHub