Protocol reverse engineering: make old machines talk again
One of the most common problems in manufacturing & automation: the old machine still runs, but the new software can’t talk to it. The controller vendor no longer exists. Protocol documentation: none. Our solution: we read the protocol ourselves.
Why proprietary protocols exist
In the 1990s and 2000s, many vendors built their own communication protocols, often to enforce lock‑in. The transport was usually RS‑232, RS‑485, CAN bus, or a proprietary Ethernet extension, but the format of the bytes it carried was known only to the manufacturer.
Today, 20–30 years later, that vendor has often gone out of business or been acquired, or the product has been discontinued. The machine still runs, but it’s “silent”: it speaks a language nobody understands anymore.
Phase 1: passive traffic analysis
Before we send a single byte, we listen. Passive monitoring means we tap the line without changing the machine’s behavior.
# Typical RS‑232 capture setup: the Y‑cable taps the existing line
Machine ──RS‑232── [Y‑cable] ── [Original Terminal] (normal operation)
                       │
                       └── USB‑Serial Adapter ── Capture PC (listen only)
We typically capture 48–72 hours of real production traffic. Many patterns only appear during specific events: shift changes, tool changes, error states, production starts.
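A long capture is easier to work with once the byte stream is cut into bursts. The sketch below assumes the capture tool stores one record per byte with a millisecond timestamp (`t`) and that frames are separated by short pauses on the line. The record shape, the function names `splitIntoBursts` and `toHex`, and the 5 ms default gap are illustrative assumptions, not values from a real capture.

```javascript
// Group timestamped bytes into bursts separated by at least `gapMs` of silence.
// Assumption: each sample looks like { t: <milliseconds>, byte: <0..255> }.
function splitIntoBursts(samples, gapMs = 5) {
  const bursts = [];
  let current = null;
  for (const { t, byte } of samples) {
    if (current === null || t - current.end >= gapMs) {
      current = { start: t, end: t, bytes: [] };
      bursts.push(current);
    }
    current.bytes.push(byte);
    current.end = t;
  }
  return bursts;
}

// Render bytes as a space-separated hex string for side-by-side comparison.
const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");

const samples = [
  { t: 0, byte: 0x01 }, { t: 1, byte: 0x10 }, { t: 2, byte: 0x04 },
  { t: 40, byte: 0x01 }, { t: 41, byte: 0x04 },
];
for (const burst of splitIntoBursts(samples)) {
  console.log(burst.start, toHex(burst.bytes));
}
```

Printing one burst per line with its start time makes it easier to line up traffic with events on the shop floor, such as a tool change or an error state.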
Phase 2: pattern analysis & protocol reconstruction
After capture, the real work starts. We look for:
- start/end sequences
- length fields
- checksums (CRC8/CRC16/XOR)
- command codes (opcodes)
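The checksum search in particular can be automated. The following sketch tries a few candidate algorithms against a set of captured frames and reports every combination that explains all of them. The function name `findChecksum`, the candidate list, and the search limits (`maxStart`, `maxTrailer`) are assumptions for illustration; a real search would usually cover more algorithms and offsets.

```javascript
// CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001, no final XOR.
function crc16Modbus(bytes) {
  let crc = 0xffff;
  for (const b of bytes) {
    crc ^= b;
    for (let i = 0; i < 8; i++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xa001 : crc >>> 1;
    }
  }
  return crc;
}

const xor8 = (bytes) => bytes.reduce((a, b) => a ^ b, 0);
const sum8 = (bytes) => bytes.reduce((a, b) => (a + b) & 0xff, 0);

// Each candidate returns the checksum bytes in the order they would appear on the wire.
const CANDIDATES = {
  xor8: (d) => [xor8(d)],
  sum8: (d) => [sum8(d)],
  "crc16-modbus-hi-lo": (d) => { const c = crc16Modbus(d); return [c >> 8, c & 0xff]; },
  "crc16-modbus-lo-hi": (d) => { const c = crc16Modbus(d); return [c & 0xff, c >> 8]; },
};

// Return every (algorithm, start offset, trailer length) that matches ALL frames.
// `start` is the first covered byte; `trailer` counts bytes after the checksum.
function findChecksum(frames, maxStart = 2, maxTrailer = 1) {
  const hits = [];
  for (const [name, fn] of Object.entries(CANDIDATES)) {
    const width = fn([]).length; // 1 or 2 checksum bytes
    for (let start = 0; start <= maxStart; start++) {
      for (let trailer = 0; trailer <= maxTrailer; trailer++) {
        const ok = frames.every((f) => {
          const end = f.length - trailer - width; // index of first checksum byte
          if (end <= start) return false;
          const want = fn(f.slice(start, end));
          return want.every((b, i) => b === f[end + i]);
        });
        if (ok) hits.push({ name, start, trailer });
      }
    }
  }
  return hits;
}
```

With only a handful of frames, a simple candidate such as `xor8` can match by coincidence, so the result is a list of hypotheses to confirm against more traffic rather than a final answer.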
A real example: decoding a CNC protocol
In case study MFG‑001 we reconstructed a frame format like this:
# Decoded frame format (22 bytes in this example: 7 framing bytes + 15 payload bytes)
Offset  Field          Description
0x00    SOH (0x01)     start of header, constant
0x01    CMD            command type (0x10=status, 0x20=move, 0x30=stop …)
0x02    SEQ            sequence number (1 byte, wraps)
0x03    LEN            payload length in bytes
0x04    PAYLOAD[LEN]   variable payload
N-2     CRC16_HI       CRC16/MODBUS high byte
N-1     CRC16_LO       CRC16/MODBUS low byte
N       EOT (0x04)     end of transmission (N = offset of the last byte, 6 + LEN)
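As a minimal sketch, the layout above can be encoded and decoded like this. The function names `buildFrame` and `parseFrame` are illustrative, and the code assumes the CRC covers every byte from SOH through the end of the payload; the frame table does not state the covered range, so that has to be confirmed against captured frames.

```javascript
const SOH = 0x01;
const EOT = 0x04;

// CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001, no final XOR.
function crc16Modbus(bytes) {
  let crc = 0xffff;
  for (const b of bytes) {
    crc ^= b;
    for (let i = 0; i < 8; i++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xa001 : crc >>> 1;
    }
  }
  return crc;
}

// Build one frame: SOH, CMD, SEQ, LEN, PAYLOAD, CRC_HI, CRC_LO, EOT.
function buildFrame(cmd, seq, payload) {
  if (payload.length > 0xff) throw new RangeError("payload exceeds 1-byte LEN field");
  const body = [SOH, cmd & 0xff, seq & 0xff, payload.length, ...payload];
  const crc = crc16Modbus(body); // assumption: CRC covers SOH through payload
  return Uint8Array.from([...body, crc >> 8, crc & 0xff, EOT]);
}

// Decode one complete frame; returns null if any structural check fails.
function parseFrame(frame) {
  if (frame.length < 7 || frame[0] !== SOH || frame[frame.length - 1] !== EOT) return null;
  const len = frame[3];
  if (frame.length !== 7 + len) return null;
  const expected = crc16Modbus(frame.slice(0, 4 + len));
  const received = (frame[4 + len] << 8) | frame[5 + len];
  if (expected !== received) return null;
  return { cmd: frame[1], seq: frame[2], payload: frame.slice(4, 4 + len) };
}

// A status frame (CMD 0x10) with a 15-byte payload is 22 bytes long.
const example = buildFrame(0x10, 7, new Array(15).fill(0xaa));
console.log(example.length, parseFrame(example));
```

Returning `null` instead of throwing keeps the decoder usable inside a scanner that has to survive noise on the line.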
Timing was critical: the controller expected an answer within 50 ms. If we exceeded that, the connection timed out and reset. This requirement only shows up in real‑world measurements — not static analysis.
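A timestamped capture makes this kind of deadline visible. The sketch below pairs each controller frame with its answer and flags answers that took longer than 50 ms. It assumes the answer echoes the SEQ byte of the frame it responds to, which is a guess for illustration; the log record shape (`t`, `from`, `seq`) and the name `replyLatencies` are hypothetical as well.

```javascript
const DEADLINE_MS = 50; // the response window measured on the controller

// Pair each controller frame with the next reply carrying the same SEQ
// and report how long that reply took.
// Assumption: log entries look like { t: <ms>, from: "controller" | "bridge", seq: <0..255> }.
function replyLatencies(log) {
  const pending = new Map(); // seq -> timestamp of the controller frame
  const results = [];
  for (const { t, from, seq } of log) {
    if (from === "controller") {
      pending.set(seq, t);
    } else if (pending.has(seq)) {
      const latency = t - pending.get(seq);
      pending.delete(seq);
      results.push({ seq, latency, late: latency > DEADLINE_MS });
    }
  }
  return results;
}

const log = [
  { t: 0, from: "controller", seq: 1 },
  { t: 12, from: "bridge", seq: 1 },
  { t: 100, from: "controller", seq: 2 },
  { t: 161, from: "bridge", seq: 2 },
];
console.log(replyLatencies(log));
```

Software timestamps include driver and USB buffering delays, so values close to the deadline are worth confirming with a hardware measurement.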
Phase 3: the custom protocol bridge
With the protocol fully documented, we implement a parser in Node.js. Key design choices:
- fault tolerance first (discard bad frames, keep scanning; never crash)
- full logging (decoded + raw hex)
- automatic reconnect (reset state correctly)
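These three choices can be sketched in a small streaming scanner for the MFG‑001 frame layout. This is an illustrative sketch rather than the production bridge: the class name `FrameScanner`, its `push` and `reset` methods, and the injected `log` callback are hypothetical, and the CRC is assumed to cover SOH through the end of the payload.

```javascript
const SOH = 0x01, EOT = 0x04, OVERHEAD = 7; // SOH+CMD+SEQ+LEN+CRC(2)+EOT

// CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001, no final XOR.
function crc16Modbus(bytes) {
  let crc = 0xffff;
  for (const b of bytes) {
    crc ^= b;
    for (let i = 0; i < 8; i++) {
      crc = crc & 1 ? (crc >>> 1) ^ 0xa001 : crc >>> 1;
    }
  }
  return crc;
}

const toHex = (bytes) =>
  Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");

class FrameScanner {
  constructor(log = () => {}) {
    this.buf = [];
    this.log = log; // receives one line per accepted or rejected frame
  }

  // Call after a reconnect so stale bytes never merge with new traffic.
  reset() {
    this.buf = [];
  }

  // Feed any chunk of received bytes; returns the frames completed by it.
  push(chunk) {
    this.buf.push(...chunk);
    const frames = [];
    while (this.buf.length >= OVERHEAD) {
      if (this.buf[0] !== SOH) { this.buf.shift(); continue; } // resync on SOH
      const total = OVERHEAD + this.buf[3];
      if (this.buf.length < total) break; // wait for more bytes
      const raw = this.buf.slice(0, total);
      const crc = (raw[total - 3] << 8) | raw[total - 2];
      if (raw[total - 1] === EOT && crc === crc16Modbus(raw.slice(0, total - 3))) {
        frames.push({ cmd: raw[1], seq: raw[2], payload: raw.slice(4, total - 3) });
        this.log(`ok  ${toHex(raw)}`);
        this.buf.splice(0, total);
      } else {
        this.log(`bad ${toHex(raw)}`);
        this.buf.shift(); // drop one byte and keep scanning, never throw
      }
    }
    return frames;
  }
}
```

One known limitation of this sketch: a stray 0x01 followed by a large LEN value makes the scanner wait for bytes that may never arrive, so a real bridge would likely add an idle timeout that calls `reset()`.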
What can go wrong (and how we avoid it)
- protocol variants: capture multiple firmware versions
- hidden edge cases: trigger maintenance/error scenarios intentionally
- timing dependencies: measure with an oscilloscope, not just software capture
- polling vs events: understand the communication model early
Conclusion
In our experience, 90%+ of proprietary industrial protocols can be documented end‑to‑end within 1–2 weeks when you follow a rigorous method. The result: a machine that used to be “silent” suddenly talks again — to your modern software, in real time, at a fraction of the cost of replacing the machine.