Wiznet makers

viktor

Published June 11, 2026 ©

BetterFrame ioBOX: A PoE-Powered ESP32-S3 + WIZnet W5500 IO Controller for Smart Display Systems

COMPONENTS

Hardware components: WIZnet W5500 × 1

PROJECT DESCRIPTION

Introduction

Most digital signage and camera-wall systems treat physical IO as an afterthought: a webhook, maybe a relay board bolted to the side of a server rack. BetterFrame takes a different approach. Alongside its Raspberry Pi 5 display engine, the project ships a purpose-built IO controller called the ioBOX: an ESP32-S3 board with WIZnet W5500 Ethernet, 802.3af PoE input, four USB-A host ports for HID devices, RS-485, and GPIO expansion, all powered and networked over a single RJ-45 cable.

This article focuses on the ioBOX hardware and firmware design, with a general overview of the BetterFrame system it connects to. The full project is open-source at github.com/BetterCorp/BetterFrame, dual-licensed under AGPL-3.0 or a commercial licence from BetterCorp.


BetterFrame in Brief

BetterFrame is a multi-camera display management system built for Raspberry Pi 5. It manages RTSP and ONVIF camera streams in flexible grid layouts, supports up to 32 cameras per display, and provides a full web-based admin UI. A native Rust kiosk application (GTK4 + GStreamer) runs on the Pi and owns the hardware video decoder pool directly — the reason layout switches have zero perceived latency. The server is TypeScript/Node.js built on the BSB v9 plugin framework, with SQLite for persistence, htmx for the admin UI, and Node-RED for automation rules and dashboard content.

The system separates concerns cleanly: the server coordinates, the kiosk renders, and — the subject of this article — the ioBOX handles physical IO at the edge.


What the ioBOX Does

The ioBOX sits on the same LAN as a BetterFrame kiosk and acts as a physical-world bridge. It detects events — motion from a PIR sensor, a button press, a barcode scan over RS-485, a USB keyboard or HID controller — and translates them into structured events that trigger layout switches, Node-RED flows, or custom automation rules.

A concrete example: a visitor walks into a lobby. The PIR sensor on the ioBOX fires a presence event. That event matches an IO mapping the ioBOX previously pulled from the server as part of its config, so the firmware calls the kiosk's local API directly over the LAN, switching the display from an idle screensaver layout to the active welcome layout. The switch itself takes one network hop and does not route through the server at all.

The ioBOX comes in two firmware builds:

  • iobox_wifi — provisions over a captive AP, then connects as Wi-Fi STA.
  • iobox_eth — uses WIZnet W5500 hardwired Ethernet. If Ethernet succeeds at boot, Wi-Fi is never enabled.

Hardware: ioBOX PoE USB4

The reference hardware design (hardware/iobox-poe-usb4/) is a Rev A board built around six functional blocks: MCU, Ethernet, PoE power, USB hub, IO headers, and per-port USB current switches. It is designed for KiCad/Eagle/EasyEDA capture, with a full BOM, netlist, and pin map committed to the repository.

Main ICs

Ref        Part                              Function
U1         ESP32-S3-WROOM-1-N16(R8)          MCU: dual-core Xtensa, 16 MB flash, native USB OTG
U2         GL850G-HHY22                      4-port USB 2.0 hub downstream
U3         WIZnet W5500                      SPI hardwired TCP/IP Ethernet controller
U4         Silvertel AG9905M / AG9905-MTB    802.3af PoE PD, isolated 5 V / 9 W output
U5         Buck or high-current LDO          5 V → 3.3 V, ≥800 mA
U6–U9      TPS2553 / AP22802 class           Per-port USB VBUS current switches (×4)
U10–U14    USBLC6-2SC6 class                 USB ESD protection: one upstream, one per port

The board is designed for a 4-layer PCB with a solid ground plane and no splits under USB differential pairs.

Power Architecture

Everything runs from a single PoE RJ-45:

RJ-45 PoE input
  → Ethernet magnetics → W5500 SPI Ethernet → ESP32-S3
  → PoE center taps / spare-pair power
      → AG9905M isolated 5 V PD module
          → +5V_SYS
              → U5: 3.3 V regulator → +3V3 (ESP32-S3, W5500)
              → GL850G hub (V5 + V33 per hub reference design)
              → U6–U9: per-port VBUS switches → 4× USB-A ports

The AG9905M is an isolated module, meaning the PoE power domain is electrically separated from the system 5 V rail — important for the Ethernet isolation requirements of 802.3af. The board notes a key power budget constraint: four USB ports at 500 mA each exceed a realistic 802.3af 9 W system budget once the ESP32-S3, W5500, and hub are included. Per-port USB current limits should be set to 150–250 mA for HID use, not full charging levels.
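
The budget constraint can be checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it counts the USB load alone, before the ESP32-S3, W5500, hub, and regulator losses mentioned above.

```cpp
// Figures taken from the text: 9 W PD output, 5 V VBUS, four ports.
constexpr double kPdOutputW = 9.0;  // AG9905M isolated output
constexpr double kVbusV = 5.0;      // USB VBUS voltage
constexpr int kUsbPorts = 4;

// Worst-case USB load if every port draws its full current limit.
constexpr double usbLoadW(double perPortA) {
    return kUsbPorts * kVbusV * perPortA;
}

// Power left over for everything that is not a USB device.
constexpr double headroomW(double perPortA) {
    return kPdOutputW - usbLoadW(perPortA);
}
```

At 500 mA per port the USB load alone is 10 W, already over the 9 W output. At 250 mA it is 5 W, leaving about 4 W for the rest of the board, and at 150 mA it is 3 W, leaving about 6 W.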


The WIZnet W5500: Why It's Here

The W5500 is WIZnet's hardwired TCP/IP Ethernet controller: the TCP/IP stack, MAC, and 10/100 PHY are all implemented in silicon, not firmware. The ESP32-S3 communicates with it purely over SPI and issues socket-level commands; it never touches Ethernet framing, ARP, or TCP state machines directly. DHCP is the exception: the W5500 has no DHCP client in hardware, so the Arduino Ethernet library runs it on the MCU over a W5500 UDP socket.

For the ioBOX specifically, this matters for three reasons.

PoE power budget. The W5500 datasheet lists roughly 130 mA of supply current at 3.3 V with an active 100 Mbps link, which is about 0.43 W and a small fraction of the 802.3af 9 W envelope. Running a full software TCP/IP stack on the MCU would cost additional CPU cycles. The W5500's hardware offload keeps the ESP32-S3 free for USB HID enumeration, RS-485 parsing, JSON serialization, and OTA management without adding power pressure on an already tight budget.

Predictable boot-time Ethernet. The ioBOX's boot networking rule is strict: if the stored mode is ethernet, the firmware attempts Ethernet only. If the link disconnects after boot, it retries Ethernet indefinitely. It never falls back to Wi-Fi mid-session. For a physical IO controller in a commercial display installation, this predictability is the point — you don't want the device silently switching to Wi-Fi and becoming invisible to the LAN-based kiosk discovery mechanism.

TLS tradeoff acknowledged. The standard Arduino W5500 Ethernet library (arduino-libraries/Ethernet) does not provide TLS. The firmware README documents this explicitly: the iobox_eth variant uses plain HTTP against the BetterFrame server, and the deployment recommendation is to either put the ioBOX on a trusted internal network or terminate TLS upstream (Angie proxy is already in the stack). The iobox_wifi variant uses WiFiClientSecure with mbedTLS for HTTPS. This is an honest engineering tradeoff, not an oversight.

W5500 Pin Map

The SPI connection between the ESP32-S3 and the W5500 is fixed by the hardware design and mirrored in the platformio.ini build flags, so the same GPIO numbers appear in both:

ESP32-S3 GPIO    Net         W5500 signal
GPIO10           ETH_CS      SCSn (chip select)
GPIO11           SPI_MOSI    MOSI
GPIO12           SPI_SCK     SCLK
GPIO13           SPI_MISO    MISO
GPIO14           ETH_INT     INTn (interrupt)
GPIO15           ETH_RST     RSTn (reset)

Hardware notes from the design: the W5500 uses a 25 MHz crystal with datasheet load capacitors, a 10 kΩ pull-up on reset, and optional 0Ω series links on SPI lines for bring-up. It should be placed close to the RJ-45 magjack with Ethernet differential pairs (ETH_TXP/TXN, ETH_RXP/RXN) kept short and length-matched. The W5500-to-magjack data routing must stay away from the ESP32-S3 antenna keepout area.

USB Host Path

The ESP32-S3 native USB OTG (GPIO19=D−, GPIO20=D+) connects to the GL850G hub upstream port. Four downstream ports feed individual USB-A connectors through USBLC6-2SC6 ESD devices and TPS2553-class VBUS current switches. GPIO19 and GPIO20 are dedicated to USB host and are explicitly excluded from the GPIO expansion headers. The GL850G is strapped for self-powered operation and uses a 12 MHz crystal.

GPIO and RS-485 Headers

Two 2×5 2.54 mm headers expose the remaining GPIOs:

J_GPIO1 — 3.3 V GPIO: GPIO4/5 (ADC/touch capable), GPIO6/7/8/9/21 (general purpose).

J_GPIO2 — Power/serial: 5 V system rail (fused/limited), GPIO16 (RS-485 DE / spare), GPIO17 (RS-485 TX / UART TX), GPIO18 (RS-485 RX / UART RX), GPIO38, GPIO45/46 (strap-sensitive, use carefully), GPIO47/48 (I2C SDA/SCL with optional pull-ups).

A UART programming header (U0TXD, U0RXD, EN, GPIO0) handles flashing, since GPIO19/20 are consumed by USB host and cannot be used for the native USB programming path on Rev A.


Firmware Deep-Dive

The firmware is a single C++ source file (iobox-firmware/src/main.cpp) built with PlatformIO for the espressif32 platform and arduino framework, targeting esp32-s3-devkitc-1. Dependencies are minimal: bblanchon/ArduinoJson@^7.4.2 and arduino-libraries/Ethernet@^2.0.2.

Boot Networking

chooseNetworkAtBoot() runs once in setup() and determines the network mode for the entire session:

stored mode = "ethernet"  → call beginEthernet(); done
stored mode = "wifi_sta"  → call beginWifiSta(); done
no stored mode, eth build → try beginEthernet()
  success → storeMode("ethernet"); done
  fail    → startProvisioningPortal() for 5 minutes
              success → storeMode("wifi_sta"); restart

beginEthernet() calls SPI.begin() on the four SPI pins, Ethernet.init(ETH_CS_PIN), derives a locally administered MAC from the ESP32 efuse MAC (02:BF:10:XX:XX:XX using the low three bytes of the chip MAC), calls Ethernet.begin(mac), then polls for linkStatus() == LinkON and a non-zero DHCP-assigned IP for up to 12 seconds before returning success or failure.
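
The MAC derivation can be sketched as a pure function. This is not the project's code: the function name is hypothetical, and it assumes "low three bytes" means the three least-significant bytes of the 64-bit value that ESP.getEfuseMac() returns on the device.

```cpp
#include <array>
#include <cstdint>

// Build a MAC of the form 02:BF:10:XX:XX:XX.
// 0x02 in the first octet sets the locally administered bit and keeps the
// address unicast. The last three octets come from the chip MAC (assumed
// to be the three least-significant bytes of the 64-bit efuse value).
std::array<uint8_t, 6> deriveEthMac(uint64_t efuseMac) {
    return {
        0x02, 0xBF, 0x10,
        static_cast<uint8_t>((efuseMac >> 16) & 0xFF),
        static_cast<uint8_t>((efuseMac >> 8) & 0xFF),
        static_cast<uint8_t>(efuseMac & 0xFF),
    };
}
```

On the device, the resulting six bytes would be passed to Ethernet.begin(mac), which starts DHCP in the Arduino Ethernet library.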

Changing network mode requires a factory reset — the mode is stored in NVS (Preferences, namespace bf-iobox) and only written by explicit provisioning or auto-detection.
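
The "stored mode wins" rule can be expressed as a small decision function. The enum and function names here are hypothetical, and the sketch covers only the first decision at boot, not the provisioning timeout or the restart that follows it.

```cpp
#include <string>

enum class BootAction {
    UseEthernet,    // stored mode "ethernet": Ethernet only, retry forever
    UseWifiSta,     // stored mode "wifi_sta": Wi-Fi station only
    ProbeEthernet,  // nothing stored, Ethernet build: try the W5500 first
    StartPortal     // nothing stored, Wi-Fi build: captive AP provisioning
};

// A stored mode always wins; the build variant only matters on first boot.
BootAction firstBootAction(const std::string& storedMode, bool ethBuild) {
    if (storedMode == "ethernet") return BootAction::UseEthernet;
    if (storedMode == "wifi_sta") return BootAction::UseWifiSta;
    return ethBuild ? BootAction::ProbeEthernet : BootAction::StartPortal;
}
```

Keeping the decision in one place makes the no-fallback guarantee easy to audit: there is no branch that leads from a stored "ethernet" mode to Wi-Fi.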

Server Handshake: Announce → Pair → Config

Once networkUp is true, setup() runs a three-step server handshake:

1. Announce — POST /api/iobox/announce (unauthenticated) with:

{
  "serial": "IOB-XXXXXXXX",
  "model_hint": "ioBOX-ETHERNET",
  "firmware_version": "0.1.0",
  "firmware_arch": "esp32s3",
  "network_mode": "ethernet"
}

The serial is derived from the ESP32 efuse MAC as IOB-<4 high hex><8 low hex>. If the server returns "status": "unknown_serial", the device isn't registered yet and the handshake stops.
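
A minimal sketch of the serial format described above, assuming uppercase hex and assuming "high" and "low" refer to bits 47..32 and 31..0 of the 64-bit efuse value. The function name is hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format: "IOB-" + 4 hex digits (high 16 bits) + 8 hex digits (low 32 bits).
std::string deriveSerial(uint64_t efuseMac) {
    char buf[20];  // 4 + 12 characters plus the terminator fits comfortably
    std::snprintf(buf, sizeof(buf), "IOB-%04X%08X",
                  static_cast<unsigned>((efuseMac >> 32) & 0xFFFF),
                  static_cast<unsigned>(efuseMac & 0xFFFFFFFFu));
    return buf;
}
```

Because the serial depends only on the chip's factory MAC, it survives a factory reset and stays stable across firmware updates.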

2. Pair/claim — if no iobox_key is stored in NVS, POST /api/iobox/pair/claim. The server returns an iobox_id and iobox_key, which are persisted to NVS. All subsequent requests carry Authorization: Bearer <iobox_key>.

3. Config pull — GET /api/iobox/config, which returns:

  • assigned_display.id — which display this ioBOX serves
  • local_target.candidates[] — list of {ip, port} for direct kiosk LAN access
  • local_target.local_key — bearer token for the kiosk's local API
  • mappings[] — IO event → action rules (e.g. presence event → layout.switch)

Main Loop

loop() runs four periodic tasks after a successful handshake:

Task              Interval    Function
Heartbeat         30 s        POST /api/iobox/heartbeat with IP and firmware version
Config refresh    60 s        Re-pull config and re-check local kiosk reachability
OTA check         5 min       GET /api/iobox/firmware/check, download + apply if newer
Hardware poll     25 ms       Read PIR, button, RS-485 UART; emit events on changes

The network connection is maintained in maintainSelectedNetwork() every iteration — WiFi.reconnect() for the Wi-Fi path, Ethernet.maintain() for DHCP lease renewal on the W5500 path.
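
One common way to run several intervals from a single loop() is a non-blocking "is it due yet" check against a millisecond counter. The struct below is a hypothetical sketch of that pattern, not the firmware's actual scheduler. The time is passed in so the logic can be exercised off-device; on the ESP32 the argument would typically be millis().

```cpp
#include <cstdint>

struct PeriodicTask {
    uint32_t intervalMs;
    uint32_t lastRunMs;

    explicit PeriodicTask(uint32_t ms) : intervalMs(ms), lastRunMs(0) {}

    // Returns true at most once per interval. Unsigned subtraction keeps
    // the comparison correct when the 32-bit counter wraps around.
    bool due(uint32_t nowMs) {
        if (nowMs - lastRunMs < intervalMs) return false;
        lastRunMs = nowMs;
        return true;
    }
};
```

With one instance per row of the table (30000, 60000, 300000, and 25 ms), loop() only has to ask each task whether it is due and never blocks in a delay.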

Direct Kiosk Path — The Fast Lane

The most important runtime behavior is the direct kiosk local path. After config pull, the firmware probes the first candidate kiosk IP:

GET http://<kiosk_ip>:18090/local/iobox/check?key=<local_key>

If this returns 2xx, localKioskReachable = true. From that point on, every IO event goes directly to the kiosk over LAN first:

POST http://<kiosk_ip>:18090/local/iobox/event?key=<local_key>

For layout.switch mappings specifically, the firmware calls the kiosk layout API directly:

GET http://<kiosk_ip>:18090/local/layout/<layout_id>?key=<local_key>

This is one LAN hop. A PIR sensor triggering a layout switch does not round-trip through the BetterFrame server at all — it goes ioBOX → kiosk at LAN speed. The event is also forwarded to the server with "route": "direct" for audit logging. If the kiosk is unreachable, the event falls back to POST /api/iobox/event on the server with "route": "proxy".
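
The URL shapes and route labels above can be captured in a few helpers. This is a sketch under assumptions: the struct and function names are hypothetical, and the local key is assumed to be URL-safe, so no percent-encoding is applied.

```cpp
#include <string>

// One entry of local_target.candidates[] plus the shared local_key.
struct KioskTarget {
    std::string ip;
    int port;
    std::string localKey;  // assumed URL-safe
};

std::string kioskBase(const KioskTarget& k) {
    return "http://" + k.ip + ":" + std::to_string(k.port);
}

// POST target for a raw IO event on the direct path.
std::string eventUrl(const KioskTarget& k) {
    return kioskBase(k) + "/local/iobox/event?key=" + k.localKey;
}

// GET target for a layout.switch action on the direct path.
std::string layoutUrl(const KioskTarget& k, const std::string& layoutId) {
    return kioskBase(k) + "/local/layout/" + layoutId + "?key=" + k.localKey;
}

// Value of the "route" field on the copy of the event sent to the server.
const char* routeFor(bool localKioskReachable) {
    return localKioskReachable ? "direct" : "proxy";
}
```

Note that the key travels in the query string over plain HTTP, which is another reason the direct path belongs on a trusted LAN.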

IO Event Schema

All hardware inputs funnel through emitIoEvent(kind, payload), which constructs a consistent event document:

{
  "event_id": "IOB-XXXXXXXX-42",
  "kind": "presence",
  "display_id": "...",
  "occurred_at_ms": 183420,
  "payload": { "state": "present" },
  "local_handled": true,
  "route": "direct"
}

The three currently implemented input types:

PIR sensor (BF_PIR_PIN ≥ 0) — polls digitalRead() every 25 ms, emits presence with state: "present" or state: "clear" on transitions. Ignores the first read (state initializes to −1) to avoid spurious boot events.

Button (BF_BUTTON_PIN ≥ 0, INPUT_PULLUP) — emits button with code: "button_1" and state: "down" or state: "up" on transitions.

RS-485 UART (BF_RS485_RX_PIN and BF_RS485_TX_PIN both ≥ 0) — line-buffers Serial1 at 9600 baud, emits rs485 with raw and code set to the trimmed line on each newline. RS-485 DE pin (BF_RS485_DE_PIN) is driven LOW at boot for receive-only default. Buffer caps at 180 characters before reset.
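
Two small pieces of logic sit behind these inputs: transition detection that ignores the first sample, and a bounded line buffer. The sketch below shows one way to write both. The type names are hypothetical, and two behaviors are assumptions rather than facts from the source: empty lines are not emitted, and an overflowing partial line is discarded.

```cpp
#include <cstddef>
#include <string>

// Fires on level changes only; the first sample just seeds the state.
struct EdgeDetector {
    int last = -1;  // -1 means "no sample yet"

    bool changed(int level) {
        const bool fire = (last != -1) && (level != last);
        last = level;
        return fire;
    }
};

// Collects bytes until a newline, capped at 180 characters.
struct LineBuffer {
    static constexpr std::size_t kMaxLen = 180;
    std::string buf;

    // Feed one byte. Returns true and fills `line` when a non-empty,
    // trimmed line has been completed by a newline.
    bool feed(char c, std::string& line) {
        if (c == '\n') {
            line = trim(buf);
            buf.clear();
            return !line.empty();  // assumption: skip empty lines
        }
        if (buf.size() >= kMaxLen) buf.clear();  // assumption: drop on overflow
        buf.push_back(c);
        return false;
    }

    static std::string trim(const std::string& s) {
        const char* ws = " \t\r\n";
        const std::size_t b = s.find_first_not_of(ws);
        if (b == std::string::npos) return "";
        const std::size_t e = s.find_last_not_of(ws);
        return s.substr(b, e - b + 1);
    }
};
```

The same EdgeDetector works for both the PIR pin and the button, since both emit only on transitions; the event kind and payload differ, not the detection.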

The firmware source explicitly marks USB HID host support and binary Pelco PTZ decoding as the next additions. Both are meant to slot into pollHardware() without changing the server API contract.

IO Mappings

The config pull includes a mappings[] array. Each mapping has:

  • source_kind — the event kind to match (e.g. "presence")
  • match_json — key/value pairs to match against the event or its payload
  • action — what to do (currently: "layout.switch")
  • params_json — action parameters (e.g. {"layout_id": "abc123"})
  • enabled — boolean guard

runLocalMappings() iterates all enabled mappings against each event, calls runLocalMapping() for each match, and returns whether any local action was executed. The matching is simple JSON-value equality, serialized to string for comparison — no regex, no scripting, intentionally minimal.
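
The matching rule can be sketched without a JSON library by treating every value as its serialized string. The types and function names below are hypothetical stand-ins for the firmware's ArduinoJson-based code, and the sketch matches against the payload only.

```cpp
#include <map>
#include <string>
#include <vector>

// Values are assumed to be pre-serialized to strings, as the text describes.
using Fields = std::map<std::string, std::string>;

struct Mapping {
    bool enabled;
    std::string sourceKind;  // source_kind
    Fields match;            // match_json
    std::string action;      // e.g. "layout.switch"
    std::string layoutId;    // params_json.layout_id
};

struct Event {
    std::string kind;
    Fields payload;
};

// True when the mapping is enabled, the kind matches, and every key in
// match_json equals the corresponding payload value as a string.
bool matches(const Mapping& m, const Event& e) {
    if (!m.enabled || m.sourceKind != e.kind) return false;
    for (const auto& kv : m.match) {
        const auto it = e.payload.find(kv.first);
        if (it == e.payload.end() || it->second != kv.second) return false;
    }
    return true;
}

// Layout ids selected by all matching "layout.switch" mappings.
std::vector<std::string> layoutsToSwitch(const std::vector<Mapping>& ms,
                                         const Event& e) {
    std::vector<std::string> out;
    for (const auto& m : ms) {
        if (m.action == "layout.switch" && matches(m, e)) out.push_back(m.layoutId);
    }
    return out;
}
```

An empty match_json matches every event of the given kind, which is a natural consequence of "all listed keys must be equal".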

OTA Updates

The OTA flow:

  1. GET /api/iobox/firmware/check?current=0.1.0&arch=esp32s3&model_id=<model_id>
  2. If up_to_date: true, return. Otherwise read download_url, version, sha256.
  3. Download the binary via EthernetClient (W5500 path, plain HTTP) or HTTPClient (Wi-Fi path, HTTP or HTTPS).
  4. Stream into streamUpdateWithSha(): pipes bytes through mbedtls_sha256_context while writing to Update. Verifies SHA-256 digest before calling Update.end().
  5. POST /api/iobox/firmware/applied with version and any error string.
  6. ESP.restart() on success.

The SHA-256 verification before Update.end() means a corrupted download aborts cleanly rather than flashing a bad image. Content-Length is used for known-size downloads; the code falls back to UPDATE_SIZE_UNKNOWN when the header is absent.
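
The final comparison step can be sketched in isolation. The helper names are hypothetical, and the sketch assumes the server sends sha256 as a 64-character hex string of either case. Computing the digest itself (with mbedTLS on the device) is outside this snippet.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Lowercase hex encoding of a 32-byte SHA-256 digest.
std::string toHex(const std::array<uint8_t, 32>& digest) {
    static const char* kHex = "0123456789abcdef";
    std::string out;
    out.reserve(64);
    for (uint8_t b : digest) {
        out.push_back(kHex[b >> 4]);
        out.push_back(kHex[b & 0x0F]);
    }
    return out;
}

// Case-insensitive comparison against the expected hex string.
// Only when this returns true should the update be finalized.
bool digestMatches(const std::array<uint8_t, 32>& digest,
                   const std::string& expectedHex) {
    const std::string actual = toHex(digest);
    if (expectedHex.size() != actual.size()) return false;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        const int c = std::tolower(static_cast<unsigned char>(expectedHex[i]));
        if (c != actual[i]) return false;
    }
    return true;
}
```

Rejecting any length mismatch up front means a truncated or empty sha256 field can never be treated as a match.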


How It All Connects: ioBOX in the BetterFrame Stack

[PIR / button / RS-485 / USB HID]
        ↓ GPIO / UART / USB
  [ioBOX: ESP32-S3 + W5500]
        ↓ LAN (direct, 1 hop)         ↓ WAN/LAN (via server)
  [Kiosk: Pi 5 Rust app]        [BetterFrame server: Node.js/BSB]
        ↓ GStreamer                    ↓ Node-RED
  [Display: HDMI]               [Automation / webhooks / dashboards]

The ioBOX talks to the kiosk directly for latency-sensitive actions (layout switches, presence-triggered content changes) and to the server for everything that needs persistence, audit, or cross-device coordination. The server never touches RTSP. The kiosk never touches the database. The ioBOX never opens an RTSP connection. Each layer does exactly one job.


What Developers Can Take From This

ESP32-S3 + W5500 as a PoE IO node pattern. The combination of ESP32-S3 native USB host, W5500 hardwired Ethernet, and a PoE PD module on a single board is directly reusable for any application that needs wired-network-connected physical IO with USB HID input — industrial panels, kiosks, interactive installations, access control. The firmware architecture (boot mode selection, NVS persistence, server handshake, direct LAN path) is clean enough to extract and adapt.

Single-cable PoE deployment. Mounting an IO box near a display and running a single Ethernet cable back to a PoE switch is significantly simpler than running separate power and data lines, particularly in ceiling or wall-mount scenarios. The AG9905M isolated module handles the PoE PD compliance; the rest of the board just sees 5 V.

Direct device-to-device LAN path. The pattern of pulling a local device IP and key from a central config server, then talking directly to that device for latency-sensitive operations, is broadly useful. It avoids the server becoming a bottleneck for real-time events while still keeping a single source of truth for configuration and audit logs.

IO event schema as a stable API surface. The emitIoEvent() structure — event ID, kind, display ID, timestamp, payload, route — is designed so that new hardware inputs (USB HID, Pelco PTZ, I2C expanders) can be added without changing how the server or kiosk consumes events. The schema is the contract; the hardware polling is the implementation.

Explicit TLS tradeoffs for embedded Ethernet. The firmware README's honest acknowledgment that W5500 plain HTTP requires a trusted network or upstream TLS termination is the kind of documentation that saves developers from deploying insecurely "for now." The dual-path architecture (TLS on Wi-Fi, plain HTTP on W5500 + proxy termination) is a reasonable model for mixed-trust deployments.


Getting Started

Clone the repository and build the ioBOX firmware:

git clone https://github.com/BetterCorp/BetterFrame
cd BetterFrame/iobox-firmware
# Wi-Fi variant
pio run -e iobox_wifi

# Ethernet variant (W5500)
pio run -e iobox_eth

Override deployment values with PlatformIO build flags or a local_overrides.ini:

[env:iobox_eth]
build_flags =
  ${env.build_flags}
  -D BF_ETHERNET_VARIANT=1
  -D BF_DEFAULT_SERVER_URL=\"https://betterframe.yourdomain.com\"
  -D BF_MODEL_HINT=\"ioBOX-ETHERNET\"
  -D BF_ETH_CS_PIN=10
  -D BF_ETH_SPI_SCK_PIN=12
  -D BF_ETH_SPI_MISO_PIN=13
  -D BF_ETH_SPI_MOSI_PIN=11
  -D BF_PIR_PIN=5
  -D BF_RS485_RX_PIN=18
  -D BF_RS485_TX_PIN=17
  -D BF_RS485_DE_PIN=16

For the full BetterFrame server stack:

cd BetterFrame
npm install
# edit sec-config.yaml
npm run dev   # in server/

Or via Docker Compose:

docker compose -f docker-compose.coolify.yml up