TCP port monitoring for non-HTTP services
For everything that isn't HTTP — SMTP relays, SSH bastions, Redis, RabbitMQ, your in-house binary protocol — a TCP port check is the cheapest signal you can put on the wire. It's also the easiest to misread. This guide is the practical version: what TCP-up actually tells you, when a banner-grab saves you from a deploy that looked fine, and where you have to reach for a real protocol probe instead.
Why TCP-up doesn't mean app-up
The single most important sentence in this article: a successful TCP handshake says the operating system on the other end completed three packets. Nothing more. The three-way handshake — SYN, SYN-ACK, ACK — happens inside the kernel, well below the layer where your application is (or isn't) doing useful work.
Three real production shapes where the TCP probe will cheerfully say "Up" while the service is broken:
- The accept queue is backing up. The kernel's listen() backlog is sized for the worst day. A worker that's deadlocked on a mutex isn't calling accept() any more, but until that backlog fills, the kernel keeps completing handshakes and enqueuing the sockets. Your probe connects, gets a clean handshake, sends nothing, closes: green. Your real clients connect, wait for the first byte that never comes, and time out.
- Load balancer health-check loopback. Some LBs answer TCP health checks themselves, off the backend pool. The pool can be empty and the LB's port will still accept connections. AWS NLB with TCP target health, certain HAProxy modes, and F5 fast-L4 monitors can all return a "port open" verdict that doesn't depend on a single backend being alive.
- OS accepted, server not reading. The app process is alive, the listening socket is alive, the kernel accepted the connection — but the accept-loop thread is blocked in a 30-second DNS lookup, a stuck SQL query, or a sleep that shouldn't exist. A TCP probe sees the socket. It does not see the deadlock behind it.
That said: TCP-down is still a strong signal in the other direction. A refused connection or a handshake timeout almost always means something real: a firewall change, a crashed daemon, a port re-bound by the wrong process, a kernel out of file descriptors. A false "Up" is common; a false "Down" is rare. Treat "the port closed" as worth waking someone, and "the port answers" as one piece of evidence among several.
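The first failure shape is easy to reproduce on a single machine. The sketch below uses only Python's standard library; handshake_without_accept is a made-up name for illustration. It opens a listening socket, never calls accept(), and checks whether a client can still connect, which it can while the backlog has room, because the kernel completes the handshake on the application's behalf.

```python
import socket


def handshake_without_accept(backlog: int = 5, timeout: float = 2.0) -> bool:
    """Return True if a client can connect to a listener that never accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(backlog)
        host, port = listener.getsockname()
        # No accept() anywhere below: this stands in for a deadlocked worker.
        try:
            client = socket.create_connection((host, port), timeout=timeout)
        except OSError:
            return False
        client.close()
        return True
    finally:
        listener.close()


# Usage: print(handshake_without_accept())
```

A probe that only connects and closes sees exactly what this client sees, so it reports green for a server that will never read a byte.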
What TCP port monitoring actually tells you
Mechanically, every iteration of a TCP probe is:
- Resolve the hostname to an IP.
- Open a TCP socket to host:port, time the handshake.
- Optionally read the first bytes the server sends (the banner).
- Close cleanly.
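The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not StatusPulse's implementation: ProbeResult and tcp_probe are hypothetical names, only the first resolved address is tried, and the banner is whatever a single recv() returns.

```python
import socket
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:  # hypothetical result shape, for illustration only
    up: bool
    connect_ms: Optional[float] = None
    banner: str = ""
    error: str = ""


def tcp_probe(host: str, port: int, timeout: float = 5.0,
              read_banner: bool = False, banner_bytes: int = 256) -> ProbeResult:
    # Step 1: resolve the hostname (first address only, to keep this short).
    try:
        family, socktype, proto, _, addr = socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM)[0]
    except socket.gaierror as exc:
        return ProbeResult(up=False, error=f"dns: {exc}")

    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        # Step 2: open the socket and time the handshake.
        start = time.perf_counter()
        sock.connect(addr)
        connect_ms = (time.perf_counter() - start) * 1000.0

        # Step 3: optionally read whatever the server sends first.
        banner = ""
        if read_banner:
            try:
                banner = sock.recv(banner_bytes).decode("latin-1").strip()
            except socket.timeout:
                pass  # connected, but the server stayed silent
        return ProbeResult(up=True, connect_ms=connect_ms, banner=banner)
    except OSError as exc:
        return ProbeResult(up=False, error=f"socket: {exc}")
    finally:
        # Step 4: close.
        sock.close()


# Usage: tcp_probe("mail.example.com", 25, read_banner=True)
```

Timing only the connect() call keeps the metric a transport measurement; anything the application does after the handshake is deliberately outside the window.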
The handshake time is the one numeric metric — the connect time — and it includes the TLS handshake when implicit TLS is on. It's a pure transport metric: distance to the server, network jitter, NIC queue depth, socket accept-queue latency on the server. It does not include any application-level work. If your app's first real response takes 800 ms to compute, the TCP probe will never see that.
From the wire-level view, you can reproduce a probe's connect attempt with two commands every backend engineer has on hand:
# Plain port check — does it answer?
nc -zv mail.example.com 25
# Banner grab — what does it say first?
nc mail.example.com 25
# >>> 220 mail.example.com ESMTP Postfix
The probe is doing exactly this, on a schedule, from a
known network location, with the latency captured and the
banner compared against a substring you supplied. If
nc -zv works from your laptop and the probe
says Down, the difference is almost always either the
source-IP filter (cloud security group, fail2ban rule) or
DNS — the probe resolves from its own network, not yours.
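When the two vantage points disagree, comparing what the hostname resolves to from each side is a quick first check. A minimal sketch, assuming Python's standard library; resolved_addresses is a hypothetical helper name:

```python
import socket
from typing import List


def resolved_addresses(host: str, port: int) -> List[str]:
    """Return the sorted, de-duplicated IPs this machine resolves host to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Usage: print(resolved_addresses("mail.example.com", 25))
```

Run it on your laptop and on a host in the probe's region; differing lists point at split-horizon DNS or a stale record rather than at the service itself.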
For services that are HTTP, this is the wrong shape of probe. A TCP check against port 443 will go green even when every HTTP request returns 500, or the certificate is three weeks expired. Use the HTTP probe guide for HTTP / HTTPS endpoints — TCP port monitoring is for services where there is no HTTP semantics to assert.
Where it's the right tool
The protocols where TCP port monitoring earns its place are the ones that are either too cheap to write a custom probe for, or too obscure for vendor-supplied ones. A non-exhaustive shortlist that covers almost every real use case:
- SMTP relays (smtp port monitoring). Port 25 for MX, port 465 for SMTPS, port 587 for submission. The TCP probe + a 220 ESMTP banner assertion catches "port up, wrong daemon", which happens more often than you'd think after a misrouted LB rule. For end-to-end "can mail actually be delivered?", the TCP probe stops short; see the email round-trip monitoring guide for the send-from-A-receive-at-B flow.
- SSH bastions (ssh banner monitoring). Port 22. The banner is deterministic: every OpenSSH version starts with SSH-2.0-OpenSSH_ followed by the version. A banner assertion of SSH-2.0 distinguishes a real sshd from a TCP forwarder that just keeps the socket open.
- Redis (redis tcp check). Port 6379 by default. Redis doesn't emit a banner on connect, so a plain TCP probe only proves the port is listening. For a stronger check, layer a Heartbeat from your app that pings Redis with PING, or use the Database probe (Redis is a Tier-1 engine there).
- Message brokers (rabbitmq port monitor). RabbitMQ AMQP on 5672, management UI on 15672, MQTT on 1883, Kafka brokers on 9092. The control ports (5672, 15672) are good TCP-probe targets: they confirm the broker process is up and bound. For Kafka data ports, TCP up is necessary but not sufficient; pair with a producer / consumer heartbeat from a small app.
- Database engines when you can't or won't run a real query: Postgres 5432, MySQL 3306, SQL Server 1433. TCP gives you "the listener is up". For Levels 2 and 3 (real query, version assertion), use the dedicated Database probe, which speaks each engine's protocol natively.
- FTP and other legacy services. Port 21 emits a 220 greeting; perfect for a banner check. LDAP on 389 / LDAPS on 636 — same shape.
- Custom binary protocols. Anything you wrote in-house that listens on a TCP port without emitting a parseable banner: a feature flag server, a shard-routing daemon, a metrics aggregator. TCP-up is often the only external signal that exists.
Connect-time threshold = Degraded
A clean handshake to a same-region host completes in single-digit milliseconds. Cross-region, you're typically in the 50-150 ms band. When connect time starts climbing without the topology changing, the cause is almost always one of:
- Server-side accept queue saturation. The listening socket's backlog is filling up faster than the app calls accept(). SYN-ACKs start being delayed; new connections wait. Visible in ss -lnt on Linux as a non-zero Recv-Q on the listening socket.
- Upstream NAT or conntrack exhaustion. A busy LB or NAT gateway running out of source-port tuples. Common at hyperscalers when a single client IP creates thousands of short-lived connections.
- Network-path congestion. Genuine WAN degradation, peering issues, or a saturated uplink at one end. Rarer than the other two but real.
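For the first cause, the ss -lnt check can be automated on the server side. The sketch below parses ss output and returns the listeners whose Recv-Q is non-zero; on Linux, for listening sockets, Recv-Q is the current accept-queue length and Send-Q the configured backlog. The function name and the SAMPLE text are made up for illustration, not captured from a real host.

```python
from typing import List, Tuple


def saturated_listeners(ss_output: str) -> List[Tuple[str, int, int]]:
    """Return (local_address, recv_q, send_q) for LISTEN rows with Recv-Q > 0."""
    hits = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue  # header row, blank line, or a non-listening socket
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        if recv_q > 0:
            hits.append((fields[3], recv_q, send_q))
    return hits


# Illustrative input only, shaped like `ss -lnt` output.
SAMPLE = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*
LISTEN  129     128     0.0.0.0:8080        0.0.0.0:*
"""

# Usage: saturated_listeners(SAMPLE) flags only the 0.0.0.0:8080 listener.
```

On a real host, the input would come from running ss -lnt (for example via subprocess.run with capture_output=True and text=True) on the server itself, not from the probe.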
Set the Degraded-above threshold a comfortable multiple above your typical connect time. For a same-region plaintext port, 1000 ms (StatusPulse's default) is loose enough not to false-alarm and tight enough to catch a saturated socket queue. For TLS ports, add 200-300 ms of headroom — the TLS handshake adds key exchange and certificate verification to the same window. Degraded isn't an outage signal; it's an early warning that something's getting full upstream of the app.
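The threshold logic amounts to a three-way classification. A minimal sketch, assuming the 1000 ms default quoted above and a 250 ms TLS allowance (an assumed midpoint of the 200-300 ms range); the function and constant names are hypothetical:

```python
from typing import Optional

PLAINTEXT_DEGRADED_MS = 1000.0  # default quoted in the text
TLS_HEADROOM_MS = 250.0         # assumption: midpoint of the suggested 200-300 ms


def degraded_threshold_ms(use_tls: bool) -> float:
    """Degraded-above threshold, with extra headroom for a TLS handshake."""
    return PLAINTEXT_DEGRADED_MS + (TLS_HEADROOM_MS if use_tls else 0.0)


def classify(connect_ms: Optional[float], use_tls: bool = False) -> str:
    """Map a connect time (None means the connect failed) to a status."""
    if connect_ms is None:
        return "Down"
    if connect_ms > degraded_threshold_ms(use_tls):
        return "Degraded"
    return "Up"
```

The same 1100 ms connect is Degraded on a plaintext port and still Up on a TLS port, which is the point of the extra headroom.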
Banner-grab assertions
A bare TCP probe asks "did the kernel accept a connection?". A banner-grab asks the much harder question "is the right protocol listening on this port?". Many text-based protocols emit a deterministic first line at connect:
# SSH — every OpenSSH starts this way
nc bastion.example.com 22
# >>> SSH-2.0-OpenSSH_9.6p1 Ubuntu-3ubuntu13.5
# SMTP — RFC 5321 mandates 220 + greeting
nc mail.example.com 25
# >>> 220 mail.example.com ESMTP Postfix (Debian/GNU)
# FTP — same 220 convention
nc ftp.example.com 21
# >>> 220 (vsFTPd 3.0.5)
# POP3 — +OK on connect
nc pop.example.com 110
# >>> +OK Dovecot ready.
# IMAPS (implicit TLS) — wrap with openssl, then read
openssl s_client -connect imap.example.com:993 -quiet
# >>> * OK [CAPABILITY IMAP4rev1 ...] Dovecot ready.
A StatusPulse TCP probe with Banner contains set to SSH-2.0 or 220 ESMTP or +OK reads the first ~256 bytes after connect and fails the probe if the substring isn't present. That single feature is the difference between "the port answered" and "the right protocol is listening on the port", and it's the cheapest deploy-regression guard you can put on a non-HTTP service. The textbook regression that banner-grab catches: a load balancer config push that accidentally routes port 25 to a backend that's now answering HTTP on the same port. TCP-only check: green. Banner check expecting 220 ESMTP: red within a minute. Because the comparison is a plain substring match, grab the real greeting with nc first and assert on a fragment that appears in it verbatim.
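A banner assertion reduces to "read once, then test for a substring". The sketch below shows that shape in Python; read_banner and banner_ok are illustrative names, and the 256-byte limit mirrors the figure quoted above rather than any confirmed implementation detail.

```python
import socket


def read_banner(sock: socket.socket, max_bytes: int = 256,
                timeout: float = 3.0) -> str:
    """Passively read the server's first bytes; empty string if it stays silent."""
    sock.settimeout(timeout)
    try:
        data = sock.recv(max_bytes)
    except socket.timeout:
        return ""  # e.g. an HTTP backend, which waits for the client to speak
    return data.decode("latin-1").strip()


def banner_ok(banner: str, expected_substring: str) -> bool:
    """Plain substring test; an empty banner fails any non-empty expectation."""
    return expected_substring in banner


# Usage, on an already connected socket:
#   banner_ok(read_banner(sock), "SSH-2.0")
```

The silent-server branch is what turns the misrouted-port regression red: a backend that waits for an HTTP request sends nothing on connect, so the read times out and the assertion fails.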
Redis is the awkward case in the banner list: it doesn't
send anything on connect. To get a banner-shaped signal,
you have to send PING\r\n first and read
+PONG\r\n back — which is a full protocol
exchange, not a passive read. For Redis specifically, the
stronger move is the dedicated Database probe (Tier-1
engine), not a TCP banner check.
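For reference, the active exchange looks like this in Python. It is a sketch over an already connected socket, with redis_ping as a hypothetical helper name; it sends the inline PING command and accepts only a reply that starts with +PONG.

```python
import socket


def redis_ping(sock: socket.socket, timeout: float = 3.0) -> bool:
    """Send PING and report whether the reply starts with +PONG."""
    sock.settimeout(timeout)
    try:
        sock.sendall(b"PING\r\n")
        reply = sock.recv(64)
    except OSError:
        return False  # timeout, reset, or closed connection
    return reply.startswith(b"+PONG")


# Usage against a real server:
#   sock = socket.create_connection(("redis.example.com", 6379), timeout=3)
#   print(redis_ping(sock))
```

A password-protected instance typically answers with an error line beginning with "-" instead of +PONG; this sketch counts that as a failure, which may or may not be what you want from a liveness check.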
TLS-wrap pitfalls (implicit TLS vs STARTTLS)
This is the section to read carefully if you've ever wondered why a TCP-with-TLS probe fails against a port you can connect to with your mail client. There are two different TLS-on-TCP shapes in the wild, and a TCP probe can only handle one of them.
Implicit TLS (also called "TLS on connect"
or "wrapped TLS") starts the TLS handshake at byte zero.
The first packet from the client is a TLS
ClientHello; the server replies with a
ServerHello; the application protocol
(SMTP, IMAP, LDAP) begins inside the encrypted stream.
This is what SMTPS 465, IMAPS 993, and LDAPS 636 do. A
TCP probe with Use TLS on works against these,
because it calls SslStream.AuthenticateAsClient
immediately after the TCP connect, which is exactly what
the server expects. You can replicate it from the
terminal:
openssl s_client -connect imap.example.com:993 -servername imap.example.com
# >>> CONNECTED, full handshake, then:
# >>> * OK [CAPABILITY ...] Dovecot ready.
STARTTLS is the opposite shape and it is
not what the TCP probe does. STARTTLS
ports start in plaintext. The client and server exchange a
few protocol messages in cleartext, the client sends
STARTTLS (SMTP), STLS (POP3),
STARTTLS (IMAP), or the LDAP extended op, and
then the TLS handshake begins on the same socket.
SMTP submission on 587, IMAP on 143, POP3 on 110, LDAP on
389 are all STARTTLS by default. From the terminal:
openssl s_client -connect mail.example.com:587 -starttls smtp
# >>> 220 mail.example.com ESMTP (plaintext)
# >>> EHLO openssl.client (sent by openssl)
# >>> 250-STARTTLS (server advertises)
# >>> STARTTLS (sent by openssl)
# >>> 220 2.0.0 Ready to start TLS
# >>> [TLS handshake begins]
Notice the -starttls smtp flag — openssl
speaks the protocol's STARTTLS dance on your behalf. A
generic TCP probe cannot do that, because the negotiation
is protocol-specific: SMTP's STARTTLS is different from
IMAP's, which is different from LDAP's, which is different
from XMPP's. The TCP probe's Use TLS toggle does
TLS-from-byte-zero only.
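The mismatch is easy to see in code. The sketch below (Python standard library; tls_from_byte_zero is an illustrative name, not StatusPulse's implementation) does what a TLS-on-connect client does: it starts the handshake straight after the TCP connect. Against an implicit-TLS port that should succeed; against a server that opens with a plaintext greeting it fails, because the first bytes that come back are not a TLS record.

```python
import socket
import ssl
from typing import Tuple


def tls_from_byte_zero(host: str, port: int,
                       timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a TLS handshake immediately after connect. Return (ok, detail)."""
    context = ssl.create_default_context()  # verifies certificate and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket sends the ClientHello right away: no plaintext phase.
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version() or "TLS"
    except ssl.SSLError as exc:  # must precede OSError, its parent class
        return False, f"tls: {exc.__class__.__name__}"
    except OSError as exc:
        return False, f"socket: {exc.__class__.__name__}"


# Usage:
#   tls_from_byte_zero("imap.example.com", 993)  # implicit TLS port
#   tls_from_byte_zero("mail.example.com", 587)  # STARTTLS port: expect failure
```

Note that this sketch also fails on an untrusted or expired certificate, since the default context verifies the chain; whether a given probe does the same is a product detail this example does not assume.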
Practical rules:
- Port 465 / 993 / 636 (implicit TLS): turn Use TLS on, banner-grab works inside the encrypted stream. The TCP probe handles these.
- Port 587 / 143 / 110 / 389 (STARTTLS): leave Use TLS off and assert only on the plaintext banner (220 ESMTP, * OK, +OK). You're checking that the port speaks the right protocol; the TLS negotiation that happens after that needs a real protocol probe. For SMTP submission specifically, the email round-trip probe covers the full STARTTLS dance plus actual mail delivery.
- Don't try to certificate-monitor with TCP+TLS. The TCP probe with Use TLS completes the handshake but doesn't extract days-until-expiry. That's the dedicated SSL probe's job.
Common failure modes
A short field guide to the failures that look like something they aren't:
- Firewall opens the port but rejects with RST. Cloud security groups and host firewalls can be configured to return TCP RST on disallowed source IPs instead of dropping silently. The probe sees "connection refused" — same status as a dead daemon, different cause. Check the security group / iptables rule for the StatusPulse source range before blaming the app.
- LB health check green, real port closed. The LB is answering TCP health on its own, off the target pool. Sanity-check by pointing the probe at a backend's private IP from inside the VPC if you can, or by adding a banner assertion: the LB's bare TCP accept won't satisfy 220 ESMTP even when the LB itself is "healthy".
- Banner changed after server upgrade. Postfix 3.7 → 3.8 doesn't change the banner format, but Exim 4.95 → 4.96 has tweaked greetings before, and custom servers (your in-house daemon) change banners whenever someone refactors. Pin the assertion to the stable prefix (220 ESMTP, not 220 mail.example.com ESMTP Postfix (Debian 3.7.2)), and review banner assertions after major-version upgrades.
- Rate-limiting kicks in at 1-minute interval. Some servers (notably hardened SMTP relays and SSH bastions with fail2ban) count repeated TCP connects from the same source as a brute-force pattern. At a 1-minute probe interval, that's 1440 connects/day from a single IP, well within most thresholds, but not all. If your probe goes Down intermittently after running fine for a week, check the server's auth log for rate-limit refusals from the StatusPulse IPs and widen the allowlist.
- DNS resolves to a stale IP. The hostname resolves at probe time. If you've recently cut over to a new IP, TTL stragglers can keep the probe targeted at the old box for up to the previous record's TTL. Drop TTLs to 60 seconds before any cutover.
- Use TLS on, handshake fails on a STARTTLS port. Covered above — the most common configuration mistake with this probe. Toggle Use TLS off, the probe gets the plaintext greeting, the assertion succeeds.
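Several of these failures surface as distinct exception types when you reproduce the connect by hand, which gives a useful first split. A rough sketch in Python; diagnose is a hypothetical helper and the hint strings are editorial shorthand for the cases above, not StatusPulse status codes:

```python
import socket
import ssl


def diagnose(exc: BaseException) -> str:
    """Map an exception from a manual connect attempt to a first-guess cause."""
    # Order matters: gaierror, SSLError and the others all derive from OSError.
    if isinstance(exc, socket.gaierror):
        return "dns: hostname did not resolve from this network"
    if isinstance(exc, ssl.SSLError):
        return "tls: handshake failed; check whether this is a STARTTLS port"
    if isinstance(exc, ConnectionRefusedError):
        return "refused: RST received; dead daemon or a rejecting firewall rule"
    if isinstance(exc, socket.timeout):
        return "timeout: no answer; packets dropped, host down, or stale IP"
    if isinstance(exc, OSError):
        return f"network: {exc}"
    return f"unexpected: {exc!r}"


# Usage:
#   try:
#       socket.create_connection(("mail.example.com", 25), timeout=5).close()
#   except OSError as exc:
#       print(diagnose(exc))
```

A refused connection cannot distinguish a dead daemon from a firewall that answers with RST, so the hint names both; the split only narrows where to look next.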
For every field, default, and edge case of the TCP probe settings, see the TCP probe docs.
Wrap-up
TCP port monitoring is the right shape for everything below the HTTP layer: SMTP relays, SSH bastions, message brokers, database engines you can't or don't want to query, and every custom binary protocol you've ever written. It's cheap, it's universal, it's accurate about "the port is down" almost all the time. It is not accurate about "the app behind the port is working" — and the gap between those two statements is where banner-grab assertions, connect-time degraded thresholds, and complementary probes earn their place. Pair the TCP probe with a banner assertion whenever the protocol has a deterministic greeting; leave Use TLS off unless the port is implicit-TLS (465 / 993 / 636); and reach for a real protocol probe — email round-trip, HTTP, Database — when "the port answered" stops being enough.