Tailscale Funnel Stuck in “Background Configuration Already Exists” State

I’m running into a persistent issue with Tailscale Funnel + Serve where the tailnet believes a “background configuration already exists,” even though the local machine shows no serve or funnel configuration at all.

This happens across clean Ubuntu installs, fresh Tailscale installs, new node IDs, and after fully deleting all local Tailscale state. At this point it seems likely that the stuck config is stored in the tailnet control plane, not on the device.

1. The Core Problem

‘tailscale funnel --https=443 on’

always returns:

‘background configuration already exists’

However, the device reports no active Funnel or Serve configuration:

‘tailscale serve status’

‘tailscale funnel status’

‘tailscale status --json | jq .ServeConfig’

All return:

‘No serve config’

‘null’

Despite this, Tailscale still directs me to a Funnel UI URL for old, deleted node IDs, such as:

https://login.tailscale.com/f/funnel?node=<old_node_id>
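
In case it helps anyone reproducing this, here is the check sequence I keep running, with the daemon logs tailed while the command fails (assuming the standard tailscaled systemd unit):

# Watch the daemon logs in one terminal...
sudo journalctl -u tailscaled -f

# ...and in another, re-run the checks and the failing command
tailscale serve status
tailscale funnel status
tailscale status --json | jq .ServeConfig     # → null
sudo tailscale funnel --https=443 on          # → background configuration already exists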

2. Local State Has Been Fully Deleted

I wiped the machine repeatedly and removed all known Tailscale state:

‘/var/lib/tailscale/tailscaled.state’

‘/root/.config/tailscale’

‘/var/cache/tailscale’

‘/run/tailscale/*’

‘/etc/apt/sources.list.d/tailscale.list’

‘/usr/share/keyrings/tailscale-archive-keyring.gpg’

Then:

• Reinstalled Tailscale

• Re-authenticated using a fresh authkey

• Verified the device receives a new node ID
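
For reference, the whole wipe-and-reinstall cycle condensed into one sketch (the official install script is assumed; <FRESH_AUTHKEY> is a placeholder for the new key):

sudo systemctl stop tailscaled
sudo apt-get purge -y tailscale
sudo rm -rf /var/lib/tailscale/tailscaled.state /root/.config/tailscale \
  /var/cache/tailscale /run/tailscale/* \
  /etc/apt/sources.list.d/tailscale.list \
  /usr/share/keyrings/tailscale-archive-keyring.gpg

# Reinstall and re-authenticate; the device comes back with a new node ID
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up --auth-key=<FRESH_AUTHKEY>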

‘tailscale serve’ works normally:

‘tailscale serve --bg 9071’

‘tailscale funnel’ still claims the phantom background config exists.

3. All Reset Paths Fail

LocalAPI reset attempts:

curl --unix-socket /var/run/tailscale/tailscaled.sock \
  -X POST http://local-tailscale.sock/localapi/v0/serve/reset

Outputs:

‘invalid localapi request’

Public API reset attempts:

All variants return 404:

‘POST /api/v2/tailnet//serve/reset’

‘POST /api/v2/tailnet//serve/config/reset’

‘POST /api/v2/device//serve/reset’

Example:

curl -X POST \
  -H "Authorization: Bearer <API_KEY>" \
  https://api.tailscale.com/api/v2/tailnet//serve/config/reset

‘404 page not found’
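
One thing the public API does let me do is list devices, to see whether the old node IDs are still registered in the tailnet. This uses the documented devices endpoint; "-" stands for the API key's default tailnet:

curl -s -H "Authorization: Bearer <API_KEY>" \
  https://api.tailscale.com/api/v2/tailnet/-/devices | jq '.devices[] | {name, nodeId}'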

4. Evidence That the Stuck Config Is in the Control Plane

These points strongly suggest the stale Funnel background config exists in Tailscale’s backend, not locally:

• Local machine state is empty

• LocalAPI confirms no config

• Control Plane UI uses old node IDs

• Funnel resets don’t work even on a clean device

• The machine receives a new node ID, but the same Funnel error persists

This creates a loop where Funnel can neither be enabled nor fully disabled.

5. What I’m Trying to Understand

• How to force-clear the Funnel/Serve background configuration from the tailnet

• Whether this is a known issue with the new Funnel/Serve system

• Whether stale Funnel entries can remain in the control plane even after device deletion

6. Environment

• OS: Ubuntu 24.04 (Noble)

• Tailscale: 1.90.6 (also reproduced on unstable 1.91.x)

• Fresh auth keys used

• Machine reinstalled multiple times

• Tailnet type: personal (email-based)

Despite repeated reinstalls and device resets, Funnel is stuck in a phantom “background configuration exists” state. The configuration doesn’t appear locally and survives new node IDs, suggesting it may be stuck in the tailnet control plane.

cc @Autotunafish @dismad @emersonian @decentralistdan

Do you already have a service listening on 443? You can run netstat -tuln | grep :443 and see.

I ran:

sudo netstat -tuln | grep :443

and nothing is currently listening on port 443.
So there’s no web server or other service binding that port — Tailscale should be able to use it.

(Separately, I confirmed with ss -tulpn | grep 443 as well, same result.)


Have you attempted to reset, or to disable and re-enable? tailscale funnel command · Tailscale Docs
Funnel is technically beta, and the difference between it and Serve is that Funnel exposes the service to the public internet, so I can only imagine some hackery would be required (presumably you're exposing a lightwallet server).

Yes. I’ve tried fully resetting and re-enabling Funnel and Serve several times, including all documented methods. I ran:
tailscale serve reset
tailscale funnel reset
tailscale funnel --https=443 off

I also wiped the local state file (/var/lib/tailscale/tailscaled.state), restarted tailscaled, reinstalled Tailscale, cleared ACL tags, and re-authenticated the node from scratch. Even with no serve config, Funnel still reports:
background configuration already exists

So yup, I did the full reset cycle, but the issue persists. It looks like Funnel has a stale config stuck in the control plane rather than locally.

Yeah I’m not sure, it’s curious. I found another issue that sounds the same, though their setup may be completely different. https://forums.unraid.net/topic/191745-tailscale-docker-integration-https-serve-not-functioning-as-expected-example-heimdall/

Yeah, it looks similar: that user is also unable to expose services. I've posted this issue on the Unraid forum as well, and I'm trying to track it down.


He hasn't got a solution yet, but it seems to be the same issue you're facing. Probably worth asking there, as he might have sorted it out without updating the thread.


Certainly the same issue, and I can see it's a persistent one with Tailscale.


Tailscale Funnel: Phantom “Background Configuration” Issue Solved

As stated in my first post, when attempting to enable Funnel, the node consistently returns a persistent, phantom error, regardless of local state cleanup.

  • tailscale funnel on always returns:

    background configuration already exists

  • Local reset fails:

    tailscale serve reset
    invalid localapi request

  • The error survives OS reinstalls, new Node IDs, and complete wiping of all local Tailscale state.

  • I attempted to contact Tailscale by submitting a support ticket, which received no reply.

So I rolled up my sleeves and went to work…

Root Cause

The ServeConfig row is stuck in the control-plane database and can only be reliably cleared by:

  1. the same node key calling /localapi/v0/serve/reset, or

  2. Tailscale staff running an internal manual deletion.

If your binary is ≤ 1.78 (most distro packages), the LocalAPI is disabled or broken, so option 1 is impossible.

The Fix that worked!

1. Nuke every old binary (they were ≤ 1.78)

Ensure all old Tailscale versions (especially those ≤ 1.78) are completely removed.

sudo systemctl stop tailscaled
sudo pkill -9 tailscaled
sudo rm -f /usr/sbin/tailscaled /usr/bin/tailscale /usr/local/bin/tailscale
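
A quick sanity check that nothing old is left on the PATH before installing the new build:

which -a tailscale tailscaled
command -v tailscale >/dev/null || echo "no tailscale binary found (expected at this stage)"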

2. Install current unstable (≥ 1.91) so I have working LocalAPI flags

Install a modern static binary (e.g., ≥ 1.91). This one took a while as I had to try different ones.

wget https://pkgs.tailscale.com/unstable/tailscale_1.91.65_amd64.tgz
sudo tar -C /usr/local -xzf tailscale_1.91.65_amd64.tgz
sudo cp /usr/local/tailscale_1.91.65_amd64/tailscale* /usr/bin/

3. Override systemd to use the new binary forever

Ensure the system uses the new binary consistently.

sudo mkdir -p /etc/systemd/system/tailscaled.service.d
sudo tee /etc/systemd/system/tailscaled.service.d/binary.conf <<'EOF'
[Service]
ExecStart=
# point at the path the new binaries were copied to in step 2
ExecStart=/usr/bin/tailscaled --state=/var/lib/tailscale/tailscaled.state --socket=/run/tailscale/tailscaled.sock --port=41641
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now tailscaled
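
To confirm systemd actually picked up the override and is running the new binary:

systemctl cat tailscaled | grep ExecStart
systemctl is-active tailscaled
tailscale version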

4. Create a fresh tailnet and delete the old one

  • Admin console → Settings → Delete tailnet (destroys every stuck row).
    Then, re-create the tailnet with another account for a blank slate.

5. Generate a re-usable auth-key with the tag that bypasses the ghost row

  • Admin console → Keys → Generate auth-key → Advanced
    Add the tag: tag:server (disables key expiry and auto-tags the node).
    Copy the single tskey-auth-… string.
  • ACL requirement (must add before creating the key):

Ensure the tag is owned in the ACL:

"tagOwners": {
  "tag:server": ["ola_olu@example.com"]
}

On the server, authenticate with the new key:

sudo tailscale up --auth-key=<KEY>
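
To double-check the node came up with a fresh node ID and the expected tag, something along these lines works for me (field names are from my build; adjust if yours differ):

# Inspect the local node's ID and tags from the status JSON
tailscale status --json | jq '{ID: .Self.ID, Tags: .Self.Tags}'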

6. Start your gRPC/JSON-RPC service (example: Zebra)

Start your application that will be exposed via Funnel.

cat > zebra.toml <<'EOF'
[network]
listen_addr = "0.0.0.0:8233"
initial_mainnet_peers = []

[rpc]
listen_addr = "0.0.0.0:8232"
debug_force_finished_sync = true
enable_cookie_auth = false
EOF
docker run -d --name zebra -p 8232:8232 -p 8233:8233 \
  -v "$PWD/zebra.toml":/home/zebra/.config/zebrad.toml \
  zfnd/zebra:latest
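
Before exposing anything, it's worth confirming the RPC answers locally, using the same getblockcount call as in step 8 (it may error until the node finishes initializing):

curl -s http://127.0.0.1:8232 \
  -d '{"jsonrpc":"2.0","method":"getblockcount","id":1}' \
  -H 'Content-Type: application/json'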

7. Enable Funnel (Success)

The Funnel command should now execute without error.

sudo tailscale funnel --bg --https=443 127.0.0.1:8232
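
If the phantom row is really gone, the status commands now report the mapping instead of nothing:

tailscale funnel status
tailscale serve status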

8. Test from the internet (not server)

curl -X POST https://<node>.<tailnet>.ts.net:443 \
  -d '{"jsonrpc":"2.0","method":"getblockcount","id":1}' \
  -H 'Content-Type: application/json'
# → {"jsonrpc":"2.0","id":1,"result":258629}

What I learned from this stubborn issue

  1. “background configuration already exists” means a control-plane row, not a local-files issue.

  2. If the LocalAPI is disabled, serve reset will most likely not work, so use an auth key on a fresh tailnet.

  3. Distro packages are as ancient as the Roman Empire; rely on static/unstable installs for the latest features and fixes.

  4. A systemd override keeps the new binary in use across reboots; no manual intervention has been needed so far as I can tell.

  5. Lastly, once the row is purged, Funnel will most likely work.

There's a one-liner health check you could try, though:

curl -k http://127.0.0.1:<local-port> -d '{"jsonrpc":"2.0","method":"getblockcount","id":1}' -H 'Content-Type: application/json'

If that returns JSON, the local backend is healthy, which means Funnel is pointing the world at a working service at https://<node>.<tailnet>.ts.net:443.

Let me know if this works for you if you ever get stuck with a similar issue. And, if you tried something else, please feel free to share that solution as well!


I guess an interesting question is:

When will a fixed binary make its way into mainstream package managers?

Sync drop issue:

Zebra kept dying on mainnet sync. I suspect it wasn't the 4 GB RAM ceiling but some peer-handling issue: once it racks up 200+ “ready” peers, it triggers a deterministic panic with exit code 139. Then the container dies, RPC stops responding, and syncing restarts from the last checkpoint… annoying!

I'm trying to stabilize the node by setting a low peerset_initial_target_size, limiting connections per IP, and increasing the peer-crawl interval in zebra.toml, keeping everything well below the panic line. I've tweaked these settings a few times over the last few days, but the panic keeps recurring and I've had to resync again. I'll keep watching how things go over the coming hours.


Solved: Zebra Mainnet Sync Failure

Over several days, my Zebra node setup repeatedly crashed during initial mainnet synchronization. The failures were not caused by a bug; rather, the node was running on very low RAM (3.8 GB), which led to severe memory pressure, constant journald warnings (“Under memory pressure, flushing caches”), and random crashes inside Tokio runtime threads.

Second, the container was not persisting the state database to disk. Zebra was not writing its state to /var/lib/zebra; instead, it wrote to the default internal path /home/zebra/.cache/zebra, which was ephemeral inside the container. As a result, every restart wiped the state and forced the node to start syncing from height 0.

Because the disk state never persisted, Zebra repeatedly:

• re-downloaded large sync ranges (this annoyed me a bit),

• rebuilt validation caches from scratch,

• hit memory limits at mid-sync,

• crashed before checkpointing anything to disk.

Third, at ~28–30% sync, Zebra crashed with:

unexpectedly exceeded configured peer set connection limit:
peers: 201

This likely happened because Zebra discovered too many peers too quickly. With limited memory, many connections stayed in a “ready” state, overflowing the peerset and triggering a panic. Before the fix, Zebra continually printed:

creating new database with the current format

because the ephemeral cache directory never survived a restart, making Zebra think the database was brand new each time.

Fixes applied:

1. Corrected the Docker volume mount to persist the actual Zebra state directory:

/home/zebra/.cache/zebra/state/v27/mainnet

This single change allowed the state database to persist properly.
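
A minimal sketch of what the corrected run looks like, assuming a named Docker volume called zebra-state (mounting the parent .cache/zebra directory covers the versioned state path above):

docker rm -f zebra
docker volume create zebra-state
docker run -d --name zebra -p 8232:8232 -p 8233:8233 \
  -v "$PWD/zebra.toml":/home/zebra/.config/zebrad.toml \
  -v zebra-state:/home/zebra/.cache/zebra \
  zfnd/zebra:latest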

2. Increased swap to give Zebra more memory headroom.

We expanded swap from 8 GB → 15 GB, providing enough buffer during heavy checkpoint verification and preventing kernel kills of Tokio threads.
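
For anyone repeating this, the extra swap was added as a plain swap file, roughly like this (the size is illustrative):

sudo fallocate -l 7G /swapfile2
sudo chmod 600 /swapfile2
sudo mkswap /swapfile2
sudo swapon /swapfile2
echo '/swapfile2 none swap sw 0 0' | sudo tee -a /etc/fstab
swapon --show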

3. Adjusted sync concurrency for low-RAM environments by replacing the overly aggressive defaults with a safer profile:

[sync]
download_concurrency_limit = 10
checkpoint_verify_concurrency_limit = 200
full_verify_concurrency_limit = 5
parallel_cpu_threads = 1

4. Reduced peer discovery pressure:

peerset_initial_target_size = 50
crawl_new_peer_interval = "300s"
max_connections_per_ip = 1
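
After editing the config, I just restart the container and watch peer behavior and panics in the logs (sketch):

docker restart zebra
docker logs -f zebra 2>&1 | grep -iE 'peer|panic'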

After applying all fixes:

• Zebra successfully reached mainnet tip (~3,035,219 at the time of writing)

• No further crashes

• Sync throughput stabilized as caches warmed

• Disk state persisted correctly across restarts

• Peer behavior normalized

• Node has been running without any panics or resets

Did you experience something like this during your setup? If yes, what did you do? Share a comment.
