How I Built the Senior Xray VPN Architect AI Skill

Introduction

When you work with narrow technical domains, one thing becomes obvious very quickly: telling a general-purpose LLM to "act like an expert" is not enough. The model may sound confident while mixing outdated practices with current ones, missing version constraints, ignoring client limitations, or failing to reason about DNS, routing, threat models, and real network conditions.

That problem is especially visible with Xray-core, VLESS, REALITY, XHTTP, split tunneling, and client VPN profiles. This is not a domain where "use this config, it is the best option" is a reliable answer. A setup that works for one client-server combination can break on another Xray-core version, another GUI client, another provider, or another threat model.

So I decided to build more than a prompt. I built a dedicated AI skill: Senior Xray VPN Architect. Its purpose is not to generate random configurations. Its job is to behave like an engineering assistant: clarify context, design architecture, review JSON, distinguish client-side and server-side routing, warn about risk, and give steps that can be checked.

What Senior Xray VPN Architect Is

Senior Xray VPN Architect is a domain-specific AI skill for authorized private VPN/proxy infrastructure built around Xray-core. It is meant for architecture, review, troubleshooting, migration, and operations work involving VLESS, REALITY, XHTTP, TLS, XTLS Vision, DNS routing, split tunneling, WARP outbound, CDN/reverse proxy setups, and different client applications.

It is important to draw the boundary clearly: this is not a "bypass everything" bot and it is not a collection of copy-paste tutorials. I designed it as a defensive private-network tool: for owned servers, owned clients, legitimate administration, privacy, and resilient connectivity.

The system has several layers:

the main behavior file: SKILL.md;
the architecture selection framework: decision-framework.md;
vetted architecture patterns: vetted-topologies.md;
XHTTP and REALITY reference notes: xhttp-reality.md;
DNS, routing, and split tunnel notes: dns-routing.md;
client compatibility notes: client-compatibility.md;
operations and monitoring notes: operations-monitoring.md;
primary sources for freshness checks: current-sources.md;
response templates: architecture-brief.template.md and config-review.template.md;
a Python linter for Xray JSON configs: lint_xray_config.py;
the skill interface descriptor: openai.yaml.

That structure is what makes the skill different from a long prompt. It has a role, rules, a knowledge base, work modes, templates, deterministic checks, and freshness requirements.

Why a System Prompt Is Not Enough

The first temptation is to write a large system prompt: "you are a senior architect, answer professionally." That is not enough.

Complex infrastructure is not described by one rule. A useful answer has to consider:

the Xray-core version;
the exact client and backend;
the import method: URI, subscription, raw JSON, or panel profile;
the server topology;
the DNS model;
routing policy;
threat model;
domain, CDN, and reverse proxy constraints;
acceptable operational complexity;
fallback and rollback.

If all of that is stuffed into a single prompt, the prompt turns into a long, brittle pile of instructions. I split the knowledge into separate reference files instead. The main SKILL.md sets behavior, and the reference files act as engineering memory.

Safety Boundaries as a Required Layer

Safety is a separate required layer. The skill should help only in authorized contexts: owned servers, owned devices, private infrastructure, defensive administration, configuration audits, and reliability work.

SKILL.md explicitly says that the skill should not help with hidden access, credential theft, attacks, botting, mass abuse, destructive scanning, or bypassing restrictions where the user has no authorization.

Another important rule is secret handling. The skill is instructed not to reveal UUIDs, private keys, WARP private keys, short IDs, session keys, tokens, subscription URLs, panel credentials, or sensitive IP addresses.

That matters not only while using the agent, but also when publishing the skill files. If I attach files to an article or a repository, they must be sanitized: no keys, tokens, real servers, temporary S3 links, presigned URLs, domains, or any data that could reconstruct a working infrastructure setup.

Work Modes: Different Tasks Need Different Answers

The same Xray question can mean very different things. A user may want to:

design a new architecture;
review an existing JSON config;
understand why a connection fails;
migrate from one scheme to another;
set up monitoring;
learn a concept without deploying anything.

So SKILL.md defines several modes:

architecture - topology design;
config-review - review of an existing config;
troubleshooting - problem diagnosis;
migration - transition from one scheme to another;
monitoring - operations, checks, and rotations;
explanation - concept explanation without a deployable setup.

This simple separation improves answer quality. The agent stops answering with generic theory and starts working in the correct mode.

For example, if the user brings JSON, the skill should not begin with a general explanation of VLESS. It should run a config-review: rank issues by severity (P0 is the most severe, then P1, then P2), explain where each problem is, why it is a risk, how to fix it, and how to verify the result. That is where config-review.template.md is used.

For architecture tasks, the output format is different: short conclusion, recommended topology, requirements, DNS/routing, client compatibility, monitoring, risks, and fallback. That structure is defined in architecture-brief.template.md.

Decision Framework: How the Skill Chooses an Architecture

One of the most important files is decision-framework.md. Its job is to prevent the model from choosing a transport just because it is fashionable.

In VPN/proxy infrastructure, it is easy to drift into slogans:

"VLESS is always best";
"REALITY solves everything";
"XHTTP is the most resilient";
"CDN hides everything";
"latest means correct".

Real architecture does not work that way. Every decision must be tied to constraints.

The decision framework forces the skill to inspect:

user goal: full tunnel, split tunnel, target services, low latency, resilience, maintainability;
threat model: passive observation, active probing, DNS leakage, provider blocking, target-service VPS blocking;
environment: country, ISP, VPS location, CDN, domain;
client devices and GUI support;
cost, complexity, updates, monitoring, and fallback.

After that, every candidate topology must be described with:

fit;
required versions and client support;
expected latency and overhead;
detection and failure surface;
operational burden;
fallback path.

That turns the answer from "install X" into an engineering comparison of trade-offs.
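
The six attributes listed above map naturally onto a small record type. The following is a minimal sketch, not something taken from decision-framework.md (which is prose): the names Candidate, missing_attributes, and comparison_table are hypothetical, and the sample values in the usage note are made up for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Candidate:
    """One candidate topology described by the six required attributes."""
    name: str
    fit: str
    required_versions: str
    latency_overhead: str
    detection_failure_surface: str
    operational_burden: str
    fallback_path: str


def missing_attributes(candidate: Candidate) -> list[str]:
    """Return the names of attributes left empty, so gaps stay visible."""
    return [f.name for f in fields(candidate) if not getattr(candidate, f.name).strip()]


def comparison_table(candidates: list[Candidate]) -> str:
    """Render candidates side by side, one attribute per line."""
    lines = []
    for f in fields(Candidate):
        row = " | ".join(getattr(c, f.name) or "<MISSING>" for c in candidates)
        lines.append(f"{f.name}: {row}")
    return "\n".join(lines)
```

A candidate created with an empty fallback_path is reported by missing_attributes as ["fallback_path"], which is the kind of gap the framework is meant to surface before any recommendation is made.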

Vetted Topologies: Patterns, Not Universal Recipes

vetted-topologies.md contains a set of architecture patterns. These are not recipes to apply blindly. They are patterns selected according to constraints.

Direct VLESS + REALITY + TCP/Vision

This is a simple option for personal infrastructure with a small number of moving parts. It can be a good choice when the client supports REALITY/Vision, the server IP is reachable from the required networks, and the user does not need a CDN or origin shielding.

Its limitations are also clear: the VPS IP is visible to the client network, the cover/serverName choice may need to be revisited over time, and GUI clients may import parameters incorrectly.

VLESS + XHTTP + REALITY

This scheme is useful only when XHTTP properties fit the actual task. It is not better merely because it is newer. The skill must check the Xray-core version, client version, XHTTP mode, and import method.

The combination XHTTP + REALITY + auto can be version-sensitive, which is why it is covered separately in xhttp-reality.md.

CDN XHTTP + TLS

This option makes sense when the user has a domain, a CDN/reverse proxy, and the willingness to maintain certificates, DNS, origin hardening, and CDN constraints.

But a CDN is not magic. It can alter how requests are handled, break streaming assumptions, add latency, or expose the origin if configured poorly.

WARP Outbound

WARP outbound is useful not as a universal anti-blocking tool, but as a separate egress path for services that dislike VPS IP addresses. This is a server-side routing task: the traffic has already reached the VPN server, and the server chooses how to send it out.

Cascade Entry + Exit

A cascade can separate the visible entry node from the exit node, but it increases latency, complexity, and the number of failure points.

Split-Tunnel RU/Direct + Foreign Proxy

This pattern is useful when local, banking, government, regional, or geography-sensitive services should go direct while the rest of the traffic goes through the proxy. The key is to understand where the routing policy is executed.

The Key Point: Split Tunneling Belongs on the Client Side

One of the most important architectural conclusions I added is this: split tunneling for the user's own traffic has to be solved on the client side or on a local gateway, not only on the VPN server.

Why?

The server sees only the traffic that has already entered the tunnel. If a request has already gone to the VPN server, the server can choose egress: freedom, WARP, cascade, or another proxy outbound. But the server can no longer send traffic "direct through the user's local ISP", because local direct routing must happen before the traffic enters the tunnel.

The correct separation is:

Client-side / gateway-side routing:
  local/private/RU domains/IPs -> direct through the user's local network
  selected foreign/default traffic -> proxy/VPN
  selected blocked/noisy traffic -> block

Server-side routing:
  traffic already inside the VPN tunnel -> freedom / WARP / cascade / another outbound

That means the skill must be able not only to provide server configuration, but also to generate client JSON with split tunnel rules when the client supports that format.

Example client-profile logic:

{
  "outbounds": [
    {
      "tag": "proxy",
      "protocol": "vless",
      "settings": {
        "vnext": [
          {
            "address": "<REDACTED_SERVER_DOMAIN>",
            "port": 443,
            "users": [
              {
                "id": "<REDACTED_UUID>",
                "encryption": "none"
              }
            ]
          }
        ]
      }
    },
    {
      "tag": "direct",
      "protocol": "freedom"
    },
    {
      "tag": "block",
      "protocol": "blackhole"
    }
  ],
  "routing": {
    "domainStrategy": "IPIfNonMatch",
    "rules": [
      {
        "type": "field",
        "ip": ["geoip:private"],
        "outboundTag": "direct"
      },
      {
        "type": "field",
        "domain": ["domain:ru", "domain:su", "domain:xn--p1ai"],
        "outboundTag": "direct"
      },
      {
        "type": "field",
        "network": "tcp,udp",
        "outboundTag": "proxy"
      }
    ]
  }
}

This is not a production config. It is an illustration of the principle. In a real task, the skill must ask for the client, version, core/backend, import method, DNS model, geosite/geoip dataset, and whether the client preserves custom routing rules.
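
To make the first-match behavior of such a rule list concrete, here is a small Python simulation. It is a simplified sketch under stated assumptions, not Xray's matcher: RULES is a hand-written stand-in for the profile above, Python's ipaddress.is_private only approximates geoip:private, and the re-resolution step implied by IPIfNonMatch is not modeled. All function names are hypothetical.

```python
import ipaddress

# Simplified mirror of the illustrative profile: specific rules first, catch-all last.
RULES = [
    {"match": "private_ip", "outboundTag": "direct"},
    {"match": "domain_suffix", "suffixes": ("ru", "su", "xn--p1ai"), "outboundTag": "direct"},
    {"match": "any", "outboundTag": "proxy"},
]


def parse_ip(target):
    """Return an ip_address object, or None if target is not an IP literal."""
    try:
        return ipaddress.ip_address(target)
    except ValueError:
        return None


def rule_matches(rule, host, ip):
    """Decide whether a single simplified rule applies to the target."""
    if rule["match"] == "private_ip":
        return ip is not None and ip.is_private
    if rule["match"] == "domain_suffix":
        # Like "domain:ru": the domain itself or any subdomain of it.
        return ip is None and any(host == s or host.endswith("." + s) for s in rule["suffixes"])
    return rule["match"] == "any"


def pick_outbound(target, rules=RULES):
    """Return the outboundTag of the first rule that matches target."""
    host = target.lower().rstrip(".")
    ip = parse_ip(host)
    for rule in rules:
        if rule_matches(rule, host, ip):
            return rule["outboundTag"]
    raise LookupError("no rule matched and there is no catch-all")
```

With this ordering, pick_outbound("example.ru") yields "direct" and pick_outbound("example.com") yields "proxy". Reversing the list puts the catch-all first, so every target goes to "proxy", which is why rule order matters.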

This logic is connected to both dns-routing.md and client-compatibility.md.

Why Client Compatibility Is Critical

A common mistake is assuming that if Xray-core supports a feature, every client will support it too. That is false.

client-compatibility.md states the principle directly: server-side Xray support does not guarantee client UI/import support.

Failure modes include:

a GUI client silently drops unknown fields;
URI import brings only connection parameters, not routing;
a subscription URL does not carry a full DNS/routing policy;
the client uses an old embedded core;
the client supports REALITY but not the required flow;
XHTTP exists on the server but not in the client UI;
NekoBox/NekoRay may use a different backend or translate the format;
v2rayNG may import a connection URI while routing remains a separate setting;
v2rayA/gateway scenarios depend on TUN, DNS, and router rules.

So the skill has to ask:

exact client name;
platform;
version;
core/backend;
import method;
whether REALITY parameters are supported;
whether the XHTTP mode is supported;
whether DNS/routing rules are preserved;
whether raw JSON mode exists.

Only then can the skill say where the split tunnel should live: client, local gateway, server, or a hybrid model.
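
The point about URI import can be illustrated by parsing a share link. The sketch below assumes the commonly used vless://id@host:port?params#label layout, which is a community convention rather than something this article defines. The function name parse_share_uri and the sample values are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_share_uri(uri: str) -> dict:
    """Extract connection parameters from a vless:// style share link."""
    parts = urlsplit(uri)
    if parts.scheme != "vless":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Keep the first value of each query parameter (for example type, security, flow).
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "id": parts.username,
        "address": parts.hostname,
        "port": parts.port,
        "params": query,
        "label": parts.fragment,
    }
```

Everything a link of this shape carries describes a single connection. There is no place in it for a rule list or a DNS policy, so split tunnel rules have to arrive through another channel, such as raw JSON or the client's own routing settings.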

DNS and Routing as a Separate Architecture Layer

A VPN config is not only inbound and outbound. DNS and routing are often more important than transport selection.

dns-routing.md captures several principles:

DNS outside the intended path can reveal domains;
browser, OS, client, and Xray may resolve names differently;
the architecture must define who resolves which domains and through which outbound;
routing rules should move from specific to broad;
the final catch-all rule should be explicit;
.ru, .su, .рф, and punycode need separate attention;
geosite/geoip datasets are not always fresh or identical across clients.

This is especially important for split tunneling. If the user wants local and Russian services to go direct while the rest goes through VPN, it is not enough to add a few domain rules. The skill has to check DNS path, direct/proxy path, private ranges, punycode, geosite/geoip, and actual client behavior.

That is why the skill should not answer "paste these rules." It should answer: "this is the model, this is where it runs, and this is how to verify it."
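
One verifiable detail is the punycode form of .рф. The sketch below uses Python's built-in idna codec to normalize a domain before comparing it against suffix rules. The helper names and the DIRECT_SUFFIXES tuple are hypothetical. The built-in codec implements the older IDNA 2003 rules, which is an assumption worth checking against the target client's behavior.

```python
# Suffixes treated as "direct" in this illustration; a real list is policy-specific.
DIRECT_SUFFIXES = ("ru", "su", "xn--p1ai")


def to_ascii_domain(domain: str) -> str:
    """Normalize a domain to lowercase ASCII (punycode) form for rule matching."""
    return domain.strip().rstrip(".").encode("idna").decode("ascii").lower()


def goes_direct(domain: str) -> bool:
    """True when the normalized domain equals or ends with a direct suffix."""
    ascii_domain = to_ascii_domain(domain)
    return any(ascii_domain == s or ascii_domain.endswith("." + s) for s in DIRECT_SUFFIXES)
```

Here to_ascii_domain("рф") returns "xn--p1ai", so a rule written only as "domain:рф" and a client that matches on the ASCII form can disagree unless both sides are normalized the same way.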

XHTTP and REALITY: Not Magic, but a Version-Sensitive Layer

XHTTP and REALITY are one reason the skill needs a freshness rule. These technologies cannot be treated as timeless recipes.

xhttp-reality.md notes that XHTTP grew out of SplitHTTP, and that common user-facing modes include auto, packet-up, stream-up, and stream-one. Their behavior depends on version. Old SplitHTTP snippets should not be reused without checking current xhttpSettings.

For XHTTP + REALITY troubleshooting, the skill should ask:

exact Xray-core version;
whether downloadSettings is present;
effective HTTP version: H1, H2, or H3;
whether the client GUI supports the required mode and fields;
how Browser Dialer behaves;
whether CDN/reverse proxy/TLS/ALPN affects behavior.

Another principle: REALITY should not be described as "fully undetectable." The skill should talk about detection surface, active probing resistance, operational fragility, cover domain strategy, and fallback. It should not promise absolute resilience.
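
The XHTTP + REALITY + auto combination can be flagged mechanically before any deeper review. This is a minimal sketch, assuming the streamSettings layout with network, security, and xhttpSettings (or the older splithttpSettings), and assuming that an absent mode means auto. The function name is hypothetical, and the check only says "look closer", not "this is broken".

```python
def needs_version_review(stream_settings: dict) -> bool:
    """True when streamSettings combines XHTTP (or SplitHTTP), REALITY, and mode auto."""
    if stream_settings.get("network") not in ("xhttp", "splithttp"):
        return False
    if stream_settings.get("security") != "reality":
        return False
    transport = (
        stream_settings.get("xhttpSettings")
        or stream_settings.get("splithttpSettings")
        or {}
    )
    # Assumption: a missing "mode" is treated the same as an explicit "auto".
    return transport.get("mode", "auto") == "auto"
```

A positive result is a prompt to ask for the exact Xray-core and client versions, not a verdict on the config.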

Source Freshness: When the Skill Must Not Answer from Memory

current-sources.md lists primary sources the skill should check for fresh or disputed questions:

Xray-core releases;
the Xray-core repository;
Project X documentation;
XHTTP: Beyond REALITY discussion;
Xray examples;
the REALITY repository;
transport config source;
XHTTP/SplitHTTP transport code;
client-specific docs and release notes.

This matters because documentation can lag behind release notes and source code. That is especially true for XHTTP, Browser Dialer, H2/H3, QUIC/finalmask, REALITY interaction, and GUI client support.

The general rule is simple: if the user asks "latest", "current", "2026", "does it work now?", "does this client support it?", or "how does this XHTTP mode behave?", the skill should not answer confidently from old memory. It should check primary sources.
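
As a rough illustration of that rule, a keyword check can decide whether a question should trigger a source lookup. This is only a sketch: the patterns are hypothetical examples derived from the phrases above, and the actual skill expresses the rule in prose rather than regular expressions.

```python
import re

# Hypothetical trigger patterns based on the phrases listed in the text.
FRESHNESS_PATTERNS = [
    r"\blatest\b",
    r"\bcurrent(ly)?\b",
    r"\b20\d{2}\b",
    r"\bwork(s|ing)? now\b",
    r"\bsupport(s|ed)?\b.*\bclient\b|\bclient\b.*\bsupport(s|ed)?\b",
    r"\bxhttp\b.*\bmode\b",
]


def requires_source_check(question: str) -> bool:
    """True when the question looks time-sensitive or version-sensitive."""
    text = question.lower()
    return any(re.search(pattern, text) for pattern in FRESHNESS_PATTERNS)
```

A check like this errs on the side of false positives, which is acceptable here: an unnecessary lookup costs less than a confident answer based on stale memory.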

The Linter as a Deterministic Check

One of the most useful components is lint_xray_config.py.

An LLM can inspect JSON and still miss obvious issues: a missing outbound, a wrong outboundTag, an implicit domainStrategy, debug logs, empty DNS servers, or a subtle XHTTP + REALITY + auto combination.

The linter does not replace a full manual review and it does not validate the entire Xray schema. But it catches basic architecture and hygiene signals:

JSON must parse;
the root must be a JSON object;
inbounds must exist;
outbounds must exist;
log.loglevel should not be debug in production;
XHTTP/SplitHTTP should be detected;
XHTTP + REALITY + auto requires version-specific review;
REALITY settings should be checked for basic fields;
DNS servers should be intentional;
routing rules should exist;
outboundTag must point to an existing outbound;
domainStrategy should be explicit;
the final routing rule should have understandable behavior.

So the skill gets not only reasoning, but also a small deterministic verification tool. That reduces hallucination risk and makes config review more reproducible.
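
To show what such checks can look like, here is a small sketch that covers a few of the items above. It is not the contents of lint_xray_config.py: the function name lint_config and the message wording are hypothetical, and only a subset of the listed checks is implemented.

```python
import json


def lint_config(text: str) -> list:
    """Return human-readable findings for a few basic hygiene checks."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"JSON does not parse: {exc.msg} (line {exc.lineno})"]
    if not isinstance(config, dict):
        return ["root must be a JSON object"]

    findings = []
    for key in ("inbounds", "outbounds"):
        if not config.get(key):
            findings.append(f"{key} is missing or empty")

    log = config.get("log")
    if isinstance(log, dict) and log.get("loglevel") == "debug":
        findings.append("log.loglevel is debug")

    # Collect declared outbound tags so rules can be checked against them.
    outbounds = config.get("outbounds")
    tags = set()
    if isinstance(outbounds, list):
        tags = {o.get("tag") for o in outbounds if isinstance(o, dict)}

    routing = config.get("routing")
    if not isinstance(routing, dict):
        routing = {}
    if "domainStrategy" not in routing:
        findings.append("routing.domainStrategy is implicit")
    rules = routing.get("rules")
    if not isinstance(rules, list) or not rules:
        findings.append("routing.rules is missing or empty")
        rules = []
    for index, rule in enumerate(rules):
        tag = rule.get("outboundTag") if isinstance(rule, dict) else None
        if tag is not None and tag not in tags:
            findings.append(f"rule {index}: outboundTag {tag!r} has no matching outbound")
    return findings
```

Because the function returns plain strings and has no side effects, the same input always produces the same findings, which is the property that makes the review reproducible.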

Response Templates: Predictable Output Instead of Freeform Reasoning

Even when a model knows the topic, it can forget an important layer. It may describe transport well but forget DNS. It may propose a topology but omit fallback. It may find a bug but fail to explain verification.

That is why the skill includes templates:

architecture-brief.template.md for design tasks;
config-review.template.md for config reviews.

An architecture answer should include:

context;
threat model;
recommended topology;
DNS and routing;
client compatibility;
monitoring and fallback;
open questions.

A config-review answer should include:

scope;
findings;
checks;
residual risks.

The format forces the agent to reason in layers instead of producing an unstructured stream of advice.
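
The findings part of a review can be modeled as structured data and rendered in severity order. The sketch below is an assumption about shape, not a copy of config-review.template.md: the Finding fields follow the "where, why, fix, verify" wording used earlier for config-review mode, and all names are hypothetical.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"P0": 0, "P1": 1, "P2": 2}


@dataclass
class Finding:
    severity: str  # "P0", "P1", or "P2"
    where: str     # location in the config
    risk: str      # why it matters
    fix: str       # proposed change
    verify: str    # how to confirm the fix worked


def render_findings(findings):
    """Render findings most severe first, one text block per finding."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    blocks = []
    for f in ordered:
        blocks.append(
            f"[{f.severity}] {f.where}\n"
            f"  Risk: {f.risk}\n"
            f"  Fix: {f.fix}\n"
            f"  Verify: {f.verify}"
        )
    return "\n\n".join(blocks)
```

Making verify a required field means a finding cannot be written down without saying how to check the fix.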

Operations and Monitoring: Life After Deployment

VPN/proxy infrastructure does not end with the first successful connection. It has to be updated, monitored, documented, and rolled back when needed.

operations-monitoring.md describes Day-2 operations:

keep a known-good rollback config;
do not keep debug logs in production;
protect panel/admin endpoints;
document the topology;
check server reachability;
check handshake/connect success;
measure latency from different networks;
verify DNS behavior;
verify direct/proxy routing;
monitor target-service reachability;
watch error rate, restart count, and traffic anomalies;
read release notes before major updates;
test updates on one client/profile first;
keep the old binary/config for rollback.

This turns the skill from a setup-time helper into an operations assistant.
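
The simplest of these checks, server reachability with a latency reading, needs only the standard library. This is a minimal sketch with a hypothetical function name. A successful TCP connect shows that the port answers, not that the proxy handshake works, so it covers only the first item of a real check list.

```python
import socket
import time


def tcp_check(host: str, port: int, timeout: float = 3.0):
    """Return (reachable, latency_ms); latency_ms is None on failure."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True, (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False, None
```

Running the same check from several networks and recording the results over time gives a baseline, which makes a later regression easier to notice.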

A Practical Scenario

Imagine this request:

> I want a personal Xray VPN. Russian and local services should go direct, everything else should go through VPN. The client is v2rayN or NekoBox. Some services that dislike VPS IP addresses may need to exit through WARP.

A generic answer might immediately propose server routing. Senior Xray VPN Architect should do something else.

First it identifies the mode: architecture or migration.

Then it asks:

Xray-core version;
client and version;
platform;
raw JSON or subscription;
whether TUN is needed;
where the routing policy should run;
which domains go direct;
which domains go proxy;
whether WARP is needed as server-side egress;
what DNS model is acceptable.

Then it separates the architecture:

Client-side:
  local/private/RU -> direct
  foreign/default -> proxy
  optional blocklist -> block

Server-side:
  accepted VPN traffic -> freedom / WARP / cascade / fallback

After that, the skill can provide sanitized client JSON, server-side topology, DNS/routing checks, compatibility risks, and a rollback plan.

The key point: it should not produce a "magic config." It should explain where each rule runs and how to verify that it actually works.

Preparing Files for Publication

Because the article may refer to skill files, those files must be cleaned before publication.

Before publishing, remove or replace:

real UUIDs;
private keys;
public keys if tied to working infrastructure;
WARP private keys;
short IDs;
session keys;
API keys;
tokens;
subscription URLs;
panel credentials;
SSH hostnames;
SSH usernames if they reveal infrastructure;
real VPS IP addresses;
real domains;
REALITY serverNames/cover domains if sensitive;
presigned/S3 links;
temporary links containing AWSAccessKeyId, Signature, token, or Expires;
.env;
cookies;
logs containing IPs, UUIDs, emails, user IDs, or tokens;
any working config that can be imported and used unchanged.

Useful placeholders:

<REDACTED_UUID>
<REDACTED_PRIVATE_KEY>
<REDACTED_PUBLIC_KEY>
<REDACTED_SHORT_ID>
<REDACTED_SERVER_IP>
<REDACTED_DOMAIN>
<REDACTED_SUBSCRIPTION_URL>
<REDACTED_PANEL_PASSWORD>
<EXAMPLE_XRAY_VERSION>
<EXAMPLE_CLIENT_VERSION>

Markdown, JSON, YAML, Python files, comments, commands, filenames, archive metadata, and git history should all be checked if the skill is published as a repository.
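
A first automated pass over text files can be done with a few regular expressions. The sketch below is deliberately over-eager and covers only UUIDs, IPv4 literals, and the presigned-link parameters named above. The function name sanitize and the generic <REDACTED> placeholder are hypothetical, and a pattern-based pass does not replace a manual review.

```python
import re

# Each entry pairs a pattern with its replacement; order matters only for readability.
REDACTIONS = [
    (
        re.compile(r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"),
        "<REDACTED_UUID>",
    ),
    # Matches any dotted quad, so it may also hit four-part version strings.
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<REDACTED_SERVER_IP>"),
    # Query parameters typical of presigned or temporary links.
    (
        re.compile(r"(?i)\b(AWSAccessKeyId|Signature|token|Expires)=[^&\s\"']+"),
        r"\1=<REDACTED>",
    ),
]


def sanitize(text: str) -> str:
    """Replace values that match known sensitive patterns with placeholders."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text
```

False positives are the cheaper error in this context: a redacted version string is easy to restore, while a leaked identifier cannot be taken back.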

A useful publication note is:

> All attached files are sanitized versions of the skill system. Real keys, UUIDs, IP addresses, domains, subscription URLs, tokens, presigned links, and any data that could reconstruct working infrastructure have been removed.

What I Got in the End

The result is not a "VPN bot." It is a small engineering system around an LLM.

Senior Xray VPN Architect can:

design topologies;
review JSON configs;
troubleshoot failures;
plan migrations;
explain concepts;
account for client-side split tunneling;
generate client JSON when the client supports it;
separate client routing, gateway routing, and server routing;
check DNS/routing logic;
account for client compatibility;
provide monitoring and fallback;
use primary sources for fresh questions;
avoid absolute promises.

My main conclusion is this: a good AI skill is not a long prompt. It is a system made of a role, safety boundaries, work modes, reference files, templates, a linter, and freshness rules.

That structure helps an LLM stop behaving like a generator of confident advice and become a useful technical assistant that thinks in architectural trade-offs.