{"slug": "windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing", "title": "Windows-Copilot-API; Access GPT-4 and GPT-5 models without API keys or billing", "summary": "A new open-source project called Windows-Copilot-API allows developers to access GPT-4 and GPT-5 models through Microsoft Copilot without API keys or billing by turning the free Copilot web interface into an API. The tool provides a Python library and an OpenAI-compatible local server, requiring only a Microsoft or Google account sign-in. It is an unofficial project not affiliated with Microsoft.", "body_md": "**Using your own Microsoft Copilot account.** No API key, no credits, no paid plan: it turns the free chat at [copilot.microsoft.com](https://copilot.microsoft.com) into an API you can call from code.\n\nYou can use it in two ways:\n\n- 🐍 **As a Python library:** just call `client.chat(\"Hi\")`. Supports streaming and multi-turn conversations.\n- 🔌 **As a local OpenAI-compatible API:** runs a server at `http://localhost:8000/v1` that speaks the OpenAI format, so the official `openai` SDK (and any OpenAI-compatible app) works as a drop-in, with `localhost` in place of OpenAI.\n\nYou sign in once in a browser with your Microsoft **or Google** account; your session is saved and refreshed automatically after that.\n\n**Unofficial project.** Not affiliated with or endorsed by Microsoft. 
It automates the consumer Copilot web experience for personal use, so use it responsibly and within Microsoft's terms.\n\n- [Why use this?](#why-use-this)\n- [Requirements](#requirements)\n- [Setup (2 minutes)](#setup-2-minutes)\n- [Run with Docker (optional)](#run-with-docker-optional)\n- [Usage 1: In Python (no server)](#usage-1-in-python-no-server)\n- [Usage 2: As an OpenAI-compatible server](#usage-2-as-an-openai-compatible-server)\n- [Command line](#command-line)\n- [Concurrency & stress test](#concurrency--stress-test)\n- [Rate limiting](#rate-limiting)\n- [Project layout](#project-layout)\n- [Notes & limitations](#notes--limitations)\n- [Troubleshooting](#troubleshooting)\n- [Collaboration & support](#collaboration--support)\n- [License](#license)\n- [Star History](#star-history)\n\n## Why use this?\n\n- **Free:** uses your normal signed-in Copilot, no API billing.\n- **Drop-in OpenAI replacement:** point any OpenAI client at `localhost` and it just works.\n- **Works everywhere you're signed in:** the signed-in path works even in regions where *anonymous* Copilot is blocked (e.g. India).\n- **Streaming + conversations:** token-by-token output and multi-turn threads addressed by `conversation_id`.\n\n## Requirements\n\n- **Python 3.9+**\n- A **Microsoft account** (the free one you use for Copilot is fine)\n- Works on Windows, macOS, and Linux\n\n## Setup (2 minutes)\n\n```\n# 1. Clone the project\ngit clone <your-repo-url>\ncd Windows-Copilot-API\n```\n\n**2. Create and activate a virtual environment**\n\nOn **macOS / Linux**:\n\n```\npython3 -m venv venv\nsource venv/bin/activate\n```\n\nOn **Windows** (PowerShell):\n\n```\npython -m venv venv\nvenv\\Scripts\\Activate.ps1\n```\n\nOn Windows you may need to allow script execution once: `Set-ExecutionPolicy -Scope CurrentUser RemoteSigned`. In `cmd.exe`, activate with `venv\\Scripts\\activate.bat` instead.\n\n**3. 
Install dependencies and sign in**\n\n```\n# Install dependencies\npip install -r requirements.txt\n\n# Install the browser Playwright needs (one-time)\nplaywright install chromium\n\n# Sign in once: a browser opens, log into your Microsoft or Google account\npython -m copilot login\n```\n\nThe browser **closes by itself** once sign-in is detected; you don't need to press Enter or close it manually. After sign-in it sends one short warm-up message that mints the chat token **and** passes Cloudflare's \"verify you're human\" check in the same step (a brief \"finishing setup…\" appears, and a tiny throwaway chat lands in your history). If a checkbox shows up, click it in that login window. The steps are logged to `session/login.log` if anything goes wrong. That's it: your session is saved under `session/` (git-ignored, never shared) and reused on every run, so your first request works right away.\n\n🛠️ **Run into trouble during setup or your first run?** Head to the [Troubleshooting](#troubleshooting) section: the bundled diagnostic both fixes common issues (captcha/clearance) and logs a shareable report.\n\n## Run with Docker (optional)\n\nPrefer a container? You can run the OpenAI-compatible server in Docker once you've signed in.\n\n**Sign in on the host first.** The login step above opens a visible browser, which can't run inside the headless container, so run `python -m copilot login` on your host to populate `session/`. The container mounts that folder and reuses the Cloudflare clearance earned on the host. 
It refreshes the chat token headlessly, but it can't earn fresh clearance without a visible browser, so when clearance expires (~30 min) it returns a `503`; re-run `python -m copilot login` on the host to refresh `session/`.\n\n```\ndocker compose up --build\n# -> Copilot OpenAI-compatible API on http://localhost:8000\n```\n\nThe [docker-compose.yml](/sums001/Windows-Copilot-API/blob/master/docker-compose.yml) maps port `8000` and bind-mounts your `session/` so the login persists across restarts. Tune `RATE_LIMIT_RPM` / `RATE_LIMIT_BURST` there. To run without Compose, build and pass the same bindings by hand:\n\n```\ndocker build -t windows-copilot-api .\ndocker run --rm -p 8000:8000 -v \"$(pwd)/session:/app/session\" windows-copilot-api\n```\n\n## Usage 1: In Python (no server)\n\nThe simplest way if your code is already Python.\n\n``` python\nfrom copilot import CopilotClient\n\nclient = CopilotClient()                 # loads your signed-in session\n\n# Get a full reply\nreply = client.chat(\"Say hello in one short sentence.\")\nprint(reply.text)\n\n# Continue the SAME conversation — pass the id back\nreply2 = client.chat(\"And now in French?\", reply.conversation_id)\nprint(reply2.text)\n\n# Stream the answer as it's typed\nfor chunk in client.stream(\"Tell me a short joke\"):\n    print(chunk, end=\"\", flush=True)\n```\n\n`chat()` returns the full text plus a `conversation_id`; pass that id back to keep the thread going, or omit it to start fresh. 
`stream()` yields the reply piece by piece.\n\n👉 More: [examples/01_direct_chat.py](/sums001/Windows-Copilot-API/blob/master/examples/01_direct_chat.py), [02_direct_conversation.py](/sums001/Windows-Copilot-API/blob/master/examples/02_direct_conversation.py), [03_direct_stream.py](/sums001/Windows-Copilot-API/blob/master/examples/03_direct_stream.py)\n\n## Usage 2: As an OpenAI-compatible server\n\nStart a local server that speaks the OpenAI API, so existing OpenAI tools and SDKs work unchanged.\n\n```\npython app.py\n# -> Copilot OpenAI-compatible API on http://127.0.0.1:8000\n```\n\nThen point any OpenAI client at it (the API key is required by the SDK but ignored):\n\n``` python\nfrom openai import OpenAI\n\nclient = OpenAI(base_url=\"http://localhost:8000/v1\", api_key=\"unused\")\n\nresp = client.chat.completions.create(\n    model=\"copilot\",\n    messages=[{\"role\": \"user\", \"content\": \"Hello!\"}],\n)\nprint(resp.choices[0].message.content)\n```\n\nOr call it with plain HTTP / `curl`:\n\n```\ncurl http://localhost:8000/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"messages\": [{\"role\": \"user\", \"content\": \"Hello!\"}]}'\n```\n\n**Endpoints**\n\n| Method | Path | Description |\n|---|---|---|\n| `POST` | `/v1/chat/completions` | Chat (supports `\"stream\": true` and an optional `\"conversation_id\"`) |\n| `GET` | `/v1/models` | Lists the single `copilot` model |\n\nChange the address with env vars: `HOST=0.0.0.0 PORT=8080 python app.py`, or run `uvicorn server.api:app --host 0.0.0.0 --port 8080`.\n\n👉 More: [examples/04_server_http.py](/sums001/Windows-Copilot-API/blob/master/examples/04_server_http.py), [05_server_stream.py](/sums001/Windows-Copilot-API/blob/master/examples/05_server_stream.py), [06_server_openai_sdk.py](/sums001/Windows-Copilot-API/blob/master/examples/06_server_openai_sdk.py)\n\n## Command line\n\n```\npython -m copilot login          # sign in and save the session\npython -m copilot ask \"Hello!\"   # quick one-shot 
question\n```\n\nCopilot's chat sits behind Cloudflare. Access needs a `cf_clearance` cookie,\nearned by passing a \"verify you're human\" check in a real browser, and it lasts\nabout half an hour. The bridge handles this for you:\n\n- **At sign-in:** `python -m copilot login` earns clearance as part of the same warm-up that mints your token, so your first request works immediately. If Cloudflare shows a checkbox, click it in the login window.\n- **When it expires:** if a later request hits the gate, the bridge opens a browser, passes the check (the checkbox is clicked automatically, or you click it if one appears), and retries the request for you. You'll see a short `[copilot] clearance: …` progress log, then the answer.\n\nOn a trusted connection the check often passes invisibly with no window at all. A datacenter/VPN IP is stricter and more likely to show the checkbox; a residential connection clears most reliably.\n\nThe **server** never opens a window: when clearance expires it returns a `503` (`type: \"clearance_required\"`). Re-clear out of band with `python -m copilot login`, then retry.\n\n## Concurrency & stress test\n\nThe server bridges a **single** signed-in Copilot account, and Copilot's chat\nsocket doesn't tolerate concurrent conversations from one process. So the server\n**serializes** upstream calls: parallel HTTP requests queue behind a lock and run\none at a time (see [server/api.py](/sums001/Windows-Copilot-API/blob/master/server/api.py)). 
This is intentional, and it\nmeans throughput is sequential, not parallel.\n\nYou can measure where it breaks with the included stress test, which fires a\nbatch of simultaneous requests and **doubles the batch size every successful\nround** until the first error:\n\n```\n# Start the server in one terminal\npython app.py\n\n# Ramp concurrency in another (1 → 2 → 4 → 8 → …)\npython tests/stress.py\npython tests/stress.py --max 64 --timeout 120 --url http://localhost:8000\n```\n\n**Sample run** (one signed-in account):\n\n| Concurrency | Result | Wall time | Latency (min / median / max) |\n|---|---|---|---|\n| 1 | ✓ all ok | 3.7s | 3.7 / 3.7 / 3.7s |\n| 2 | ✓ all ok | 4.6s | 3.4 / 4.6 / 4.6s |\n| 4 | ✓ all ok | 8.3s | 3.7 / 6.7 / 8.3s |\n| 8 | ✗ 1 failed (`HTTP 502`) | 13.3s | 3.5 / 9.7 / 13.3s |\n\n**Highest fully-successful concurrency: 4.** Wall time grows with each round\nwhile *minimum* latency stays flat (~3.5s), the signature of a serialized queue:\none request runs immediately, the rest wait their turn. The failure at 8 is an\nupstream `502` (Copilot rejecting requests under load), not a server crash or\ntimeout, so the exact break point may vary between runs.\n\nTakeaway: keep concurrent in-flight requests low (≈ 1–4). This is a personal bridge, not a high-throughput gateway, so please don't hammer your account.\n\n## Rate limiting\n\nConcurrency (above) is *how many at once*; the **rate limit** is *how many per\nminute, sustained*. Microsoft publishes none for consumer Copilot, so the bridge\nenforces a self-imposed one with a [token bucket](/sums001/Windows-Copilot-API/blob/master/server/ratelimit.py): it caps\naccepted requests per minute and returns a standard `429` + `Retry-After` when\nyou exceed it. Two env vars tune it:\n\n| Env var | Default | Meaning |\n|---|---|---|\n| `RATE_LIMIT_RPM` | `12` | Requests/minute the bridge accepts. `0` disables the limit. |\n| `RATE_LIMIT_BURST` | `4` | How many requests may go back-to-back before pacing kicks in. |\n\n```\nRATE_LIMIT_RPM=20 RATE_LIMIT_BURST=5 python app.py   # raise it; 0 to disable\n```\n\nThe default 12 rpm sits safely below the ~15 rpm where a single account starts\nseeing upstream `502`s. To find *your* ceiling, run the server with the limiter\noff (`RATE_LIMIT_RPM=0`) and push the probe until failures appear:\n\n```\npython tests/ratelimit.py --rpm 20 --minutes 3\n```\n\n**On the client side, use exponential backoff.** Both `429` (bridge limit) and\nthe occasional `502` (Copilot upstream hiccup) are transient: retry with\ngrowing delays (e.g. 1s, 2s, 4s) and they almost always clear. The official\n`openai` SDK does this automatically and honours `Retry-After`; with plain HTTP,\nadd a few retries yourself.\n\n## Project layout\n\n| Path | What it does |\n|---|---|\n| | `CopilotClient`, auth, browser sign-in, HTTP driver |\n| [server/](/sums001/Windows-Copilot-API/blob/master/server) | |\n| [examples/](/sums001/Windows-Copilot-API/blob/master/examples) | ([examples/README.md](/sums001/Windows-Copilot-API/blob/master/examples/README.md)) |\n| [tests/](/sums001/Windows-Copilot-API/blob/master/tests) | ([tests/stress.py](/sums001/Windows-Copilot-API/blob/master/tests/stress.py)) and the diagnostic & report tool ([tests/diagnostic.py](/sums001/Windows-Copilot-API/blob/master/tests/diagnostic.py)) |\n| [app.py](/sums001/Windows-Copilot-API/blob/master/app.py) | |\n\n## Notes & limitations\n\n- **Sign in once, then reuse.** The cached token refreshes automatically; you only re-sign-in if the session fully expires.\n- **No daily limit, but be reasonable.** Microsoft doesn't impose a daily chat cap, but please use it in moderation, and don't spam or hammer it with automated bulk requests.\n- **One model.** Copilot has no model picker, so the server advertises a single model named `copilot`.\n- **Roughly GPT-4 class.** On GPQA Diamond (198 graduate-level questions, closed-book) it scores **40.9%**, which puts it in the GPT-4 family rather than 
the reasoning tier (o1/o3). Measured with [tests/gpqa_bench.py](/sums001/Windows-Copilot-API/blob/master/tests/gpqa_bench.py).\n- **Your session is private.** Everything in `session/` (cookies + token) stays on your machine and is git-ignored.\n\n## Troubleshooting\n\nCloudflare clearance is handled automatically (see above), so most \"verify you're human\" issues clear themselves. If a request still fails, run the diagnostic: it refreshes the session and writes a shareable report.\n\n```\npython tests/diagnostic.py                # browser capture + report\npython tests/diagnostic.py --report-only  # headless/VPS: report only, no browser\n```\n\nThe default run opens your signed-in browser and asks you to send one short message. That single action:\n\n- **Refreshes clearance:** it drives a *real* browser on the same `session/profile/` the bridge uses, so passing any \"verify you're human\" check earns a fresh `cf_clearance` cookie, then snapshots the session (cookies + token) into `session/token.json` for the pure-HTTP driver to adopt.\n- **Captures the protocol** to `session/ws_capture.log`. A clean turn goes `setOptions` → `send` → `appendText…` → `done`; a `{\"event\":\"challenge\", \"method\":\"cloudflare\",…}` frame means Cloudflare gated the turn.\n\nIt also writes `session/diagnostic_report.txt`: environment, the *shape* of your\nsession (cookie names + token length, never the values), a live chat probe, and\nredacted log tails. **Both files are safe to share:** access tokens, cookies,\nOAuth codes, and emails are redacted before anything is written. Attach\n`diagnostic_report.txt` to a GitHub issue (skim it first) and the cause is\nusually obvious.\n\nOn a headless **server/VPS** you can't open a browser, so clearance can't be earned there; pass `--report-only`, and do the clearance step on a machine with a display (or route traffic through a residential connection, e.g. 
a home-PC exit node), since datacenter IPs are where Cloudflare is strictest.\n\n## Collaboration & support\n\nNeed a hand getting this running? Open a [GitHub issue](/sums001/Windows-Copilot-API/issues) for bugs (for setup/auth problems, attach the redacted `diagnostic_report.txt` from `python tests/diagnostic.py`), start a [discussion](/sums001/Windows-Copilot-API/discussions) to share ideas, or send a pull request.\n\nAnd if you're working on something interesting, or looking for someone to build it, I'm always open to a chat. Feel free to reach out:\n\n- X: [@sums001](https://x.com/sums001)\n- Email: [devsum0101@gmail.com](mailto:devsum0101@gmail.com)\n- Discord: `sum_s_s`\n\n## License\n\nReleased under the [MIT License](/sums001/Windows-Copilot-API/blob/master/LICENSE). As this is an unofficial project, you remain responsible for complying with Microsoft's terms of service.", "url": "https://wpnews.pro/news/windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing", "canonical_source": "https://github.com/sums001/Windows-Copilot-API", "published_at": "2026-06-26 10:07:09+00:00", "updated_at": "2026-06-26 10:35:17.377277+00:00", "lang": "en", "topics": ["large-language-models", "ai-tools", "developer-tools"], "entities": ["Microsoft", "Copilot", "GPT-4", "GPT-5", "OpenAI"], "alternates": {"html": "https://wpnews.pro/news/windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing", "markdown": "https://wpnews.pro/news/windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing.md", "text": "https://wpnews.pro/news/windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing.txt", "jsonld": "https://wpnews.pro/news/windows-copilot-api-access-gpt-4-and-gpt-5-models-without-api-keys-or-billing.jsonld"}}