April 16, 2026

How I Test 13 SDKs Against a Live API with One GitHub Actions Workflow

api-alerts · sdk · github-actions · testing · ci

I maintain 13 SDKs for API Alerts. JavaScript, Python, Kotlin, Swift, Go, Rust, PHP, C#, Ruby, Dart, Godot, a CLI, and a GitHub Actions integration. Most of them were generated with AI assistance, which means I needed a way to actually trust them before shipping them to users.

The solution I landed on is a single repo, apialerts/integration-tests, that checks out every SDK, runs its sample script against the real production API, and confirms the event actually arrives at my backend. No mocks. No stubs. The real thing.

Here’s how it works.

The problem with AI-generated SDKs

When you use Claude Code or any other AI tool to generate an SDK, you get code that looks correct. The structure is right, the error handling is there, the types make sense. But “looks correct” and “works against a real API” are two different things.

I needed to know that every SDK could:

  1. Authenticate with a real API key
  2. Send a minimal event (just a message, everything else optional)
  3. Send a full event (all fields populated)
  4. Handle the response without breaking

And I needed to know this across 13 different languages and runtimes, before every release.

The architecture

Each SDK repo has a sample/ directory with a script that exercises exactly that flow: it sends a minimal event and a full event using a dedicated test workspace API key.

The integration-tests repo contains nothing except a GitHub Actions workflow. No code lives there. All the actual test logic lives in the SDK repos themselves. The workflow just:

  1. Checks out the SDK repo
  2. Sets up the required runtime (Node, Python, JVM, Swift toolchain, etc.)
  3. Runs the sample script with the test API key injected as a secret
  4. Lets the result speak for itself: the event either arrives at my backend or it doesn’t
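Those four steps map onto a short workflow job. Here’s a hedged sketch of what one job might look like — the job name, secret name, action versions, and sample path are my assumptions, not the actual contents of apialerts/integration-tests:

```yaml
# Illustrative sketch of one SDK job — names and paths are assumptions.
name: integration-tests

on:
  workflow_dispatch:   # manual trigger only (see "The trigger" below)

jobs:
  go:
    runs-on: ubuntu-latest
    steps:
      # 1. Check out the SDK repo, not the integration-tests repo
      - uses: actions/checkout@v4
        with:
          repository: apialerts/apialerts-go

      # 2. Set up the required runtime
      - uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      # 3. Run the sample with the test API key injected as a secret.
      # 4. The job fails if the script errors; the event either
      #    arrives at the backend or it doesn't.
      - name: Run sample against production
        env:
          APIALERTS_API_KEY: ${{ secrets.TEST_WORKSPACE_API_KEY }}
        run: go run ./sample
```

Each additional SDK is just another job with a different checkout target and runtime setup step.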

The Go SDK (apialerts-go/sample/github.go) is the reference implementation I used as the pattern for all the others.

What gets tested

Each sample sends two events.

Minimal. Only a message field. This validates that all the other fields are genuinely optional and the SDK doesn’t blow up when they’re omitted.

Full. All fields: message, channel, event, title, tags, link, and data. This validates that the full payload serialises correctly and the API accepts it.

If both events arrive at my test workspace, the SDK is working. If they don’t, something is broken and I know before it ships.

The trigger

The workflow is manual only, using workflow_dispatch. I run it before any coordinated release across the SDKs. It’s not running on every commit because that would hammer the production API unnecessarily and burn through rate limits. It’s a deliberate “I’m about to ship, let me verify everything works” gate.

Why this approach

A few things I deliberately chose here.

Real production API, not a staging environment. If something only breaks in production, a staging test won’t catch it. I want to know the actual endpoint the actual SDK is hitting actually works.

Sample scripts live in the SDK repos. This means the test is always in sync with the SDK. If the SDK changes its interface, the sample has to change too, or it won’t compile. The integration test repo never goes stale because it just runs whatever the SDK says to run.

One repo, one workflow. I could have added integration test jobs to each individual SDK repo, but that would mean duplicating the test API key secret across 13 repos and having 13 separate places to check if something is failing. Centralising it means one place to look.

The status table

The README has a simple status table showing which SDKs have samples ready and which jobs are active. Right now everything is green.

| SDK | Sample ready | Job active |
| --- | :---: | :---: |
| C# | ✓ | ✓ |
| CLI | ✓ | ✓ |
| Dart | ✓ | ✓ |
| Go | ✓ | ✓ |
| Godot | ✓ | ✓ |
| JS | ✓ | ✓ |
| Kotlin | ✓ | ✓ |
| Notify Action | ✓ | ✓ |
| PHP | ✓ | ✓ |
| Python | ✓ | ✓ |
| Ruby | ✓ | ✓ |
| Rust | ✓ | ✓ |
| Swift | ✓ | ✓ |

That table is also useful for me psychologically. When every row is green before a release, I can ship with confidence instead of hoping nothing is broken.

What it doesn’t test

To be clear about the limitations, this is an integration smoke test, not a comprehensive test suite. It doesn’t test:

  • Error handling (what happens when you send a bad API key, or a malformed payload)
  • Edge cases in the routing logic
  • Rate limiting behaviour
  • Whether the code is idiomatic for each language

Those are things I rely on code review, user feedback, and the private beta to surface. The integration test just answers one question: does it work at all?

Adding a new SDK

When I add a new SDK the process is simple:

  1. Add a sample/ directory to the SDK repo with a minimal and full event script
  2. Uncomment the corresponding job in the integration workflow
  3. Run it
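In practice, step 2 is just deleting comment markers in the workflow file. A hedged sketch of what that might look like — job name, secret name, and paths are my assumptions about the real workflow:

```yaml
  # Disabled until apialerts-ruby has a sample/ — remove the leading
  # '#' markers to activate the job (illustrative, not the real file):
  #
  # ruby:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - uses: actions/checkout@v4
  #       with:
  #         repository: apialerts/apialerts-ruby
  #     - uses: ruby/setup-ruby@v1
  #       with:
  #         ruby-version: '3.3'
  #     - name: Run sample against production
  #       env:
  #         APIALERTS_API_KEY: ${{ secrets.TEST_WORKSPACE_API_KEY }}
  #       run: ruby sample/sample.rb
```

Because the job skeleton already exists, adding an SDK never touches the secret or the trigger — only the one job block.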

That’s it. The workflow handles the rest.

Why I’m writing this

API Alerts 2.0 is coming out soon. It’s a significant overhaul with event routing, multiple destinations, glob pattern filtering, and a bunch of new integrations. Having 13 SDKs that I can verify against production in a single workflow run, before I flip the switch, is the thing that’s letting me ship with confidence as a solo developer.

If you’re building a multi-SDK product and trying to figure out how to test it without a QA team, this approach has worked well for me. The repo is public if you want to look at how the workflow is structured: github.com/apialerts/integration-tests.