Your team just shipped a new personalization API. Marketing ran a campaign. Traffic spiked. Latency tripled. Dashboards lit up. Postmortem pain followed.
The root cause wasn’t a missing feature – it was missing evidence. No baseline, no capacity model, no load profile, no automated performance guardrails.
This week, you decide to change that. Enter Grafana k6 – an open source, developer-centric performance testing tool that feels like writing integration tests, but for traffic at scale.
In this guide, we’ll rebuild performance confidence step‑by‑step – the same way a real team would.
Why k6?
Before diving into code, let’s understand what makes k6 special:
| Need | k6 Solution |
|---|---|
| Developer-friendly | Write tests in JavaScript |
| Shift-left testing | Run locally, in CI, Docker, or k6 Cloud |
| Production realism | Model real traffic with scenarios, arrival-rate, ramping patterns |
| Built-in validation | Checks, thresholds, tags, custom metrics out of the box |
| Real-time observability | Native outputs to Prometheus, InfluxDB, Grafana |
| CI/CD ready | Deterministic exit codes based on SLO thresholds |
| Extensible | xk6 extensions for gRPC, Redis, Kafka, WebSockets, Browser testing |
The k6 Philosophy:
```mermaid
graph LR
    A[Write Test<br/>JavaScript] --> B[Define Scenarios<br/>Traffic Patterns]
    B --> C[Set Thresholds<br/>SLO Gates]
    C --> D[Run Test<br/>Local or CI]
    D --> E[Stream Metrics<br/>Grafana/Prometheus]
    E --> F[Pass/Fail<br/>Based on SLOs]
    style A fill:#e1f5ff
    style C fill:#fff4e1
    style F fill:#ffe1e1
```
First Script: Baseline Request
Create scripts/smoke.js:
```javascript
import http from 'k6/http';
import { sleep, check } from 'k6';

export const options = {
  vus: 5,           // 5 virtual users
  duration: '30s',  // Run for 30 seconds
};

export default function () {
  const res = http.get('https://api.example.com/health');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}
```
Run it:
```bash
k6 run scripts/smoke.js
```
What You’ll See:
```text
✓ status is 200
✓ response < 200ms

checks.........................: 100.00% ✓ 150 ✗ 0
http_req_duration..............: avg=145ms p(95)=178ms
http_reqs......................: 150 5/s
iterations.....................: 150 5/s
vus............................: 5 min=5 max=5
```
Understanding the Output:
- checks: Your validation rules (100% pass rate = healthy)
- http_req_duration: Response time stats (p95 is your friend)
- http_reqs: Total requests and requests per second
- iterations: How many times each VU completed the function
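If you want to keep these numbers around, for example to diff runs over time, k6's `handleSummary()` hook lets you write the end-of-test summary wherever you like. A minimal sketch (the `summary.json` filename is just an example):

```javascript
export function handleSummary(data) {
  return {
    // Write the full end-of-test summary to a file for later comparison.
    // Note: defining handleSummary replaces the default console summary
    // unless you also return a 'stdout' key.
    'summary.json': JSON.stringify(data, null, 2),
  };
}
```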
```mermaid
sequenceDiagram
    participant K6
    participant API
    loop Every VU (5 in parallel)
        K6->>API: GET /health
        API-->>K6: 200 OK (145ms)
        K6->>K6: check(status == 200)
        K6->>K6: check(duration < 200ms)
        K6->>K6: sleep(1s)
    end
    Note over K6: Repeat for 30 seconds
```
Congratulations! You just ran your first load test. But we’re not stopping here—let’s add some teeth to it.
Add SLO Guardrails with Thresholds
Thresholds fail the test if SLOs degrade – perfect for CI.
```javascript
export const options = {
  vus: 10,
  duration: '1m',
  thresholds: {
    // Error rate must be < 1%
    http_req_failed: ['rate<0.01'],
    // P95 latency must be < 400ms AND average < 250ms
    http_req_duration: ['p(95)<400', 'avg<250'],
    // At least 99% of checks must pass
    checks: ['rate>0.99'],
  },
};
```
If any threshold is violated, k6 exits with a non-zero status and the pipeline fails.
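By default k6 evaluates thresholds continuously but lets the test run to completion. If a clearly broken run should stop early, thresholds can also be written in object form with `abortOnFail`. A small sketch with illustrative values:

```javascript
export const options = {
  thresholds: {
    http_req_failed: [
      // Stop the whole test as soon as the error rate crosses 1%,
      // but wait 10s first so a few early errors don't abort a cold start.
      { threshold: 'rate<0.01', abortOnFail: true, delayAbortEval: '10s' },
    ],
  },
};
```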
k6 Executors Explained
k6 provides several executor types to model different traffic patterns. Here's a complete script that combines three scenarios, each using a different executor:
scenarios.js:
```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    ramp_up: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '1m', target: 50 },
        { duration: '2m', target: 50 },
        { duration: '30s', target: 0 },
      ],
      exec: 'browse',
    },
    sustained_api: {
      executor: 'constant-arrival-rate',
      rate: 100, // requests per second
      timeUnit: '1s',
      duration: '3m',
      preAllocatedVUs: 60,
      maxVUs: 120,
      exec: 'api',
    },
    spike: {
      executor: 'per-vu-iterations',
      vus: 100,
      iterations: 1,
      gracefulStop: '30s',
      exec: 'login',
    },
  },
  thresholds: {
    // The {type:api} filter relies on the tag set in the api() function below
    'http_req_duration{type:api}': ['p(95)<500'],
    // k6 automatically tags every metric with the scenario name
    'checks{scenario:spike}': ['rate>0.99'],
  },
};

export function browse() {
  http.get('https://example.com/');
}

export function api() {
  const res = http.get('https://api.example.com/v1/products', { tags: { type: 'api' } });
  check(res, { '200': (r) => r.status === 200 });
}

export function login() {
  const payload = JSON.stringify({ user: 'test', pass: 'secret' });
  const headers = { 'Content-Type': 'application/json' };
  const res = http.post('https://api.example.com/login', payload, { headers });
  check(res, { 'login ok': (r) => r.status === 200 });
}
```
Understanding the Three Scenarios
This test simulates three real-world traffic patterns running simultaneously:
| Scenario | What It Does | Why It Matters |
|---|---|---|
| ramp_up | 0→50→50→0 users over 3.5 minutes | Tests warm-up, cache loading, connection pools |
| sustained_api | 100 requests/second for 3 minutes | Validates steady-state capacity and SLA compliance |
| spike | 100 users login simultaneously | Tests auth system under sudden traffic burst |
k6 ramping-vus Executor
The ramping-vus executor controls the number of virtual users over time. Use it to simulate gradual user growth.
```javascript
{
  executor: 'ramping-vus',
  startVUs: 0,
  stages: [
    { duration: '1m', target: 50 },  // Gradual increase to 50 users
    { duration: '2m', target: 50 },  // Hold steady at 50 users
    { duration: '30s', target: 0 },  // Ramp down to 0
  ],
}
```
| Setting | Description |
|---|---|
| `startVUs` | Initial number of virtual users |
| `stages` | Array of ramp stages with duration and target VU count |
| `gracefulRampDown` | Time to wait for iterations to finish when ramping down |
When to use: Morning traffic simulation, cache warm-up testing, gradual load increase.
Behavior: If your API slows down, users wait longer and RPS drops naturally—just like real users.
k6 constant-arrival-rate Executor
The constant-arrival-rate executor maintains a fixed requests-per-second (RPS) regardless of response time.
```javascript
{
  executor: 'constant-arrival-rate',
  rate: 100,            // 100 requests per second
  timeUnit: '1s',       // Rate is per second
  duration: '3m',       // Run for 3 minutes
  preAllocatedVUs: 60,  // Start with 60 VUs
  maxVUs: 120,          // Scale up to 120 if needed
}
```
| Setting | Description |
|---|---|
| `rate` | Number of iterations to start per `timeUnit` |
| `timeUnit` | Period of time over which the rate applies (default: `'1s'`) |
| `preAllocatedVUs` | Initial VU pool size |
| `maxVUs` | Maximum VUs k6 can scale to |
When to use: SLA validation, capacity planning, “can we handle X RPS?” testing.
Behavior: If your API slows down, k6 spins up more VUs to maintain the target RPS. This reveals latency degradation under load.
k6 ramping-arrival-rate Executor
The ramping-arrival-rate executor gradually increases or decreases RPS through defined stages—combining the best of ramping patterns with fixed throughput guarantees.
```javascript
{
  executor: 'ramping-arrival-rate',
  startRate: 10,        // Start at 10 RPS
  timeUnit: '1s',
  preAllocatedVUs: 50,
  maxVUs: 200,
  stages: [
    { duration: '2m', target: 50 },   // Ramp to 50 RPS
    { duration: '3m', target: 100 },  // Ramp to 100 RPS
    { duration: '1m', target: 100 },  // Hold at 100 RPS
    { duration: '1m', target: 0 },    // Ramp down
  ],
}
```
| Setting | Description |
|---|---|
| `startRate` | Initial iterations per `timeUnit` |
| `stages` | Array of stages with duration and target RPS |
| `preAllocatedVUs` | Initial VU pool (size it from expected RPS ÷ iterations per VU per second) |
| `maxVUs` | Maximum VUs for scaling (set 2-3x `preAllocatedVUs`) |
When to use: Black Friday traffic simulation, realistic ramp-up to peak load, gradual stress testing.
Behavior: k6 dynamically adjusts VUs to maintain the target RPS at each stage. Perfect for finding the breaking point.
Complete Example:
```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  scenarios: {
    ramp_to_peak: {
      executor: 'ramping-arrival-rate',
      startRate: 10,
      timeUnit: '1s',
      preAllocatedVUs: 50,
      maxVUs: 300,
      stages: [
        { duration: '1m', target: 50 },   // Warm up to 50 RPS
        { duration: '2m', target: 100 },  // Scale to 100 RPS
        { duration: '3m', target: 200 },  // Push to 200 RPS
        { duration: '2m', target: 200 },  // Hold peak
        { duration: '1m', target: 0 },    // Ramp down
      ],
    },
  },
  thresholds: {
    http_req_duration: ['p(95)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, { 'status 200': (r) => r.status === 200 });
}
```
k6 per-vu-iterations Executor (Spike Testing)
The per-vu-iterations executor runs each VU for a fixed number of iterations—perfect for instant traffic bursts.
```javascript
{
  executor: 'per-vu-iterations',
  vus: 100,            // 100 users
  iterations: 1,       // Each runs once = instant burst
  gracefulStop: '30s',
}
```
When to use: Login storms, push notification spikes, flash sale simulations.
ramping-vus vs ramping-arrival-rate: Which Should You Use?
| Executor | Controls | When Latency Increases… | Best For |
|---|---|---|---|
| ramping-vus | Number of concurrent users | RPS decreases naturally | Simulating realistic user behavior |
| constant-arrival-rate | Fixed RPS | k6 adds more VUs | SLA validation at specific throughput |
| ramping-arrival-rate | Gradually changing RPS | k6 adjusts VUs dynamically | Realistic ramp to peak traffic |
Quick Decision Guide:
- “How does my app handle growing users?” → Use `ramping-vus`
- “Can we sustain exactly 500 RPS?” → Use `constant-arrival-rate`
- “Simulate Black Friday traffic ramp” → Use `ramping-arrival-rate`
- “What happens in a login storm?” → Use `per-vu-iterations`
Pro Tip: Start with one scenario, validate it works, then add more. Don’t try to build the perfect test on day one.
Data-Driven & Parameterized Tests
data.js:
```javascript
import http from 'k6/http';
import { check } from 'k6';
import { SharedArray } from 'k6/data';

// SharedArray loads the file once in the init context and shares it across all VUs
const users = new SharedArray('users', () => JSON.parse(open('./users.json')));

export const options = { vus: 20, duration: '45s' };

export default function () {
  const user = users[__ITER % users.length];
  const res = http.get(`https://api.example.com/users/${user.id}`);
  check(res, { 'user fetched': (r) => r.status === 200 });
}
```
users.json example:
```json
[
  { "id": 101 },
  { "id": 102 },
  { "id": 103 }
]
```
Use environment variables for secrets:
```bash
API_BASE=https://api.example.com \
TOKEN=$(op read op://secrets/api_token) \
k6 run data.js
```
In script:
```javascript
const BASE = __ENV.API_BASE;
```
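To make that concrete, here is a minimal sketch of how the variables from the command above could feed into requests; the `/v1/products` path is only illustrative:

```javascript
import http from 'k6/http';

const BASE = __ENV.API_BASE;
const TOKEN = __ENV.TOKEN;

export default function () {
  // Pass the secret via a header instead of hard-coding it in the script
  http.get(`${BASE}/v1/products`, {
    headers: { Authorization: `Bearer ${TOKEN}` },
  });
}
```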
Checks vs Thresholds vs Assertions
| Concept | Scope | Purpose |
|---|---|---|
| check() | Per response | Functional validation; contributes to checks metric |
| Threshold | Aggregated metric | Enforce SLOs; fails run on breach |
| Custom logic (throw) | Immediate | Hard stop for critical scenarios |
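The difference is easiest to see side by side. A sketch, assuming the same health endpoint as before; `fail()` comes from the core `k6` module and simply throws, which marks the iteration as failed and stops it:

```javascript
import http from 'k6/http';
import { check, fail } from 'k6';

export default function () {
  const res = http.get('https://api.example.com/health');

  // check(): records pass/fail in the checks metric, the iteration continues
  check(res, { 'status is 200': (r) => r.status === 200 });

  // Hard stop: abort this iteration when something critical is broken
  if (res.status >= 500) {
    fail(`server error ${res.status}: aborting iteration`);
  }
}
```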
Custom & Trend Metrics
```javascript
import http from 'k6/http';
import { sleep } from 'k6';
import { Trend, Counter } from 'k6/metrics';

const queueDelay = new Trend('queue_delay_ms');
const authFailures = new Counter('auth_failures');

export default function () {
  const start = Date.now();
  // simulate internal queue wait
  sleep(Math.random() * 0.05);
  queueDelay.add(Date.now() - start);

  const res = http.get('https://api.example.com/auth/ping');
  if (res.status !== 200) authFailures.add(1);
}
```
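Custom metrics can also drive thresholds, so business-level SLOs gate the run just like built-in ones. A short sketch reusing the metrics defined above (the limits are illustrative):

```javascript
export const options = {
  thresholds: {
    // Trend metrics support percentile expressions
    queue_delay_ms: ['p(95)<100'],
    // Counter metrics support count/rate expressions
    auth_failures: ['count<10'],
  },
};
```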
Visualize custom metrics in Grafana (Prometheus / Influx pipeline described next).
Observability: Streaming to Grafana
Option A: Prometheus Remote Write
Run k6 with output:
```bash
k6 run --out experimental-prometheus-rw --tag test=baseline \
  --address 0.0.0.0:6565 \
  -e API_BASE=https://api.example.com \
  scenarios.js
```
Point to your Prometheus remote-write endpoint via env vars (K6_PROMETHEUS_RW_SERVER_URL).
Option B: InfluxDB + Grafana
```bash
docker run -d --name influx -p 8086:8086 influxdb:2

# InfluxDB v2 output requires a k6 binary built with the xk6-output-influxdb extension;
# the built-in "influxdb" output only speaks the InfluxDB v1 API.
export K6_INFLUXDB_ORGANIZATION=myorg
export K6_INFLUXDB_BUCKET=perf
export K6_INFLUXDB_TOKEN=secret
k6 run --out xk6-influxdb=http://localhost:8086 scenarios.js
```
Import a k6 Grafana dashboard (ID: 2587 or community variants) and correlate with application metrics.
Correlation Workflow
- Run the k6 scenario tagged with `scenario` and `type`.
- Use Grafana to filter: `http_req_duration{scenario="sustained_api"}`.
- Overlay with service latency & DB CPU.
- Pin p95 regression panels.
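To make that filtering possible, tags can also be set globally in the script. A minimal sketch; `GIT_SHA` and the service name are placeholder values you would supply yourself:

```javascript
export const options = {
  // User-defined tags are attached to every metric sample emitted by the run,
  // which makes it easy to slice dashboards by version or service
  tags: {
    version: __ENV.GIT_SHA || 'dev',
    service: 'personalization-api',
  },
};
```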
GitHub Actions CI Integration
.github/workflows/perf.yml:
```yaml
name: performance
on:
  pull_request:
    paths: ['api/**']
  workflow_dispatch: {}
jobs:
  k6:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install k6
        uses: grafana/setup-k6-action@v1
      - name: Run smoke performance test
        run: k6 run scripts/smoke.js
      - name: Run gated scenario test
        run: |
          k6 run scenarios.js || { echo "Performance regression detected"; exit 1; }
```
Failing thresholds block the merge: shift-left testing in practice.
Scaling Beyond One Machine
| Need | Approach |
|---|---|
| Higher concurrency | k6 Cloud (managed scaling) |
| Kubernetes-native | k6 Operator (CRDs define tests) |
| Protocol diversity | xk6 extensions (Kafka, Redis, Browser) |
| Browser + API mix | k6 browser module (drives Chromium) |
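For the browser case, a minimal sketch assuming a recent k6 release that ships the `k6/browser` module (older releases expose it as `k6/experimental/browser`); the URL is a placeholder:

```javascript
import { browser } from 'k6/browser';

export const options = {
  scenarios: {
    ui: {
      executor: 'shared-iterations',
      // The scenario must opt in to the browser runtime
      options: { browser: { type: 'chromium' } },
    },
  },
};

export default async function () {
  const page = await browser.newPage();
  try {
    await page.goto('https://example.com/');
  } finally {
    await page.close();
  }
}
```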
k6 Operator Example (CRD excerpt):
```yaml
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: api-load
spec:
  parallelism: 4
  script:
    configMap:
      name: k6-script
      file: scenarios.js
```
Best Practices Checklist
- Start with small smoke tests on every PR.
- Define SLO-aligned thresholds (p95, error rate) early.
- Model realistic traffic mix (browsing vs API vs auth spike).
- Keep scripts versioned alongside code; treat as test artifacts.
- Tag everything: scenario, endpoint, version, commit SHA.
- Stream to Grafana; correlate with infra + APM traces.
- Use arrival-rate executors for RPS targets (not just VUs).
- Add custom business metrics (e.g., `orders_per_second`).
- Run capacity tests before major launches.
- Automate regression gates in CI.
- Periodically refresh test data to avoid caching distortion.
Resources
- Official k6 Documentation
- k6 Examples Repository
- xk6 Extensions Hub
- k6 Kubernetes Operator
- k6 Browser Module Guide
- Prometheus Remote Write Output Guide
- Grafana k6 Blog Articles
- Performance Test Thresholds Best Practices
Final Thought: Performance isn’t a phase, it’s a habit. k6 lets you encode that habit as code, observable data, and enforceable standards.
If you found this helpful, share it or adapt the snippets to your stack. Have a twist on these patterns? Drop a comment.