This tool is the operational companion to the IFZA Panama Strategic Growth Plan. It turns the A/B testing strategy from the deck into a living workflow your team uses weekly.
The dashboard shows all tests at a glance. We've pre-loaded 10 tests from the strategy deck, covering all four layers (Channel, Audience, Message, Creative). Use the status dropdown on each row to move tests from Planned → Active → Complete as your team progresses through them.
Use the filter buttons to focus on active tests during weekly check-ins, or completed tests during monthly reviews.
Go to Test Planner and click + New Test. For every test, define:
• Hypothesis — What do you expect to happen and why? This prevents post-hoc rationalization.
• Variants A & B — The control and challenger. Be specific: "Google Search high-intent keywords" not just "Google."
• Success Metric — One primary metric. Not three. If you can't pick one, the test isn't focused enough.
• Min. Sample Size — How many impressions/clicks/leads before you can read the result. Low-volume B2B needs at least 300–500 per variant for upstream metrics (CTR, CPL).
• Impact & Effort — Scores from 1–10. These feed the Priority Matrix automatically.
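Taken together, the fields above amount to a simple record per test. A minimal sketch (the field names here are illustrative, not the tool's internal schema):

```python
# Illustrative test record; field names and values are assumptions,
# not the tool's actual data model.
test = {
    "name": "Search keywords: high-intent vs. broad",
    "layer": "Channel",              # Channel / Audience / Message / Creative
    "hypothesis": "High-intent keywords will lower CPL at similar volume.",
    "variant_a": "Google Search high-intent keywords",  # control
    "variant_b": "Google Search broad keywords",        # challenger
    "success_metric": "CPL",         # exactly one primary metric
    "min_sample_size": 400,          # per variant, before reading the result
    "impact": 8,                     # 1-10, feeds the Priority Matrix
    "effort": 3,                     # 1-10, feeds the Priority Matrix
    "status": "Planned",             # Planned -> Active -> Complete
}
```

Writing the hypothesis and single metric down in this form, before launch, is what makes the later Decision Log entry honest.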
When a test reaches its sample size, go to Results Tracker and click + Log Result. Record the actual performance of each variant, the measured lift, and whether it reached statistical significance.
Not sure if your result is significant? Use the Significance Calculator tab. Enter your visitors and conversions for both variants, and it will run a chi-squared test and tell you whether the difference is real or noise.
Enter the number of visitors and conversions for both variants. The calculator computes conversion rates, relative lift, the chi-squared statistic, and a p-value. The verdict tells you if the result is actionable:
• Significant (p < 0.05) — Safe to act. Scale the winner.
• Directional (p < 0.10) — Promising but not conclusive. Collect more data before committing budget.
• Not significant (p ≥ 0.10) — Cannot distinguish from noise. Keep running or redesign the test.
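The calculator's internals aren't shown in the tool, but the underlying computation is a standard 2x2 chi-squared test with one degree of freedom, and the verdict thresholds above can be reproduced in a few lines (a sketch; the function names are ours):

```python
import math

def chi_squared_p(visitors_a, conv_a, visitors_b, conv_b):
    """Chi-squared test on a 2x2 table of converted / not-converted counts."""
    a, b = conv_a, visitors_a - conv_a   # variant A: converted, not converted
    c, d = conv_b, visitors_b - conv_b   # variant B: converted, not converted
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with 1 d.f.: P(X > x) = erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

def verdict(p):
    if p < 0.05:
        return "Significant"
    if p < 0.10:
        return "Directional"
    return "Not significant"

# Example: 5.0% vs 8.0% conversion on 1,000 visitors each
chi2, p = chi_squared_p(1000, 50, 1000, 80)
print(verdict(p))  # -> Significant (p is well under 0.05)
```

Note that at B2B volumes, even a lift that looks large in raw percentages can land in the Directional band; that is exactly the case where collecting more data beats reallocating budget early.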
The Priority Matrix automatically sorts all your tests into four quadrants based on the Impact and Effort scores you assigned. Use this to decide what to run next: Quick Wins (high impact, low effort) go first. Big Bets (high impact, high effort) need careful planning. Deprioritize everything else.
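The quadrant logic is simple enough to state exactly. A sketch, assuming the midpoint of the 1-10 scale as the cutoff (the tool's actual threshold may differ):

```python
def quadrant(impact, effort, cutoff=5):
    """Assign a test to a Priority Matrix quadrant.

    The cutoff of 5 on the 1-10 scale is an assumption for illustration.
    """
    if impact > cutoff and effort <= cutoff:
        return "Quick Win"       # high impact, low effort: run first
    if impact > cutoff:
        return "Big Bet"         # high impact, high effort: plan carefully
    return "Deprioritize"        # low impact: run only if capacity allows
```

For example, a test scored impact 8 / effort 3 lands in Quick Wins, while impact 9 / effort 8 is a Big Bet.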
After reviewing results, go to Decision Log and click + Log Decision. For every completed test, record:
• Outcome — Scaling (winner found), Learned (insight gained), Killed (clear loser), or Inconclusive (need more data).
• What was learned — The actual insight, not just "variant B won." Why did it win? What does it tell you about the audience?
• Action taken — What changed as a result? Budget reallocation? New creative? New audience segment?
This log becomes the team's institutional memory. Six months from now, it tells you exactly why the current strategy looks the way it does.
This tool runs entirely in your browser. Data is held in memory only and is lost when the tab closes. To save your work:
• Click Export JSON on the Dashboard to download a snapshot of all tests, results, and decisions.
• Click Import JSON to restore a previous snapshot when you reopen the tool.
We recommend exporting after every weekly review session so the team always has a backup.
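The backup routine amounts to serializing one snapshot object and reading it back later. A sketch of the round trip (the exact schema of the exported file isn't documented here; the three-collection shape and the filename are assumptions):

```python
import json

# Hypothetical snapshot shape: one object holding the three collections.
# The tool's actual export schema may differ.
snapshot = {"tests": [], "results": [], "decisions": []}

# Export: write to a dated file so weekly backups don't overwrite each other.
with open("ab-tests-backup.json", "w") as f:
    json.dump(snapshot, f, indent=2)

# Import: load a previous snapshot back.
with open("ab-tests-backup.json") as f:
    restored = json.load(f)

assert restored == snapshot  # lossless round trip
```

Keeping the dated exports in a shared folder gives the team a version history for free.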
| When | Action | Tab |
|---|---|---|
| Monday | Review active tests, update statuses | Dashboard |
| Wednesday | Check if any tests hit sample size, log results | Results Tracker |
| Friday | Log decisions from completed tests, plan next tests | Decision Log → Planner |
| Monthly | Review priority matrix, export data, rebalance budget | Priority Matrix → Export |
| Test | Layer | Status | Timeline | Primary Metric | Priority |
|---|---|---|---|---|---|
| Test Name | Hypothesis | Variant A | Variant B | Success Metric | Min. Sample | Layer | Status |
|---|---|---|---|---|---|---|---|
| Test | Variant A Result | Variant B Result | Lift | Significant? | Sample Size | Date |
|---|---|---|---|---|---|---|
Enter your test data below. The calculator uses a chi-squared test to determine whether the difference between variants is statistically significant at the 95% confidence level.
Significance threshold: p < 0.05 (95% confidence). For low-volume B2B tests, consider using p < 0.10 as a directional signal.