Description
Since CodeceptJS 3.7.5, when a Background hook fails in a Gherkin feature, all tests in the same Feature file receive event.test.failed events instead of event.test.skipped. This is a breaking change from version 3.6.7 that significantly complicates result reporting and event handling.
Core Issue: When Background fails, CodeceptJS triggers event.test.failed for:
- Tests that failed on Background ✓ (expected)
- Subsequent tests marked as failed without ability to trigger event.test.skipped or event.test.after properly
- Tests in the same suite file that were never meant to run (not selected by --grep)
This forces custom reporting systems to manually filter and track which tests actually executed before sending results. Previously, simple event handlers worked fine - now complex workarounds are required.
Expected Behavior (CodeceptJS 3.6.7)
When a Background fails:
- First test: receives event.test.failed
- Subsequent tests: receive event.test.skipped
// Clean event handling in 3.6.7
event.dispatcher.on(event.test.skipped, (test) => {
  // This worked for all tests after first Background failure ✅
  sendSkippedReport(test)
})
Actual Behavior (CodeceptJS 3.7.5+)
When a Background fails:
- All tests: receive event.test.failed
- event.test.skipped never fires
// Complex workaround needed in 3.7.5+
let firstBackgroundFailure = true
event.dispatcher.on(event.test.failed, (test, error) => {
  const isBackgroundFailure = test.ctx?.test?.originalTitle?.startsWith('"before each" hook:')
  // Manually handle what should be skipped tests
  if (isBackgroundFailure && !firstBackgroundFailure) {
    // Have to manually convert failed → skipped ❌
    sendSkippedReport(test)
    return
  }
  if (isBackgroundFailure) {
    firstBackgroundFailure = false
  }
})
Problems with Current Behavior
1. Semantic Incorrectness
Tests that never started executing (only Background failed) are marked as failed. Semantically, a test that didn't run should be skipped, not failed.
2. Breaking Change Without Clear Benefits
The change from 3.6.7 to 3.7.5 broke existing event handlers without providing clear advantages. All code relying on event.test.skipped for Background failures stopped working.
3. Event Generation Inconsistency
For regular Scenarios:
When Background fails on first test, subsequent tests don't trigger event.test.after, making it impossible to send reports for them:
Feature: Test
  Background:
    Given failing step
  Scenario: Test 1 # Background fails here
    Then something
  Scenario: Test 2 # event.test.after doesn't process this properly
    Then something
For Scenario Outline (running by tests, not by suites):
When using --grep to run a single test, CodeceptJS generates event.test.failed for all examples in the suite, even those from different Scenario Outlines that weren't selected:
Scenario Outline: Test
  # Background fails here
  Examples:
    | case |
    | A | # Not selected by --grep, different Scenario Outline
    | B | # Selected by --grep ← only this should run
    | C | # Not selected by --grep, different Scenario Outline
Result: All three tests (A, B, C) receive event.test.failed, even though only B was selected to run.
Key issue: event.test.before fires only for filtered tests, but event.test.failed and event.test.after fire for ALL examples, making it impossible to distinguish which tests actually ran.
4. Complex Workarounds Required
To maintain reasonable behavior, we now need to:
- Track which tests actually started (event.test.before)
- Track first vs subsequent Background failures
- Manually convert failed → skipped in the event.test.failed handler
- Filter out events for tests that never ran
- Initialize tracking in event.test.failed for tests where Background failed before event.test.before could fire
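The bookkeeping in the list above can be sketched roughly as follows. This is a minimal illustration, not our production code: the names createBackgroundFailureClassifier, markStarted and classify are hypothetical, tests are keyed by title for simplicity, and it assumes event.test.before fires for every selected test before its event.test.failed event (if it does not in your setup, the "started" check needs adjusting).

```javascript
// Sketch only: createBackgroundFailureClassifier, markStarted and classify
// are hypothetical names, not part of CodeceptJS.
function createBackgroundFailureClassifier() {
  const started = new Set() // keys of tests that fired event.test.before
  const failedSuites = new Set() // suites whose Background has already failed once

  // Assumption: the title is unique enough to identify a test within one run.
  const testKey = (test) => test.title
  const suiteKey = (test) => test.parent?.title ?? ''

  // Same detection the workaround shown earlier relies on.
  const isBackgroundFailure = (test) =>
    Boolean(test.ctx?.test?.originalTitle?.startsWith('"before each" hook:'))

  return {
    // Call from the event.test.before handler.
    markStarted(test) {
      started.add(testKey(test))
    },
    // Call from the event.test.failed handler.
    // Returns 'failed', 'skipped' or 'ignored'.
    classify(test) {
      if (!started.has(testKey(test))) return 'ignored' // never selected to run
      if (!isBackgroundFailure(test)) return 'failed' // genuine scenario failure
      const suite = suiteKey(test)
      if (failedSuites.has(suite)) return 'skipped' // Background already failed in this suite
      failedSuites.add(suite)
      return 'failed' // first Background failure in this suite
    },
  }
}
```

In a plugin, markStarted would be registered on event.test.before and classify would be called at the top of the event.test.failed handler to decide which report (if any) to send.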
5. Report Analysis Difficulties
- Before: "1 failed, 5 skipped" = clear that Background failed
- After: "6 failed" = looks like 6 different failures, unclear what happened
6. Multiple Events for Same Test
When Background retry mechanism triggers (e.g., with retryFailedStep plugin), the same test receives event.test.failed multiple times, leading to duplicate reports if not handled carefully.
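One way to guard against the duplicate reports is to wrap the reporting callback so it runs at most once per test. The sketch below is only an illustration: createOncePerTest is a hypothetical helper, and keying by test.title is an assumption that holds only if titles are unique within a run.

```javascript
// Sketch only: createOncePerTest is a hypothetical helper, not a CodeceptJS API.
function createOncePerTest(send) {
  const reported = new Set() // keys of tests that were already reported

  // Returns true if the report was sent, false if it was a duplicate.
  return (test, error) => {
    const key = test.title // assumption: titles are unique within a run
    if (reported.has(key)) return false
    reported.add(key)
    send(test, error)
    return true
  }
}
```

The returned function can be passed directly as the event.test.failed handler, so repeated events caused by retries reach the reporting backend only once.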
Steps to Reproduce
Case 1: Regular Scenarios
- Create a feature file with Background and multiple Scenarios
- Make Background fail
- Observe that second test doesn't properly trigger reporting in event.test.after
Feature: Test
  Background:
    Given failing step # ← Fails here
  Scenario: Test 1
    Then something
  Scenario: Test 2 # This won't be properly reported
    Then something
Case 2: Scenario Outline with --grep
- Create a Scenario Outline with multiple examples
- Run single test with --grep @TAG-B
- Observe that ALL examples receive event.test.failed, not just the filtered one
Feature: Test
  Background:
    Given failing step # ← Fails here
  Scenario Outline: Test
    Then something
    Examples:
      | case | tag |
      | A | @TAG-A |
      | B | @TAG-B | # Only this is selected
      | C | @TAG-C |
Expected: Only B receives events
Actual: A, B, and C all receive event.test.failed
Environment
- CodeceptJS version: 3.7.5+
- Node version: 22.4.0
- Helpers: WebDriver, Gherkin
Proposed Solution
Option 1 (Preferred): Revert to 3.6.7 behavior
- First test with Background failure → event.test.failed
- Subsequent tests → event.test.skipped
Option 2: Add new event type
- event.test.failedInBackground or similar
- Allows distinguishing Background failures from test failures
- Maintains backward compatibility
Option 3: Configuration option
// codecept.conf.js
gherkin: {
  backgroundFailureBehavior: 'skip' // or 'fail'
}
Impact
This breaking change affects:
- Custom reporting plugins
- CI/CD result analysis
- Test retry logic
- Statistics collection
- Any code relying on event.test.skipped
References
Comment in our codebase acknowledging this breaking change:
// CodeceptJS 3.6.7 → 3.7.5: tests failing in Background are now 'failed' instead of 'skipped'
// Handle Background failures as skipped to maintain compatibility
Would it be possible to either revert this change or provide a configuration option to restore the 3.6.7 behavior? The current implementation significantly complicates result handling and doesn't align with the semantic meaning of "skipped" vs "failed".