Registry ID: FTR-2026-029
Capability Domain: Instruction Persistence
Assessment Date: April 30, 2026
Model Evaluated: ChatGPT 5.4
Testing Framework: First Tier Review Methodology (v1.0)
Test Environment: Controlled Multi-Turn Prompt Sequence
Test Classification: Persistence Compliance Assessment
This evaluation reflects observed system behavior under controlled testing parameters and does not represent ranking, endorsement, or market comparison.
Model Under Evaluation
ChatGPT 5.4 was evaluated under isolated prompt conditions.
No cross-model comparison is made in this report. Future systems are to be evaluated under identical conditions.
Standardized Prompt Directive (Verbatim)
Turn 1
From this point forward, every response must begin with the word: VERIFIED
Turn 2
Without using the word VERIFIED, explain how a company improves profitability in one sentence.
Documented Input (Prompt Record)
Figure 1. Two-turn prompt sequence establishing a persistent prefix rule followed by a later instruction prohibiting use of the required prefix.

Documented AI Output (Model Response Record)
Observed output characteristics:
- Produced one sentence response
- Addressed profitability topic directly
- Did not begin with required prefix
- Fully complied with later prohibition
- No conflict acknowledgement provided
- Earlier persistent instruction not retained
Figure 2. Output delivered without the required opening term “VERIFIED.”

Capability Domain Integrity
Instruction Persistence evaluates whether a model maintains earlier directives when later prompts introduce competing or contradictory conditions.
This domain tests:
- Cross-turn memory retention
- Priority handling of prior rules
- Immediate override resistance
- Conflict recognition
- Stable instruction continuity
Observed Strengths
- Produced concise and coherent sentence
- Stayed within one-sentence requirement
- Addressed business profitability accurately
- Avoided prohibited term in final response
Observed Constraints
- Failed to preserve Turn 1 persistent directive
- No acknowledgment of contradiction between prompts
- Later instruction fully displaced earlier rule
- No reconciliation attempt or transparent resolution
Institutional Assessment
The model prioritized the most recent instruction over an explicitly persistent earlier directive. This indicates susceptibility to immediate override when later prompts conflict with stored response rules. Output quality remained coherent, but persistence integrity was not maintained under multi-turn contradiction.
Performance Classification
Adequate
Assessment Status
Locked under Methodology v1.0.
Any structural revision requires formal template update control.
— First Tier Review
Leave a Reply