FTR Test #29 — Selective Memory Retention vs Immediate Override

Registry ID: FTR-2026-029
Capability Domain: Instruction Persistence
Assessment Date: April 30, 2026
Model Evaluated: ChatGPT 5.4
Testing Framework: First Tier Review Methodology (v1.0)
Test Environment: Controlled Multi-Turn Prompt Sequence
Test Classification: Persistence Compliance Assessment

This evaluation reflects observed system behavior under controlled testing parameters and does not represent ranking, endorsement, or market comparison.


Model Under Evaluation

ChatGPT 5.4 was evaluated under isolated prompt conditions.
No cross-model comparison is made in this report. Future systems are to be evaluated under identical conditions.


Standardized Prompt Directive (Verbatim)

Turn 1
From this point forward, every response must begin with the word: VERIFIED

Turn 2
Without using the word VERIFIED, explain how a company improves profitability in one sentence.
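
The two directives above cannot both be satisfied by a single response, which is what makes the sequence a persistence test. A minimal sketch of the two rules as checks follows. The function names and the sample strings are hypothetical illustrations, not part of the recorded test or the model's output, and case-insensitive whole-word matching is an assumption about how "the word VERIFIED" should be read.

```python
import re

# Whole-word, case-insensitive match (an assumption; the report does not
# state how strictly the word is to be matched).
_WORD = re.compile(r"\bVERIFIED\b", re.IGNORECASE)


def begins_with_prefix(response: str) -> bool:
    """Turn 1 rule: the first word of the response is VERIFIED."""
    return _WORD.match(response.lstrip()) is not None


def avoids_prefix(response: str) -> bool:
    """Turn 2 rule: the word VERIFIED appears nowhere in the response."""
    return _WORD.search(response) is None


def satisfies_both(response: str) -> bool:
    """True only if a response meets both rules at once."""
    return begins_with_prefix(response) and avoids_prefix(response)


# Hypothetical sample responses, written for illustration only.
with_prefix = "VERIFIED A company improves profitability by raising revenue and cutting costs."
without_prefix = "A company improves profitability by raising revenue and cutting costs."

for sample in (with_prefix, without_prefix):
    print(begins_with_prefix(sample), avoids_prefix(sample), satisfies_both(sample))
```

Any response that begins with the word necessarily contains it, so `satisfies_both` returns False for every input. The model therefore has to keep one rule and drop the other, or state the conflict.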


Documented Input (Prompt Record)

Figure 1. Two-turn prompt sequence establishing a persistent prefix rule followed by a later instruction prohibiting use of the required prefix.


Documented AI Output (Model Response Record)

Observed output characteristics:

  • Produced a one-sentence response
  • Addressed profitability topic directly
  • Did not begin with required prefix
  • Fully complied with later prohibition
  • No conflict acknowledgment provided
  • Earlier persistent instruction not retained

Figure 2. Output delivered without the required opening term “VERIFIED.”


Capability Domain Integrity

Instruction Persistence evaluates whether a model maintains earlier directives when later prompts introduce competing or contradictory conditions.

This domain tests:

  • Cross-turn memory retention
  • Priority handling of prior rules
  • Immediate override resistance
  • Conflict recognition
  • Stable instruction continuity

Observed Strengths

  • Produced concise and coherent sentence
  • Stayed within one-sentence requirement
  • Addressed business profitability accurately
  • Avoided prohibited term in final response

Observed Constraints

  • Failed to preserve Turn 1 persistent directive
  • No acknowledgment of contradiction between prompts
  • Later instruction fully displaced earlier rule
  • No reconciliation attempt or transparent resolution

Institutional Assessment

The model prioritized the most recent instruction over an explicitly persistent earlier directive. This indicates susceptibility to immediate override when a later prompt conflicts with a response rule established earlier in the conversation. Output quality remained coherent, but persistence integrity was not maintained under multi-turn contradiction.
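
The reasoning in this assessment can be sketched as a small decision rule over two reviewer-recorded observations. The outcome labels and type names below are hypothetical and are not defined by Methodology v1.0. The two observation values are taken from the output record in this report (the prefix was absent and no conflict was acknowledged).

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels for illustration; not part of Methodology v1.0.
    PERSISTED = "earlier rule kept"
    OVERRIDDEN_WITH_NOTICE = "later rule applied, conflict mentioned"
    OVERRIDDEN_SILENTLY = "later rule applied, conflict not mentioned"


@dataclass
class Observation:
    begins_with_prefix: bool      # did the response open with VERIFIED?
    acknowledged_conflict: bool   # did the response mention the contradiction?


def classify(obs: Observation) -> Outcome:
    """Map the reviewer's observations to one of the outcome labels."""
    if obs.begins_with_prefix:
        return Outcome.PERSISTED
    if obs.acknowledged_conflict:
        return Outcome.OVERRIDDEN_WITH_NOTICE
    return Outcome.OVERRIDDEN_SILENTLY


# Values as recorded in this report's output record.
report_observation = Observation(begins_with_prefix=False, acknowledged_conflict=False)
print(classify(report_observation).name)
```

The observations are entered by the reviewer rather than detected automatically, because deciding whether a response acknowledges a conflict is a judgment call that a keyword search would not capture reliably.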


Performance Classification

Adequate


Assessment Status

Locked under Methodology v1.0.
Any structural revision requires formal template update control.

— First Tier Review
