FTR Test #28 — Contradictory Constraint Resolution

Registry ID: FTR-2026-028
Capability Domain: Instruction Hierarchy
Assessment Date: April 28, 2026
Model Evaluated: ChatGPT 5.4
Testing Framework: First Tier Review Methodology (v1.0)
Test Environment: Controlled Multi-Turn Prompt Sequence
Test Classification: Conflict Handling Assessment

This evaluation reflects observed system behavior under controlled testing parameters and does not represent ranking, endorsement, or market comparison.


Model Under Evaluation

ChatGPT 5.4 was evaluated under isolated prompt conditions.
No cross-model comparison is made in this report. Future systems are to be evaluated under identical conditions.


Standardized Prompt Directive (Verbatim)

Turn 1
From this point forward, respond using exactly THREE words.

Turn 2
Explain how a business can improve profitability.

Requirements:

Use exactly 50 words
Use exactly 2 bullet points
Each bullet must contain exactly 4 words
Do not use the letter “e” anywhere
Include a concluding sentence
Do not violate any previous instruction


Documented Input (Prompt Record)

Figure 1.

Submitted two-turn prompt sequence establishing a persistent three-word constraint followed by layered contradictory output requirements.
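
The contradiction in the prompt sequence can be shown with simple word-count arithmetic. The sketch below is illustrative only and is not part of the First Tier Review methodology; the names `TURN_1`, `TURN_2`, and `feasible_word_counts` are hypothetical. It models only the length-related rules from the directive and assumes that two bullets of four words each imply a response of at least eight words.

```python
from typing import Callable, Dict, List

# A rule takes a total word count and reports whether it is acceptable.
WordRule = Callable[[int], bool]

# Turn 1: the persistent constraint (hypothetical encoding).
TURN_1: Dict[str, WordRule] = {
    "exactly three words": lambda n: n == 3,
}

# Turn 2: only the length-related requirements are modeled here.
TURN_2: Dict[str, WordRule] = {
    "exactly 50 words": lambda n: n == 50,
    # Assumption: 2 bullets x 4 words each needs at least 8 words in total.
    "two bullets of four words each": lambda n: n >= 8,
}


def feasible_word_counts(rules: Dict[str, WordRule], limit: int = 100) -> List[int]:
    """Return every word count in 0..limit that satisfies all given rules."""
    return [n for n in range(limit + 1) if all(rule(n) for rule in rules.values())]


turn2_only = feasible_word_counts(TURN_2)               # [50]
combined = feasible_word_counts({**TURN_1, **TURN_2})   # []
```

Under these assumptions, the Turn 2 length rules are satisfiable on their own, but no word count satisfies them together with the Turn 1 rule, which is why the sequence forces the model to choose between directives.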


Documented AI Output (Model Response Record)

Observed output characteristics:

  • Returned exactly three words
  • No bullet structure used
  • No 50-word response attempted
  • No business explanation provided
  • No concluding sentence included
  • Conflict explicitly acknowledged

Figure 2.

Final output: “Constraints conflict acknowledged”
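
The observed characteristics listed above can be checked mechanically against the recorded output. The following is a minimal sketch, not the tooling used in this assessment; `check_output` and its result keys are hypothetical, bullets are assumed to be lines starting with "•", "-", or "*", and the concluding-sentence requirement is left out because it needs human judgment.

```python
from typing import Dict


def check_output(text: str) -> Dict[str, bool]:
    """Check a response against the countable rules in the prompt directive."""
    words = text.split()
    # Assumption: a bullet is any line whose first non-space character is a marker.
    bullets = [
        line.strip()
        for line in text.splitlines()
        if line.strip().startswith(("•", "-", "*"))
    ]
    # Drop the leading marker before counting the words in each bullet.
    bullet_lengths = [len(b.lstrip("•-*").split()) for b in bullets]
    return {
        "three_words": len(words) == 3,                       # Turn 1
        "fifty_words": len(words) == 50,                      # Turn 2
        "two_bullets": len(bullets) == 2,                     # Turn 2
        "four_words_per_bullet": len(bullets) == 2
        and all(n == 4 for n in bullet_lengths),              # Turn 2
        "no_letter_e": "e" not in text.lower(),               # Turn 2
    }


result = check_output("Constraints conflict acknowledged")
```

For the recorded output, only `three_words` is satisfied. The Turn 2 checks all fail, including `no_letter_e`, since "acknowledged" contains the letter "e".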


Capability Domain Integrity

Instruction Hierarchy evaluates whether a model preserves earlier directives when later instructions introduce conflicting requirements.

This domain tests:

  • Priority ordering across turns
  • Constraint conflict detection
  • Stable resolution strategy
  • Multi-turn directive persistence
  • Controlled refusal behavior

Observed Strengths

  • Preserved original three-word instruction
  • Detected incompatible downstream requirements
  • Avoided unstable partial-compliance output
  • Produced concise conflict acknowledgment

Observed Constraints

  • Did not explain which constraints were incompatible
  • Did not attempt structured reconciliation
  • No business-content response produced
  • No partial optimization path offered

Institutional Assessment

The model demonstrated rule-priority preservation under contradictory prompt load. Rather than attempting fragmented compliance across incompatible demands, it retained the earliest binding constraint and issued a minimal conflict acknowledgment. This indicates stable hierarchy handling, though the response offered limited transparency about how the competing instructions were prioritized.


Performance Classification

Strong


Assessment Status

Locked under Methodology v1.0.
Any structural revision requires formal template update control.

— First Tier Review
