Methodology Version: v1.0
Effective Date: March 2026
Maintained by: First Tier Review
This document defines the standardized evaluation framework used for all First Tier Review assessments of AI and business systems.
First Tier Review operates as a controlled AI evaluation lab.
Each assessment isolates a defined capability domain and evaluates system behavior under structured prompt constraints.
This is not a ranking site.
This is not an affiliate review platform.
This is not technology commentary.
FTR evaluates structured execution capability.
Testing Protocol Summary
All First Tier Review evaluations are conducted under controlled prompt conditions using standardized directives and predefined structural assessment criteria.
Each evaluation isolates a primary capability domain defined in the First Tier Review Capability Domain Taxonomy (v1.0).
Testing environments are designed to ensure repeatability, methodological transparency, and structural consistency across evaluations.
Assessment results document observed system behavior under controlled conditions and do not represent rankings, endorsements, or product comparisons.
1. Evaluation Philosophy
AI systems are increasingly integrated into operational environments: planning workflows, designing systems, and supporting strategic decisions.
In these settings, surface output quality alone is not sufficient evidence of operational readiness.
First Tier Review evaluates:
- Structural clarity
- Constraint discipline
- Governance awareness
- Operational readiness
- Implementation realism
The objective is simple:
Reduce execution risk before AI is integrated into live business systems.
2. Controlled Test Structure
Each assessment includes:
- Defined Capability Domain
- Verbatim Standardized Prompt Directive
- Documented Test Environment
- Full Artifact Review
- Institutional Assessment
- Performance Classification
- Assessment Status Lock
Unless otherwise stated, tests are single-session evaluations.
Outputs are reviewed as produced under controlled conditions.
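For illustration, the components above can be captured as a single assessment record. The sketch below is a hypothetical schema, not part of the formal methodology; all field names and types are assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Assessment:
    """Hypothetical record mirroring the components of an FTR assessment.

    frozen=True stands in for the Assessment Status Lock: a record is
    immutable once created.
    """
    capability_domain: str         # from the Capability Domain Taxonomy (v1.0)
    prompt_directive: str          # standardized prompt, reproduced verbatim
    test_environment: str          # documented environment description
    institutional_assessment: str  # narrative assessment text
    classification: str            # "Strong" | "Adequate" | "Limited" | "Insufficient"
    assessment_date: date          # behavior reflects the model as of this date
    artifacts: tuple[str, ...] = ()  # references into the Test Registry
```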
3. Capability Domains
FTR isolates execution-related capability domains, including:
- Structured Planning
- Operational Systems Design
- Strategic Reasoning
- Constraint Integrity
- Governance Logic
- Implementation Readiness
Each domain is tested independently.
Capabilities are not blended across assessments.
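As a sketch only, the taxonomy could be encoded as a closed enumeration so that each assessment carries exactly one domain; the identifiers below are assumptions derived from the list above.

```python
from enum import Enum

class CapabilityDomain(Enum):
    """Hypothetical encoding of the Capability Domain Taxonomy (v1.0)."""
    STRUCTURED_PLANNING = "Structured Planning"
    OPERATIONAL_SYSTEMS_DESIGN = "Operational Systems Design"
    STRATEGIC_REASONING = "Strategic Reasoning"
    CONSTRAINT_INTEGRITY = "Constraint Integrity"
    GOVERNANCE_LOGIC = "Governance Logic"
    IMPLEMENTATION_READINESS = "Implementation Readiness"
```

Typing the domain as a single enumeration value, rather than free text, is one way to make the no-blending rule mechanical.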
4. Performance Classification Scale
- Strong: Structured, coherent, and implementation-ready under defined constraints.
- Adequate: Functionally sound but requires material refinement before operational deployment.
- Limited: Structurally incomplete or dependent on significant human correction.
- Insufficient: Fails to meet defined execution criteria.
No star ratings.
No weighted scoring.
No consumer-style rankings.
Classification reflects structured capability under prompt constraints.
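For illustration only, the scale forms a closed categorical set. The sketch below is a hypothetical encoding; note the deliberate absence of numeric values, consistent with the rejection of weighted scoring.

```python
from enum import Enum

class Classification(Enum):
    """Hypothetical closed set of FTR performance classifications.

    Categorical labels only: no weights, no ordering arithmetic,
    no composite scores.
    """
    STRONG = "Strong"
    ADEQUATE = "Adequate"
    LIMITED = "Limited"
    INSUFFICIENT = "Insufficient"
```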
5. Test Environment
Unless otherwise noted:
- Tests are conducted in a single controlled session
- Prompts are reproduced verbatim
- No iterative refinement is applied
- Missing information is not supplemented post-response
- Model assumptions must be declared
Each test reflects model behavior as of the documented Assessment Date.
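As a hedged sketch, the default conditions above can be made explicit as a configuration record so that any deviation must be documented rather than implied; the structure and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TestEnvironment:
    """Hypothetical defaults for an FTR test session (Section 5)."""
    single_session: bool = True             # one controlled session unless noted
    prompt_verbatim: bool = True            # prompt reproduced exactly as issued
    iterative_refinement: bool = False      # no follow-up prompting
    post_response_supplement: bool = False  # missing information is not added later
    assumptions_declared: bool = True       # the model must state its assumptions

# Departures from these defaults would be noted per test.
DEFAULT_ENVIRONMENT = TestEnvironment()
```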
6. Scope Boundaries
First Tier Review evaluates AI systems within execution environments only.
The following areas are intentionally out of scope:
- Creative inspiration or artistic output
- Entertainment or novelty use cases
- General productivity tips
- Philosophical AI commentary
- Speculative futurism
- Consumer-style “best tool” lists
FTR evaluates AI systems as operational infrastructure.
7. Governance
All assessments are governed under:
First Tier Review Methodology v1.0
Structural revisions require a formal version update.
Terminology discipline is enforced across all assessments.
Methodology changes will be versioned and dated.
8. Independence Statement
First Tier Review does not:
- Accept paid placement for evaluations
- Adjust assessments based on sponsorship
- Modify classification due to affiliate relationships
All tests are published under documented methodology conditions.
9. Artifact Transparency
All evaluation artifacts are preserved in the First Tier Review Test Registry.
Artifacts may include:
- Prompt directives
- Model responses
- Evaluation notes
- Classification rationale
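To make registry contents concrete, the sketch below shows one hypothetical artifact entry. The methodology does not specify a storage format; the identifier scheme and field names here are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Artifact:
    """Hypothetical Test Registry entry; 'kind' mirrors the list above."""
    assessment_id: str  # hypothetical identifier format, e.g. "FTR-0001"
    kind: str           # "prompt_directive" | "model_response" |
                        # "evaluation_note" | "classification_rationale"
    content: str        # the preserved artifact text

# A placeholder example entry:
entry = Artifact(
    assessment_id="FTR-0001",
    kind="prompt_directive",
    content="<verbatim prompt text>",
)
```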