
Draft: Tool Portability & Alternatives Response

10xTeam May 22, 2026 7 min read

Raw response to “is this specific to Claude or any LLM?” — saved for editorial review before publishing.

What’s Claude-Specific

Claude Code features (non-portable)

  • Auto-memory location (/home/ahmed/.claude/projects/...): Claude Code’s feature.
  • CLAUDE.md auto-loading: built into Claude Code; other tools don’t read this file automatically, though some have their own equivalent.
  • Prompt cache with 5-min TTL: Anthropic’s specific implementation.
  • 200k context window: my specific limit (GPT-4 Turbo has 128k; it varies by model and tool).

If you switch tools, you lose these affordances.

Claude’s architecture (somewhat portable)

  • How I process requests and apply instructions.
  • My inference speed / token accounting.

Other models work differently; you’d adapt the system, not copy it.


What’s Universal (Works Anywhere)

The core problem and core solution are model-agnostic:

Problem: Every LLM has a context limit. No built-in memory. Each conversation starts fresh.

Solution: Store decisions outside the model. Load on-demand. Organize by tier (lean summaries, deep docs, session context).

This works with any LLM. The files are plain Markdown + JSON (open standards). The philosophy transfers.
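
The tiered, load-on-demand idea above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the tier-to-file mapping is an assumption, `session.md` is a hypothetical file name, and `load_context` is a helper invented for this example.

```python
from pathlib import Path

# Tier layout is an assumption for illustration; session.md is a hypothetical name.
TIERS = {
    1: ["memory.md"],                  # lean summary, cheap enough to load every time
    2: ["ROADMAP.md", "ANALYSIS.md"],  # deep docs, loaded only when needed
    3: ["session.md"],                 # notes for the current session
}

def load_context(project_dir, tiers=(1,)):
    """Concatenate the files for the requested tiers, skipping any that are missing."""
    root = Path(project_dir)
    sections = []
    for tier in sorted(tiers):
        for name in TIERS[tier]:
            path = root / name
            if path.is_file():
                body = path.read_text(encoding="utf-8").strip()
                sections.append(f"## {name} (tier {tier})\n{body}")
    return "\n\n".join(sections)
```

Calling `load_context("projects/pycademy")` would return only the lean summary; passing `tiers=(1, 2)` adds the deep docs when a task needs them. The returned string can be pasted or injected into whatever prompt the tool accepts.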


How Other Tools Could Implement This

Option 1: ChatGPT + Custom Instructions (free, limited)

Store memory in: Custom Instructions (static, global).

## My Projects
- PyAcademy: Learning framework. Status: Phase 0 (see GitHub repo).
- DarJS: SMB framework. Status: Phases 1-6 done (258 tests).

## Cross-project conventions
- Always write tests. Use Vitest for JS.
- Modular by default. No >500-line files.

Pros: Free, always loaded.
Cons: Can’t do per-project memory, a tight size limit (about 1,500 characters per field at the time of writing, so room for roughly 20 one-line project entries at most), no lazy-loading.
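
Whether a memory block fits can be checked by counting characters against the field's cap. The sketch below assumes a 1,500-character cap per field, which should be verified against the current ChatGPT settings; `projects_that_fit` and its parameters are hypothetical names used only for illustration.

```python
def projects_that_fit(project_lines, conventions, limit_chars=1500):
    """Count how many one-line project entries fit next to the conventions text."""
    used = len(conventions)
    count = 0
    for line in project_lines:
        cost = len(line) + 1  # +1 for the newline that separates entries
        if used + cost > limit_chars:
            break
        used += cost
        count += 1
    return count
```

With 500 characters of conventions and entries of about 100 characters each, this reports room for 10 projects, which is why longer-lived setups tend to move project memory out of Custom Instructions.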

Option 2: OpenAI Assistants API (paid, structured)

Store memory in: Attached files + vector retrieval.

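from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder IDs; real ones are returned when you upload each memory file.
memory_file_ids = ["file-pycademy-memory", "file-darjs-roadmap"]

assistant = client.beta.assistants.create(
    name="Ahmed's dev assistant",
    model="gpt-4o",  # assumption: the original draft did not name a model
    tools=[{"type": "code_interpreter"}, {"type": "file_search"}],
    # Assistants v2 shape: "retrieval" became "file_search", and files attach
    # through tool_resources rather than a top-level file_ids list.
    tool_resources={"file_search": {"vector_stores": [{"file_ids": memory_file_ids}]}},
)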

Pros: Built-in file handling, semantic search, per-project organization.
Cons: Slower (retrieval → embedding search → LLM), API costs, less fine-grained control.

Option 3: LangChain + Vector DB (powerful, complex)

Store memory in: Pinecone / Weaviate / Milvus (semantic vectors).

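# Assumes a LangChain release that still ships the legacy langchain.memory module.
from langchain.memory import VectorStoreRetrieverMemory

# `vectorstore` is assumed to be an already-configured LangChain vector store
# (Pinecone, Weaviate, or Milvus); its setup is omitted here.
memory = VectorStoreRetrieverMemory(retriever=vectorstore.as_retriever())

# Stores decisions, then queries them by semantic similarity
memory.save_context({"input": "Which JS test runner?"}, {"output": "Vitest."})
relevant = memory.load_memory_variables({"prompt": "What did we decide about testing?"})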

Pros: Semantic search (“what did we decide about testing?”), scales to unlimited memory.
Cons: Overkill for structured projects, infrastructure overhead, costs.

Option 4: Plain GitHub (free, durable)

Store memory in: Repo files + commit history.

projects/
├── pycademy/
│   ├── README.md (project status)
│   ├── ANALYSIS.md (issues)
│   ├── ROADMAP.md (phases)
│   └── memory.md (lean summary)
├── darjs/
│   ├── ...

Pros: Version control, searchable, durable, costs nothing.
Cons: Manual copy-paste into conversation, no auto-loading, LLM doesn’t know to check it.
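
The copy-paste step can be made less error-prone with a small script that builds the opening message from the repo. This is a sketch under assumptions: it expects the `projects/<name>/memory.md` layout shown above, and `opening_message` is a hypothetical helper, not part of any tool.

```python
from pathlib import Path

def opening_message(projects_root, project):
    """Build a paste-ready first message from a project's memory.md."""
    project_dir = Path(projects_root) / project
    summary = (project_dir / "memory.md").read_text(encoding="utf-8").strip()
    # List the other Markdown docs so the model knows what it can ask for.
    others = sorted(p.name for p in project_dir.glob("*.md") if p.name != "memory.md")
    lines = [f"I'm working on {project}. Current memory:", "", summary]
    if others:
        lines += ["", "Other docs I can paste on request: " + ", ".join(others)]
    return "\n".join(lines)
```

Naming the remaining docs in the first message partly offsets the last con: the model still cannot open the repo, but it can ask for ROADMAP.md or ANALYSIS.md by name.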

Option 5: Anthropic Files API (newer, designed for this)

Store memory in: Anthropic’s Files API.

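import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

with open("pycademy_memory.json", "rb") as f:
    # Declared as plain text so it can be referenced as a document block (assumption).
    uploaded = client.beta.files.upload(file=("pycademy_memory.json", f, "text/plain"))

message = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    betas=["files-api-2025-04-14"],  # the Files API is a beta feature
    system="You are a developer assistant.",
    messages=[{
        "role": "user",
        "content": [
            # Document blocks go in the user turn, not in the system prompt.
            {"type": "document", "source": {"type": "file", "file_id": uploaded.id}},
            {"type": "text", "text": "Resume PyAcademy work."},
        ],
    }],
)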

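Pros: The Files API is designed for this: upload a document once and reference it by ID instead of re-sending it with every request. The document’s contents still count toward input tokens whenever a message references it.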
Cons: Slightly newer, requires API usage (not web interface).

Option 6: Ollama / Local Llama (self-hosted)

Store memory in: Local SQLite + custom retrieval.

# You build your own system:
# - Save decisions to SQLite
# - Embed with local model (e.g., nomic-embed-text)
# - Retrieve top-K on every request
# - Inject into system prompt

Pros: Complete control, unlimited memory, no API costs.
Cons: Infrastructure work, embedding quality depends on model.
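
The four steps in the outline above can be sketched with the standard library alone. The `embed` function here is a toy bag-of-words stand-in, an assumption made so the example runs without a model; a real setup would replace it with a call to a local embedding model such as nomic-embed-text. The table schema and function names are hypothetical.

```python
import json
import math
import re
import sqlite3
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def save_decision(conn, text):
    """Store a decision together with its serialized vector."""
    conn.execute("INSERT INTO decisions (text, vector) VALUES (?, ?)",
                 (text, json.dumps(embed(text))))

def top_k(conn, query, k=2):
    """Return up to k stored decisions most similar to the query."""
    q = embed(query)
    rows = conn.execute("SELECT text, vector FROM decisions").fetchall()
    scored = [(cosine(q, Counter(json.loads(vec))), text) for text, vec in rows]
    return [text for score, text in sorted(scored, reverse=True)[:k] if score > 0]

def build_system_prompt(conn, query, k=2):
    """Inject the retrieved decisions into a system prompt."""
    memory = "\n".join(f"- {d}" for d in top_k(conn, query, k))
    return f"You are a developer assistant.\nRelevant past decisions:\n{memory}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decisions (id INTEGER PRIMARY KEY, text TEXT, vector TEXT)")
for decision in ["Always write tests; use Vitest for JS.",
                 "Modular by default; no files over 500 lines.",
                 "PyAcademy is in Phase 0."]:
    save_decision(conn, decision)
```

Calling `build_system_prompt(conn, "what did we decide about tests?")` yields a prompt that includes only the testing decision. Word overlap is a crude proxy for meaning, which is the "embedding quality depends on model" caveat in practice.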


Comparison Matrix

| Tool | Memory storage | Auto-load? | Lazy-load? | Cost | Portable? |
| --- | --- | --- | --- | --- | --- |
| Claude Code (current) | .claude/projects/ | | | Free | ❌ (Claude-specific) |
| ChatGPT Custom Instructions | Global text | | | Free | |
| OpenAI Assistants | Files API | Manual | | $$ | |
| LangChain + vector DB | Vector DB | Manual | | $$$ | |
| GitHub + manual | Git repo | | | Free | |
| Anthropic Files API | Files API | Manual | | $ | |
| Ollama + custom | SQLite | Custom | | None | |

My Recommendation for Other Tools

If you’re using ChatGPT:

  • Store project memory in GitHub (or a shared doc).
  • Start each conversation with “I’m working on X” + paste the latest memory.md.
  • Use Custom Instructions for cross-project conventions (naming, testing, language choices).

If you’re using OpenAI API programmatically:

  • Use Assistants API with attached files. Let it do retrieval.
  • Or use Anthropic’s Files API if you can switch providers (it is designed for reusing uploaded documents and may be cheaper, depending on usage).

If you’re using local Llama:

  • Build a custom retrieval system (SQLite + embeddings).
  • You have full control; make it perfect for your workflow.

If you’re using web tools (Claude.ai web, ChatGPT web):

  • Store memory in GitHub / Notion / external docs.
  • Manually paste relevant docs at the start of a conversation.
  • Use the tool’s “custom instructions” or “system prompt” for global rules.

What to Write if You Share This

When you share the blog post with other readers, add a disclaimer:

## Note on Tool Portability

This essay describes a system built specifically for **Claude Code** 
(which has auto-memory and file-system integration). 

**The philosophy** (three-tier memory, lazy-loading, lean summaries) 
works with any LLM. **The implementation** (`.claude/` directories, 
CLAUDE.md auto-loading) is Claude-specific.

If you use ChatGPT, Gemini, or local models:
- Store memory in GitHub / Notion / your file system.
- Manually load relevant docs at the start of each conversation.
- Use the tool's "system instructions" or "custom instructions" for global rules.
- See the [Tool Comparison](#comparison-matrix) below for alternatives.

The Portable Takeaway

The system isn’t “Claude’s context management.” It’s “how to manage persistent memory when working with any LLM that has a context limit.”

The specifics change (where files live, how they’re auto-loaded), but the pattern holds:

  1. Tier 1: Lean summaries (what are we building?).
  2. Tier 2: Deep docs (how and why?).
  3. Tier 3: Session context (what’s happening now?).

This works with Claude Code, ChatGPT + GitHub, OpenAI API + vector DB, local Llama, or a notebook + manual copy-paste.

The tool is a detail. The pattern is timeless.


Call to Action

If you want, I can write Part 2 of the series addressing this directly: “How to Adapt This System to Your LLM Tool.” Would that be useful?


Status: Draft ready for review. Move to published blog post once approved.
Date saved: 2026-04-24

