Prototype on available compute
Early work may run on a developer workstation, notebook environment, rented GPU instance, or managed model API. The goal is learning, not perfect architecture.
- Measure memory needs and response behavior.
- Identify whether retrieval, tuning, or batch jobs are required.