GCP GPU instance baseline
A first cloud baseline focused on projects, IAM, and network rules.
Key takeaways
- GCP projects are real boundaries, not just labels.
- IAM and firewall rules are the two gates that actually control access.
- Keep the first run narrow and repeatable.
This run was about understanding GCP’s platform shape, not performance. If you want the RAG fundamentals, see Retrieval-Augmented Generation in plain terms.
Architecture map
GCP is organized around projects. Resources inherit IAM policy from the project (and from any folder or organization above it), while firewall rules belong to a VPC network inside the project.
What happened
The instance was live, but my app could not reach it. The firewall rule existed, but it targeted a network tag the instance did not carry, so the rule never applied to it. The result looked like a code bug, but it was a configuration mismatch.
The two gates
The first gate is IAM. If the account lacks the right role, the service still shows up as available, but calls against it fail with a permission error.
The second gate is firewall rules. A rule only applies when its target tags match the instance's network tags and it is defined on the VPC network the instance is actually attached to.
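Both gates can be checked from the command line. The following is a minimal sketch, not the exact commands from this run: the rule name allow-llm, the instance name gpu-baseline, the zone, and the PROJECT_ID and MEMBER variables are all hypothetical placeholders. The helper only compares tag lists, and it ignores rules that target service accounts instead of tags.

```bash
#!/usr/bin/env bash
# Gate 1 (IAM): list the roles a member holds on the project.
# PROJECT_ID and MEMBER are placeholders, e.g. MEMBER="user:me@example.com".
#   gcloud projects get-iam-policy "$PROJECT_ID" \
#     --flatten='bindings[].members' \
#     --filter="bindings.members:$MEMBER" \
#     --format='value(bindings.role)'

# Gate 2 (firewall): does the rule target a tag the instance carries?
# Both arguments are tag lists separated by ';' or ','.
# A rule with no target tags applies to every instance on its network.
rule_targets_instance() {
  local rule_tags="${1//,/;}" instance_tags="${2//,/;}" r i
  local -a rule_list=() instance_list=()
  [[ -z "$rule_tags" ]] && return 0
  IFS=';' read -r -a rule_list <<< "$rule_tags"
  IFS=';' read -r -a instance_list <<< "$instance_tags"
  for r in "${rule_list[@]}"; do
    for i in "${instance_list[@]}"; do
      [[ "$r" == "$i" ]] && return 0
    done
  done
  return 1
}

# Hypothetical usage (names are assumptions, not from the original run):
#   rule_tags=$(gcloud compute firewall-rules describe allow-llm \
#     --format='value(targetTags)')
#   inst_tags=$(gcloud compute instances describe gpu-baseline \
#     --zone us-central1-a --format='value(tags.items)')
#   rule_targets_instance "$rule_tags" "$inst_tags" || echo "tag mismatch"
```

If the helper reports a mismatch, the rule exists but never applies to the instance, which is the failure described above. It is also worth confirming that the rule's network is the one the instance is attached to.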
Console walkthrough
- Create a new GCP project for this experiment.
- Enable the Compute Engine API and any AI service APIs you need on that project.
- Create a GPU instance with a single, known image.
- Add a firewall rule for the port you will hit, targeting the network tag on the instance.
- Create the database and keep it in the same region.
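The same steps can be sketched with the gcloud CLI. This is an illustration under stated assumptions, not a record of what I ran: every name, the machine type, the GPU type, and the default network are placeholders, and the database step assumes Cloud SQL for PostgreSQL, which the walkthrough does not specify. A new project also needs a linked billing account and GPU quota before the instance step will succeed. The script prints the commands by default and only executes them when DRY_RUN=0.

```bash
#!/usr/bin/env bash
# All values are placeholders; override them through the environment.
PROJECT_ID="${PROJECT_ID:-my-gpu-baseline}"
REGION="${REGION:-us-central1}"
ZONE="${ZONE:-us-central1-a}"
INSTANCE="${INSTANCE:-gpu-baseline}"
TAG="${TAG:-llm-server}"
PORT="${PORT:-8000}"
IMAGE="${IMAGE:-<image-name>}"                 # pin one known image
IMAGE_PROJECT="${IMAGE_PROJECT:-<image-project>}"
SOURCE_RANGE="${SOURCE_RANGE:-<your-ip>/32}"
DB_INSTANCE="${DB_INSTANCE:-baseline-db}"      # assumes Cloud SQL

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [[ "${DRY_RUN:-1}" == "1" ]]; then
    echo "+ $*"
  else
    "$@"
  fi
}

provision() {
  run gcloud projects create "$PROJECT_ID" &&
  run gcloud config set project "$PROJECT_ID" &&
  run gcloud services enable compute.googleapis.com sqladmin.googleapis.com &&
  run gcloud compute instances create "$INSTANCE" \
    --zone="$ZONE" \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --maintenance-policy=TERMINATE \
    --image="$IMAGE" --image-project="$IMAGE_PROJECT" \
    --tags="$TAG" &&
  run gcloud compute firewall-rules create "allow-$TAG" \
    --network=default \
    --allow="tcp:$PORT" \
    --target-tags="$TAG" \
    --source-ranges="$SOURCE_RANGE" &&
  run gcloud sql instances create "$DB_INSTANCE" \
    --database-version=POSTGRES_15 \
    --region="$REGION" \
    --tier=db-f1-micro
}

# Run only when executed as a script, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  provision
fi
```

The instance's --tags value and the rule's --target-tags value come from the same variable, so the two cannot drift apart, and the database region is set explicitly rather than left to a default.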
First-time config
export LLM_HOST="http://<instance-ip>:<port>"
export DATABASE_URL="postgresql://<user>:<password>@<db-hostname>:5432/<db-name>"
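Before starting the app, it can help to confirm that both variables are set and that no placeholder was left unfilled. A minimal sketch, where require_env is a hypothetical helper and not part of any tool:

```bash
#!/usr/bin/env bash
# Fail if any named variable is unset, empty, or still holds a <placeholder>.
require_env() {
  local name value missing=0
  for name in "$@"; do
    value="${!name:-}"          # indirect expansion: value of the named variable
    if [[ -z "$value" ]]; then
      echo "missing required variable: $name" >&2
      missing=1
    elif [[ "$value" == *"<"*">"* ]]; then
      echo "unfilled placeholder in: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical usage:
#   require_env LLM_HOST DATABASE_URL || exit 1
```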
Quick checks
nc -vz <instance-ip> <port>
psql "postgresql://<user>:<password>@<db-hostname>:5432/<db-name>"
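The two checks can also be driven from the environment variables set earlier, so the test hits exactly what the app will hit. A sketch under assumptions: split_host_port, check_llm_port, and check_db are hypothetical helper names, the 5-second timeout is arbitrary, and the port falls back to 80 when the URL has none.

```bash
#!/usr/bin/env bash
# Extract "host port" from a URL such as http://10.0.0.5:8000/v1.
split_host_port() {
  local url="$1" hostport host port
  hostport="${url#*://}"        # drop the scheme
  hostport="${hostport%%/*}"    # drop any path
  host="${hostport%%:*}"
  if [[ "$hostport" == *:* ]]; then
    port="${hostport##*:}"
  else
    port=80                     # assumed default when no port is given
  fi
  echo "$host $port"
}

# TCP reachability of the LLM endpoint, with a timeout so a dropped
# packet (the usual firewall symptom) does not hang the check.
check_llm_port() {
  local host port
  read -r host port <<< "$(split_host_port "$LLM_HOST")"
  nc -vz -w 5 "$host" "$port"
}

# Database reachability and credentials in one round trip.
check_db() {
  psql "$DATABASE_URL" -c 'SELECT 1;'
}

# Hypothetical usage:
#   check_llm_port && check_db
```

A timeout on the port check usually points at the firewall or the tag, while an immediate "connection refused" suggests the packet arrived and nothing is listening on that port.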
Failure modes
- Firewall rule exists but does not match the instance tag.
- Services are not enabled on the project.
- Database lives in a different region than the instance.
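The last two failure modes can be checked mechanically. A minimal sketch, assuming Cloud SQL for the database and hypothetical resource names in the usage comments. The helpers are pure string checks: zone_to_region relies on zone names being the region name plus a letter suffix (us-central1-a is in us-central1).

```bash
#!/usr/bin/env bash
# Is a service present in a newline-separated list of enabled services?
service_enabled() {
  local enabled="$1" service="$2"
  grep -qxF -- "$service" <<< "$enabled"
}

# Strip the trailing "-<letter>" from a zone to get its region.
zone_to_region() {
  echo "${1%-*}"
}

# Does the instance's zone sit inside the database's region?
same_region() {
  local zone="$1" region="$2"
  [[ "$(zone_to_region "$zone")" == "$region" ]]
}

# Hypothetical usage (resource names are assumptions):
#   enabled=$(gcloud services list --enabled --format='value(config.name)')
#   service_enabled "$enabled" compute.googleapis.com || echo "Compute API off"
#
#   zone=$(gcloud compute instances describe gpu-baseline \
#     --zone us-central1-a --format='value(zone.basename())')
#   db_region=$(gcloud sql instances describe baseline-db \
#     --format='value(region)')
#   same_region "$zone" "$db_region" || echo "region mismatch"
```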
What made the difference
I treated project setup as an architectural decision, not a form. Once the project, IAM, and firewall rules aligned, the rest of the setup behaved.
What I would do next time
I would first verify that the required APIs are enabled on the project, pin the instance image, and keep every resource in one region.