AWS GPU instance baseline
A minimal cloud baseline focused on identity, networking, and cost control.
Key takeaways
- Cloud baselines fail at identity, networking, and instance drift.
- A small, repeatable path beats a big, flexible one.
- Verify endpoints before wiring the app.
This run was about learning the AWS shape, not squeezing performance. For system boundaries and failure loops, see Systems 001: Foundations.
Architecture map
The AWS mental model is a VPC boundary with two controls on top of it: IAM decides what a caller is allowed to do, and security groups decide which traffic is allowed to reach an instance.
What happened
My first run failed even with valid keys. The issue was not the model. It was a security group with no rule allowing the inbound port, plus a missing IAM permission that left the instance looking healthy but unusable.
The two gates
The first gate is IAM. If the role is missing a permission, the instance still appears healthy, but every call that depends on that permission is denied.
The second gate is networking. Security groups deny all inbound traffic by default, so the port your app listens on must be explicitly allowed.
Console walkthrough
- Create a VPC or reuse the default one for the first pass.
- Launch a GPU instance and attach a minimal IAM role.
- Open the inbound port for your app, and allow the database port (5432 for Postgres) from the instance's security group.
- Create a Postgres instance inside the same VPC.
- Collect the endpoint, key, and host values for config.
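The "open the inbound port" step can also be done from the AWS CLI. This is a minimal sketch, not the exact commands I ran: open_port is a hypothetical helper, and the group ID, port, and CIDR in the example are placeholders. It wraps aws ec2 authorize-security-group-ingress and refuses to open a port to 0.0.0.0/0.

```shell
# Sketch: add one inbound TCP rule to a security group.
# open_port is a hypothetical helper; all argument values are placeholders.
open_port() {
  sg_id=$1; port=$2; cidr=$3
  if [ -z "$sg_id" ] || [ -z "$port" ] || [ -z "$cidr" ]; then
    echo "usage: open_port <sg-id> <port> <cidr>" >&2
    return 2
  fi
  # Guard against the most common over-permissive rule.
  if [ "$cidr" = "0.0.0.0/0" ]; then
    echo "refusing to open port $port to the whole internet" >&2
    return 1
  fi
  aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" \
    --protocol tcp \
    --port "$port" \
    --cidr "$cidr"
}

# Example (placeholder values):
# open_port sg-0123456789abcdef0 8000 203.0.113.7/32
```

Restricting the CIDR to a single address keeps the first pass small. Widen it only when something else needs to reach the app.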
First-time config
export LLM_HOST="http://<instance-ip>:<port>"
export DATABASE_URL="postgresql://<user>:<password>@<db-hostname>:5432/<db-name>"
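Before starting the app, it can help to confirm that both variables are set and that no placeholder survived a copy and paste. The sketch below is one way to do that. check_config is a hypothetical helper, and it only checks the shape of the values, not whether the endpoints are reachable.

```shell
# Sketch: fail fast if either variable is missing, has the wrong scheme,
# or still contains an unfilled <placeholder>.
# check_config is a hypothetical helper, not part of any tool.
check_config() {
  case "${LLM_HOST:-}" in
    http://?*|https://?*) ;;
    *) echo "LLM_HOST must be set to an http(s) URL" >&2; return 1 ;;
  esac
  case "${DATABASE_URL:-}" in
    postgresql://?*|postgres://?*) ;;
    *) echo "DATABASE_URL must be set to a postgresql:// URL" >&2; return 1 ;;
  esac
  case "$LLM_HOST$DATABASE_URL" in
    *"<"*|*">"*) echo "config still contains <placeholder> values" >&2; return 1 ;;
  esac
  echo "config looks well-formed"
}

# Usage: check_config || exit 1
```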
Quick checks
nc -vz -w 5 <instance-ip> <port>
psql "postgresql://<user>:<password>@<db-hostname>:5432/<db-name>" -c "SELECT 1;"
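The same port check is useful against the database, because it separates "network is blocked" from "credentials are wrong". The sketch below pulls the host and port out of DATABASE_URL so nc can reuse them. db_host_port is a hypothetical helper. It assumes the user:password@host[:port]/dbname shape shown above and does not handle IPv6 literals or a query string with no path.

```shell
# Sketch: extract "host port" from a postgresql:// URL with parameter expansion.
# db_host_port is a hypothetical helper.
db_host_port() {
  rest=${1#*://}              # drop the scheme
  authority=${rest%%/*}       # drop /dbname and anything after it
  hostport=${authority##*@}   # drop user:password@ if present
  host=${hostport%%:*}
  case "$hostport" in
    *:*) port=${hostport##*:} ;;
    *)   port=5432 ;;         # PostgreSQL default port
  esac
  echo "$host $port"
}

# Usage: unquoted substitution splits "host port" into two nc arguments.
# nc -vz -w 5 $(db_host_port "$DATABASE_URL")
```

If nc succeeds and psql still fails, the problem is on the credentials or database side rather than in the security group.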
Failure modes
- IAM role looks correct but lacks the specific service permission.
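- Security group allows the port, but the instance sits in a private subnet with no public IP or internet gateway route, so it is unreachable from outside the VPC.
- Putting the database in a different region from the instance adds latency that no error message points to.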
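The region mismatch is the easiest of these to catch mechanically. The sketch below assumes the database is an RDS-style endpoint shaped like <name>.<id>.<region>.rds.amazonaws.com, which the walkthrough does not state explicitly. region_of_db_host and same_region are hypothetical helpers, and the hostname in the usage line is made up.

```shell
# Sketch: compare the region embedded in an RDS-style hostname
# with the region the instance runs in.
# Assumes hostnames shaped like <name>.<id>.<region>.rds.amazonaws.com.
region_of_db_host() {
  case "$1" in
    *.rds.amazonaws.com)
      trimmed=${1%.rds.amazonaws.com}   # <name>.<id>.<region>
      echo "${trimmed##*.}"             # keep the last label
      ;;
    *) return 1 ;;                      # not an RDS-style hostname
  esac
}

same_region() {
  db_region=$(region_of_db_host "$1") || {
    echo "cannot infer region from $1" >&2
    return 2
  }
  if [ "$db_region" = "$2" ]; then
    echo "ok: database and instance are both in $2"
  else
    echo "mismatch: database in $db_region, instance in $2" >&2
    return 1
  fi
}

# Usage (made-up hostname):
# same_region mydb.abc123.us-east-1.rds.amazonaws.com us-east-1
```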
What made the difference
I treated IAM and security groups as first-class parts of the architecture, not afterthoughts. Once those gates were clear, the rest was predictable.
What I would do next time
I would pin the AMI, keep everything in a single region, and log network rules alongside app config.
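Logging network rules alongside app config could look like the sketch below: dump the security group's inbound rules to a JSON file that lives next to the config and gets committed with it. snapshot_rules is a hypothetical helper, and the file naming is my own choice, not an AWS convention.

```shell
# Sketch: save a security group's inbound rules as JSON next to the app config.
# snapshot_rules is a hypothetical helper; the output file name is arbitrary.
snapshot_rules() {
  sg_id=$1
  out_dir=${2:-.}
  out_file="$out_dir/security-group-$sg_id.json"
  aws ec2 describe-security-groups \
    --group-ids "$sg_id" \
    --query 'SecurityGroups[0].IpPermissions' \
    --output json > "$out_file" || {
      rm -f "$out_file"       # do not leave a partial snapshot behind
      return 1
    }
  echo "$out_file"            # print the path so callers can git add it
}

# Usage (placeholder group ID):
# snapshot_rules sg-0123456789abcdef0 ./config
```

A diff of that file between runs shows rule drift the same way a diff of the app config shows setting drift.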