Let's Talk

Feel free to reach out. I'll get back to you as soon as possible.

Cloud Project — AWS · Terraform Portfolio

SHIPPED AWS · Terraform · Lambda · DynamoDB · CloudFront · WAF · SQS · EventBridge · API Gateway · SSM · Python · IaC GitHub ↗

A self-directed cloud engineering portfolio: four AWS services, provisioned end-to-end with Terraform — no console clicking. Each project is fully independent; any one can be deployed or destroyed on its own, in any order. The progression runs from an edge-hardened static site (P1) to an event-driven data pipeline (P2), to a scheduled automation/DR system (P3), culminating in the flagship AI chatbot (P4), which reuses the code patterns of the first three without depending on their deployed infrastructure.

Overview

ProjectOne-lineCore stack
P1Static WebHTTPS-only static site, private origin behind WAFS3 + CloudFront + WAF + CloudWatch + SNS
P2Serverless PipelineEvent-driven multi-format file ingestionS3×3 + Lambda×3 + SQS×2(+DLQ) + DynamoDB + API GW + SNS
P3Smart VaultScheduled EBS backup/restore with cross-region DREventBridge + Lambda×3 + EBS + S3×2 + API GW(REST) + SNS
P4AI ChatbotGemini-powered customer-service chatbotAPI GW + Lambda + DynamoDB + SSM + SNS + S3 + CloudFront

Region footprint

  • Primary: ap-northeast-2 (Seoul) — all compute and data.
  • us-east-1: WAF Web ACL + ACM + CloudFront-scope alarms (CloudFront requires global scope).
  • ap-southeast-1 (Singapore): P3 cross-region DR replication target.

What this portfolio demonstrates: Terraform IaC discipline, least-privilege IAM design, event-driven and scheduled serverless patterns, secrets management (SSM SecureString), edge security (WAF / CloudFront OAC), multi-region & DR, observability (CloudWatch + SNS), and a documented engineering process (ADRs, change logs, error records, remote state).

Foundation & cost guardrails

Everything starts from a clean AWS account, not the root user. A dedicated IAM user holds the Terraform credentials, and a budget alarm is wired up before the first apply so a runaway resource can never quietly rack up cost.

Creating a dedicated IAM user for Terraform instead of using the root account

Installing the AWS CLI and Terraform locally

Verifying the CLI is authenticated against the account

Setting a billing budget with a threshold alarm

Final review and budget creation

Cross-cutting engineering practices

These patterns recur across all four projects and are the backbone of the portfolio.

Infrastructure as Code. Every project follows the same Terraform module layout:

projectN/
├── main.tf        # all infrastructure
├── iam.tf         # one least-privilege role per Lambda (P2/P3/P4)
├── variables.tf   # project_name, suffix, alert_email, tags, ...
├── outputs.tf     # endpoints + ready-to-run test commands
└── lambda/        # Python 3.12 handlers (P2/P3/P4)

Provider pinned to hashicorp/aws ~> 5.0, required_version >= 1.5.0. Bucket names are made globally unique with a suffix variable.

Fixing a Terraform configuration issue surfaced during apply

terraform init — run once before starting a project

Remote state backend (shared). State is centralized in S3 with native S3 lockfile locking (use_lockfile = true, Terraform ≥ 1.10) — no DynamoDB lock table. Because a backend can’t be managed by the config that uses it, a standalone bootstrap/ config (own local state) provisions the bucket (versioned, AES256, public-access-block ×4, TLS-only deny policy). Each project namespaces its state under a distinct key. Ordering rule: apply bootstrap/ before any project terraform init. This is recorded in ADR-0002 and driven by ERR-001 (a local-state-not-shared-across-machines incident).

Security posture (recurring).

  • Least-privilege IAM per Lambda — every Lambda gets its own role scoped to exact ARNs and actions; no shared wildcard role.
  • Private S3 + CloudFront OAC — origin buckets are never public; access only via the CloudFront service principal scoped by AWS:SourceArn (P1, P4).
  • Secrets in SSM SecureString — P4’s Gemini key is KMS-encrypted in Parameter Store, fetched once at cold start, never in Terraform state or the Lambda env tab (ADR-0001).
  • Encryption at rest — AES256 SSE on all buckets, including the state bucket.
  • TLS-only — the state bucket denies non-HTTPS access.

Observability (recurring). Every project ships CloudWatch alarms + a dashboard + an SNS email path: error-rate/5xx (P1), Lambda errors + DLQ depth (P2), missed-backup + duration (P3), Lambda errors + response latency (P4). 5-minute aggregation; alarms notify on threshold breach.

Cost discipline. Free-tier-first: DynamoDB PAY_PER_REQUEST, TTL auto-expiry to bound storage, S3 lifecycle expiry, ARM64 Lambda. The only genuinely paid items are P1’s WAF managed rules (~$5–6/mo) and P3’s EBS snapshots ($0.05/GB/mo) + cross-region replication.


P1 — Security/Performance-Optimized Static Website

Stack: S3 + CloudFront + WAF + ACM (optional) + CloudWatch + SNS · Region: Seoul, with WAF/ACM/alarms in us-east-1.

Globally distributed, HTTPS-only static hosting where the S3 origin is fully private — reachable only through CloudFront via OAC — fronted by WAF.

User
  → WAF (CommonRuleSet SQLi/XSS, AmazonIpReputationList, RateLimit 2000/5min/IP)
  → CloudFront (HTTP→HTTPS redirect, TTL 1h/24h, 404/403 → index.html for SPA)
  → S3 (private, public-access-block all true, OAC-only)

  CloudWatch (4xx>5%, 5xx>1%, WAF blocks>100/5min) → SNS → email
ResourceDetail
S3 bucketPrivate, versioning on, AES256 SSE
Bucket policys3:GetObject only to CloudFront service principal, scoped by AWS:SourceArn
WAF Web ACLCLOUDFRONT scope: CommonRuleSet + IP reputation + 2000 req/5min/IP rate limit
CloudFront OACsigv4, signing_behavior = always
CloudFront distdefault root index.html, HTTPS redirect, SPA 404/403 rewrite
CloudWatch + SNSerror-rate/WAF alarms (Seoul + us-east-1 topics) + dashboard

Security analysis: there is no public access path — OAC + the AWS:SourceArn-scoped bucket policy is the only way in; direct S3 URLs return 403 by design. WAF rejects malicious traffic before it reaches the CDN. ACM is left commented out (default CloudFront cert until a custom domain is added).

Cost: free-tier friendly; WAF managed rules (~$5–6/mo) are the only meaningful recurring cost (disable WAF for ~$0 pure-dev).


P2 — Multi-Format Data Processing Serverless Pipeline

Stack: S3×3 + Lambda×3 + SQS×2 (+2 DLQ) + DynamoDB + API Gateway (HTTP v2) + SNS + CloudWatch · Region: Seoul.

Event-driven ingestion: files land in S3 (or via API), a router classifies them by extension, structured data is parsed into DynamoDB, and unstructured data (PDF/image) goes through Textract. Failures isolate via DLQs and a quarantine bucket.

Upload (S3 ObjectCreated  or  POST /upload)
  → Router Lambda (by extension)
       ├─ csv/json  → structured SQS   → Parser Lambda    → DynamoDB
       ├─ pdf/image → unstructured SQS → Extractor Lambda → S3 processed
       └─ unknown   → S3 quarantine (tagged with reason)
  (each SQS: maxReceiveCount=3 → DLQ; errors → SNS email)

The Router is the only synchronous-on-event component; everything downstream is decoupled through SQS, so a slow or failing parser can’t back-pressure the ingest. maxReceiveCount=3 then DLQ gives bounded retries; the quarantine bucket captures inputs the router can’t classify, keeping the happy path clean. Each of the three Lambdas has its own scoped role — the Router can enqueue but not write DynamoDB; the Parser writes DynamoDB but doesn’t call Textract; and so on.

A structured CSV upload routed and parsed into DynamoDB:

CSV upload routed through the pipeline

A structured JSON upload handled the same way:

JSON upload processed into DynamoDB

An unstructured PDF flowing through the Extractor / Textract path (CloudWatch logs):

PDF processed via the Extractor Lambda

An unsupported extension diverted to the quarantine bucket instead of crashing the pipeline:

Unknown file type quarantined

Textract extraction in the console:

Textract extraction result

Cost: effectively $0–1/mo on free tier (DynamoDB pay-per-request, Lambda/SQS free tier).


P3 — Smart Vault (Intelligent Automated Backup)

Stack: EventBridge + Lambda×3 + EC2/EBS snapshots + S3×2 (cross-region) + API Gateway (REST v1) + SNS + CloudWatch · Region: Seoul (primary) + Singapore (DR).

Automated EBS backup/restore. EventBridge schedules snapshots of EC2 instances tagged backup:true; a cleanup Lambda expires snapshots by their RetainUntil tag; cleanup logs replicate cross-region; and a key-protected REST endpoint restores a snapshot to a new EBS volume.

EventBridge (hourly + daily 09:01 KST) → Backup Lambda
    → find EC2 tagged backup:true → create EBS snapshot (+ RetainUntil, ManagedBy tags)
    → SNS report email

EventBridge (daily 02:00 KST) → Cleanup Lambda
    → delete expired snapshots (DRY_RUN default true) → log to S3 archive (Seoul)
         → cross-region replication → S3 DR (Singapore, STANDARD_IA)

API Gateway POST /restore (API key) → Restore Lambda → new EBS volume from snapshot

CloudWatch alarms → SNS → email

Notable internals: the RetainUntil tag makes retention data-driven — Backup stamps an expiry, Cleanup reads it; no separate retention DB. DRY_RUN defaults to true, so the first cleanup run only lists targets (safe by default). DR is achieved with native S3 cross-region replication of the cleanup-logs/* prefix to Singapore. Restore validates snapshot_id / volume_type before the EC2 call, and /restore is guarded by an API key. EBS snapshots are intentionally not Terraform-managed — delete them manually (filter ManagedBy=smart-vault) after testing so cost stops.

The EC2 instance being protected — availability zone and security group:

EC2 security group for the protected instance

Snapshot creation, verified in the console:

Snapshot generation verification

Snapshot generation check

The SNS backup report email (subscription confirmed, then a success report):

SNS subscription confirmed

Backup success verification email

A restore driven through the REST endpoint — Lambda log and the resulting success email:

Restore Lambda log

Restore success email

Cost: EBS snapshots ($0.05/GB/mo) + cross-region replication ($0.02/GB) are the only paid items — under $1 for a small test volume. Always terraform destroy after testing.


P4 — Customer-Service AI Chatbot ★ Flagship

Stack: API Gateway (HTTP v2) + Lambda + DynamoDB + SSM + SNS + S3 + CloudFront + CloudWatch · Region: Seoul.

A Google Gemini-API-based customer-service chatbot (Bedrock is a drop-in runtime toggle). It receives messages on a REST endpoint, keeps conversation history in DynamoDB (24h TTL), returns AI responses, and emails a human agent via SNS on escalation. The web UI is served from S3 + CloudFront.

Browser / curl
  → CloudFront (HTTPS) ─ OAC ─▶ S3 (web UI, private)
  → API Gateway HTTP v2 (POST /chat)

  Lambda chatbot (ARM64, 256MB, timeout 45s, Python 3.12)
  ┌──────────────────────────────────────────────┐
  │ 1. get_history   DynamoDB Query (last 10 turns)│
  │ 2. build_prompt  System Prompt + history + msg │
  │ 3. call_ai       Gemini API (HTTP 25s) | Bedrock│
  │ 4. route         ESCALATE / profanity / fallback│
  │ 5. save_history  DynamoDB BatchWriter (TTL 24h) │
  └──────────────────────────────────────────────┘
       │                    │
   DynamoDB             SSM Parameter Store (Gemini key, cold-start fetch)
   (sessions, TTL 24h)
       ▼ (on ESCALATE)
     SNS → email
ResourceDetail
DynamoDB p4-chatbot-sessionsPK session_id (S), SK timestamp (S), PAY_PER_REQUEST, TTL 24h
Lambda p4-chatbot-chatbotARM64, 256MB, timeout 45s, Python 3.12
API Gateway HTTP v2POST /chat, CORS allow_origins=["*"] (restrict in prod)
S3 p4-chatbot-ui-{suffix}web UI, private, AES256 SSE
CloudFront + OACHTTPS-only, default_root_object=index.html, 404→index.html
SSM Parameter Store/cloud-portfolio/gemini-api-key (SecureString, KMS); Terraform-managed placeholder with ignore_changes=[value], real key seeded once via CLI
SNS p4-chatbot-alertsemail subscription (escalation + alarms)

The _0/_1 conversation-ordering invariant. Each turn stores a user + assistant item at the same ISO timestamp, disambiguated by a sort-key suffix:

2026-06-05T14:23:45.123456+00:00_0  → role "user"
2026-06-05T14:23:45.123456+00:00_1  → role "assistant"

DynamoDB sorts the SK ascending and '0' < '1', so the user item always precedes the assistant item. get_history() pairs items as [user, assistant] — this ordering is a precondition of the pairing loop. An earlier _user/_assistant suffix broke it ('a' < 'u' put the assistant first), so multi-turn history mis-paired and fed the wrong context to the model. That was the repeated-response bug, fixed and recorded in the change log.

Credentials & state. The Gemini key lives in SSM SecureString (ADR-0001) — never in Terraform state or the Lambda env tab; it’s fetched once at cold start and cached for the container lifetime, with IAM granting ssm:GetParameter on the single parameter ARN only. P4 was also the first project migrated to the shared S3 remote-state backend (ADR-0002 / ERR-001).

Provisioning P4 with Terraform, then with Terragrunt (DRY remote-state config):

Terraform apply for P4 — part 1

Terraform apply for P4 — part 2

Terraform apply for P4 — outputs

Terragrunt run — part 1

Verifying Gruntwork's public key

Terragrunt apply complete

Cost: effectively $0/mo on free tier (Lambda / API GW / DynamoDB / SSM / S3+CloudFront / SNS all within limits; Gemini is free at ≤1,500 req/day). Switching to Bedrock ≈ $1–3/mo.


Pattern reuse

P4 reuses the code patterns of P1–P3 but creates every resource itself — no dependency on their deployed infrastructure. Each project deploys and destroys independently.

Pattern sourceBorrowed patternWhere applied in P4
P1 Static WebS3 private + CloudFront OAC + HTTPS-only hostingweb UI
P2 PipelineDynamoDB PAY_PER_REQUEST + TTL; API GW → Lambda AWS_PROXYp4-chatbot-sessions + POST /chat
P3 Smart VaultSNS topic + email subscription notificationp4-chatbot-alerts escalation email

Skills matrix

CapabilityP1P2P3P4
Terraform IaC
Python 3.12 Lambda✓×3✓×3
ComputeLambdaLambda + EC2/EBSLambda (ARM64)
Data storeDynamoDBEBS snapshots + S3DynamoDB (TTL)
Messaging / queueSQS + DLQ
Eventing / scheduleS3 eventsEventBridge cron
APIAPI GW HTTPAPI GW REST (key)API GW HTTP
Edge / CDNCloudFront + WAFCloudFront + OAC
AI / MLTextractGemini / Bedrock
SecretsAPI key (tfvars/SSM)SSM SecureString
Multi-region / DRus-east-1 (WAF)Singapore DR
Least-privilege IAMbucket policyper-Lambdaper-Lambdaper-Lambda
ObservabilityCW + SNSCW + SNSCW + SNSCW + SNS

Process skills: ADR-driven decisions, change-log discipline, error records with prevention, and a Terraform remote-state backend with locking.

Roadmap

Next up is a document-analysis engine — Textract for OCR/table extraction feeding embeddings into OpenSearch for semantic search. It’s deliberately the last step because SageMaker endpoints and OpenSearch bill hourly rather than per-request, so it’ll run as a single timed demo and be destroyed immediately after.