Octopus Deploy on AWS
This article builds a working mental model of Octopus Deploy, EKS, IRSA/Pod Identity, and cross-account IAM roles. If you're coming from Azure, you're used to a world where:
- Identities are centralized in Azure AD (Entra ID)
- Workloads use Managed Identities (system/user-assigned) to get tokens
- RBAC is applied to resources and evaluated at the control plane
- Your Azure DevOps pipeline agent picks up credentials automatically and you just run `az` commands
AWS is similar conceptually but wired very differently. Add Octopus Deploy running inside EKS, throw in multi-account deployments, and suddenly you're juggling:
- EKS OIDC / IRSA / Pod Identity (what even are these?)
- AWS STS and `AssumeRole` flows (chains of role assumptions?)
- Octopus Server vs Calamari (wait, which one talks to AWS?)
- Per-step AWS roles and cross-account trust policies (how is this different from a service principal?)
This guide walks through the complete mental model, explicitly mapping AWS concepts to Azure analogies, and using Octopus-in-EKS deploying to multiple AWS accounts as the concrete example.
1. The Players: Who's Who in This System
Let's define every actor in this story so there's no confusion:
AWS Components
- AWS EKS -- Managed Kubernetes, similar to AKS
- EKS OIDC / IRSA -- EKS's mechanism to bind Kubernetes service accounts to IAM roles (like Azure workload identity for AKS)
- EKS Pod Identity -- Newer, AWS-native successor to IRSA that avoids some OIDC complexity
- AWS IAM Role -- Roughly equivalent to Azure AD app registration + role assignment; represents an AWS identity with attached permissions
- AWS STS (Security Token Service) -- Issues short-lived credentials via `AssumeRole` and `AssumeRoleWithWebIdentity` calls
- AWS Organizations / Multi-Account -- Pattern where Dev, Staging, Prod, and operational tooling live in separate AWS accounts
Octopus Components
- Octopus Server -- The orchestrator/control plane. Runs in your EKS cluster (or could run in ECS, EC2, on-prem)
- Calamari -- The worker subprocess that Octopus spawns to actually execute each deployment step
- Octopus AWS Account -- Configuration in Octopus UI that tells it which AWS identity to use for steps
- Built-in Worker -- When Octopus Server itself runs the step (Calamari subprocess in same pod/container)
- Per-step Role ARN -- Optional override that tells Calamari to assume a different role for that specific step
The Critical Insight You Need First
Calamari (the worker subprocess), not Octopus Server, is what calls AWS STS at runtime.
Octopus Server is pure orchestration -- it decides what runs when, spawns Calamari, and passes configuration. Calamari is the thing that:
- Resolves AWS credentials
- Calls STS to get temporary credentials
- Injects those credentials as environment variables
- Runs your actual deployment script (CloudFormation, kubectl, Terraform, etc.)
If you don't internalize this, the rest won't make sense. Octopus Server never holds or uses AWS credentials for deployment steps. Calamari does everything.
The following diagram shows what happens inside a single Calamari step execution:
flowchart LR
subgraph Inputs
code["Code / Script"]
token["AWS Token\n(from IRSA/Pod Identity)"]
vars["Step Variables"]
end
subgraph Calamari["Calamari Step Execution"]
step["Step Process"]
end
subgraph Actions["Could Be..."]
cf["Apply CloudFormation"]
eks["List EKS Pods"]
ecr["Purge ECR Images"]
tf["Apply Terraform"]
create["Create EKS Cluster"]
end
subgraph CredResolution["Credential Resolution"]
role["var: Role ARN"]
sts["AWS STS"]
iam["IAM Roles"]
end
code --> step
token --> step
vars --> step
step --> cf
step --> eks
step --> ecr
step --> tf
step --> create
vars --> role
role -->|AssumeRole| sts
sts -->|Temp Credentials| role
sts --- iam
Calamari receives the script, the ambient AWS token, and step variables (including the target role ARN). It calls STS to exchange the launcher token for scoped temporary credentials, then executes the actual deployment action -- CloudFormation, Terraform, kubectl, whatever the step calls for.
2. The Azure Mental Model (Your Baseline)
Quick mapping so your brain has familiar anchors:
| Azure Concept | AWS Equivalent |
|---------------|----------------|
| Azure Managed Identity (system/user-assigned) | EKS IRSA / Pod Identity / EC2 instance role |
| Azure AD + OAuth2/OIDC federation | AWS IAM OIDC providers + STS AssumeRoleWithWebIdentity |
| Azure role assignment (Contributor on subscription) | IAM role with permission policy |
| Azure DevOps service connection | Octopus AWS Account |
| Azure DevOps pipeline agent | Octopus Calamari (worker process) |
| `az account set` + multiple service connections | `sts:AssumeRole` into different accounts/roles per step |
In Azure DevOps, you might:
- Create a managed identity with Contributor on a resource group
- Your pipeline uses a service connection tied to that identity
- Pipeline agent picks up credentials from metadata service automatically
- Your script just runs `az deployment group create` and it works
In AWS with Octopus and EKS, the pattern is similar -- but instead of Azure AD tokens, you have:
- STS temporary credentials
- IAM role trust policies
- Cross-account AssumeRole chains
3. How Octopus Actually Executes a Step: The Full Flow
When you trigger a deployment and a step runs (e.g., "Deploy CloudFormation template" or "Run kubectl script"), here's what happens under the hood:
Step-by-Step Execution
Octopus Server receives the deployment task
- User clicks "Deploy" or webhook fires
- Octopus evaluates which worker should run the step (built-in worker in the Octopus pod, or an external worker)
Octopus Server spawns Calamari
- Calamari is a subprocess/child process
- Octopus passes to Calamari:
  - The step script/content (e.g., CloudFormation template, kubectl commands)
  - AWS Account configuration (which role to use)
  - Any per-step "Assume Role ARN" override
  - Step variables and parameters
Calamari resolves AWS credentials (this is the key part)
- Calamari looks for credentials in this order:
  1. `AWS_WEB_IDENTITY_TOKEN_FILE` env var (IRSA/Pod Identity injected by EKS)
  2. EC2/ECS metadata service at `169.254.169.254` (instance role)
  3. Explicit access keys from Octopus AWS Account config (if configured)
- If Calamari finds `AWS_WEB_IDENTITY_TOKEN_FILE`:
  - It reads the JWT token file
  - Calls `sts:AssumeRoleWithWebIdentity` using that token
  - Gets back temporary credentials for the pod's IAM role
Calamari performs role assumption (if per-step Role ARN is configured)
- Uses the credentials from step 3 (the "launcher" role)
- Calls `sts:AssumeRole` into the target role (e.g., `DevDeployRole` in the Dev account)
- Gets back new temporary credentials scoped to that deployment role
Calamari injects credentials as environment variables
- Sets in the step's process environment:
AWS_ACCESS_KEY_ID=ASIA...
AWS_SECRET_ACCESS_KEY=...
AWS_SESSION_TOKEN=...
The actual step script runs
- Your CloudFormation/Terraform/kubectl/AWS CLI commands execute
- They automatically use the injected credentials
- The script doesn't need to call STS or handle auth -- it just works
Credentials expire after the step
- STS credentials are short-lived (default 1 hour, configurable up to 12 hours; role chaining limited to 1 hour)
- Next step goes through the same flow, potentially with different role
Why This Matters
In Azure, the pipeline agent has one identity for the entire run. In AWS with Octopus, each step can have a completely different identity because Calamari does a fresh AssumeRole call per step.
This is the key to multi-account orchestration: your Octopus pod has one minimal "launcher" identity, and every step assumes whichever role it needs in whichever account.
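The per-step flow above can be reduced to a few lines. This is an illustrative sketch, not Octopus code -- the function names are invented here, and the real resolution happens inside the AWS SDK credential chain and Calamari:

```python
def resolve_credential_source(env):
    """Mirror the order in which Calamari (via the AWS SDK) finds credentials."""
    if "AWS_WEB_IDENTITY_TOKEN_FILE" in env:
        return "web-identity"        # IRSA / Pod Identity token file
    if "AWS_CONTAINER_CREDENTIALS_FULL_URI" in env:
        return "container-endpoint"  # Pod Identity agent endpoint
    if env.get("EXPLICIT_ACCESS_KEY"):
        return "static-keys"         # keys stored on the Octopus AWS Account
    return "instance-metadata"       # EC2 instance role at 169.254.169.254

def run_step(env, step_role_arn=None):
    """Return the chain of identities a single step execution moves through."""
    chain = [resolve_credential_source(env), "launcher-role"]
    if step_role_arn:                # optional per-step override
        chain.append(step_role_arn)  # second hop: sts:AssumeRole
    return chain

env = {"AWS_WEB_IDENTITY_TOKEN_FILE":
       "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"}
print(run_step(env, "arn:aws:iam::111111111111:role/DevDeployRole"))
# ['web-identity', 'launcher-role', 'arn:aws:iam::111111111111:role/DevDeployRole']
```

Two different steps can pass two different `step_role_arn` values through the same flow -- that is the whole trick.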
4. Where IRSA and Pod Identity Fit: The "Launcher" Identity
When Octopus runs inside EKS, you need to answer this question:
"What identity does Calamari have when it first tries to call AWS STS?"
In Azure terms: "Which Managed Identity does my pipeline agent use?"
In AWS EKS, you bind a pod's Kubernetes service account to an IAM role using one of two mechanisms:
4.1 IRSA (IAM Roles for Service Accounts) - The Original Approach
How it works:
AWS hosts an OIDC issuer for your cluster
- Every EKS cluster gets a public OIDC endpoint
- URL format: `https://oidc.eks.<region>.amazonaws.com/id/<cluster-unique-id>`
- This endpoint publishes the discovery document and signing keys that let AWS validate the JWT tokens identifying Kubernetes service accounts
You register that OIDC URL as an IAM Identity Provider
- In AWS IAM console -> Identity Providers -> Add Provider
- Provider type: OpenID Connect
- Provider URL: your cluster's OIDC issuer URL
- Audience: `sts.amazonaws.com`
You create an IAM role with an OIDC trust policy
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/XXXXX"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.us-east-1.amazonaws.com/id/XXXXX:sub": "system:serviceaccount:octopus:octopus-server",
"oidc.eks.us-east-1.amazonaws.com/id/XXXXX:aud": "sts.amazonaws.com"
}
}
}]
}
This says: "Trust JWTs from my EKS cluster's OIDC issuer, but only for the specific Kubernetes service account octopus/octopus-server"
- You annotate the Kubernetes service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: octopus-server
  namespace: octopus
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/OctoLauncherRole
EKS mutating webhook injects environment variables into the pod
- When your pod starts, EKS automatically injects:
  - `AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token`
  - `AWS_ROLE_ARN=arn:aws:iam::123456789012:role/OctoLauncherRole`
- Also mounts the JWT token as a file in the pod
AWS SDK automatically picks this up
- When Calamari (or any AWS SDK in the pod) tries to get credentials
- The SDK sees `AWS_WEB_IDENTITY_TOKEN_FILE` in the environment
- Reads the JWT token from that file path
- Calls `sts:AssumeRoleWithWebIdentity` with the token
- Gets back temporary credentials for `OctoLauncherRole`
Key point: Calamari doesn't know or care that IRSA is happening. The AWS SDK's credential chain automatically handles it.
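What does the trust policy actually match on? The projected service-account token is a JWT whose `sub` and `aud` claims correspond to the `StringEquals` conditions above. A minimal stdlib-only sketch with a fabricated token (no signature verification -- STS does that against the cluster's published keys):

```python
import base64
import json

def jwt_claims(token):
    """Decode the payload segment of a JWT without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Fabricated claims matching the trust-policy conditions in section 4.1
claims = {
    "sub": "system:serviceaccount:octopus:octopus-server",
    "aud": "sts.amazonaws.com",
}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
fake_token = f"header.{body}.signature"

decoded = jwt_claims(fake_token)
# STS compares these claims against the role's StringEquals conditions:
assert decoded["sub"] == "system:serviceaccount:octopus:octopus-server"
assert decoded["aud"] == "sts.amazonaws.com"
```

If either claim differs from the trust-policy condition -- a different namespace, a different service account -- `AssumeRoleWithWebIdentity` is denied.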
4.2 EKS Pod Identity - The Newer, Cleaner Approach
Pod Identity is AWS's answer to some complexity and edge cases with IRSA:
How it works:
- Install the EKS Pod Identity Agent add-on
aws eks create-addon --cluster-name my-cluster --addon-name eks-pod-identity-agent
- This deploys a DaemonSet on every node
The agent runs on each node and acts as a credential broker
- Create an IAM role with Pod Identity trust policy
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}
Notice: No OIDC provider mentioned at all. The trust is directly with the EKS service.
- Create a Pod Identity Association
aws eks create-pod-identity-association \
--cluster-name my-cluster \
--namespace octopus \
--service-account octopus-server \
--role-arn arn:aws:iam::123456789012:role/OctoLauncherRole
This tells EKS: "When pods in namespace octopus use service account octopus-server, give them credentials for OctoLauncherRole"
Pod Identity Agent injects credentials
- The DaemonSet exposes a node-local credential endpoint at `169.254.170.23:80` (a link-local address, similar to how ECS task roles use `169.254.170.2`)
- EKS injects the `AWS_CONTAINER_CREDENTIALS_FULL_URI` environment variable into the pod, pointing at this endpoint
- The AWS SDK's credential chain discovers this env var and fetches credentials from the agent automatically -- the pod never explicitly queries anything
- Under the hood, the agent calls the `eks-auth:AssumeRoleForPodIdentity` API (not `AssumeRoleWithWebIdentity`) to broker the credentials
AWS SDK automatically picks this up
- Same as IRSA from the application's perspective -- the SDK credential chain handles discovery transparently
- Calamari/SDK gets credentials without any explicit STS calls in application code
- This is the same `AWS_CONTAINER_CREDENTIALS_FULL_URI` mechanism that ECS Fargate uses for task roles, which is why the SDK treats EKS Pod Identity and ECS task roles identically from the application's perspective
Why Pod Identity is better:
- No OIDC provider registration -- simpler setup
- No public OIDC discovery URL fetch -- works reliably in fully private clusters. With IRSA, AWS STS must reach the cluster's OIDC discovery endpoint (`https://oidc.eks.<region>.amazonaws.com/id/<id>/.well-known/openid-configuration`) to validate the pod's JWT. That URL resolves to a public IP address. In fully private EKS clusters with air-gapped VPCs (no NAT gateway, no internet gateway), this endpoint is unreachable -- STS cannot validate the token, `AssumeRoleWithWebIdentity` fails, and IRSA breaks entirely. Pod Identity sidesteps this because the on-node agent brokers credentials via the `eks-auth:AssumeRoleForPodIdentity` API, which travels over the AWS private network, not the public OIDC path.
- Cleaner trust model -- direct EKS service principal, no OIDC federation complexity
- Same developer experience -- your code doesn't change
Private Cluster Warning: If you are running a fully private EKS cluster and cannot use Pod Identity (e.g., older EKS versions), see section 4.4 below for a Route 53 resolver workaround that allows IRSA to function by forwarding only the OIDC discovery domain to public DNS.
4.3 What This Gives You
Both IRSA and Pod Identity give Calamari a "launcher role" - an initial IAM role identity it can use.
This launcher role is like the "service principal that the agent uses" in Azure DevOps. But here's the key difference:
In Azure: Your pipeline agent's identity usually has the actual permissions it needs (Contributor, etc.)
In AWS: The launcher role typically has only one permission: sts:AssumeRole into other roles
Why? Because you want per-step, per-account, granular control over what each deployment step can do.
4.4 Route 53 Resolver Workaround: IRSA in Private Clusters
If you must use IRSA in a fully private EKS cluster (e.g., Pod Identity is unavailable on your EKS version), the core problem is that STS needs to reach the public OIDC discovery URL to validate the pod's JWT. You can solve this with Route 53 Resolver Endpoints and split-horizon DNS without opening general internet access:
- Create a Route 53 Outbound Resolver Endpoint in your VPC
- Create a forwarding rule that matches only the OIDC discovery domain (`oidc.eks.<region>.amazonaws.com`) and forwards it to public DNS resolvers (e.g., `1.1.1.1`, `8.8.8.8`)
- All other DNS traffic continues to resolve via VPC-internal DNS and VPC endpoints as normal
# Create outbound resolver endpoint
aws route53resolver create-resolver-endpoint \
--creator-request-id oidc-resolver \
--direction OUTBOUND \
--security-group-ids sg-0123456789abcdef0 \
--ip-addresses SubnetId=subnet-aaa,Ip=10.0.1.10 SubnetId=subnet-bbb,Ip=10.0.2.10
# Create forwarding rule for OIDC domain only
aws route53resolver create-resolver-rule \
--creator-request-id oidc-forward \
--rule-type FORWARD \
--domain-name "oidc.eks.us-east-1.amazonaws.com" \
--resolver-endpoint-id rslvr-out-xxxxxxxxx \
--target-ips Ip=1.1.1.1 Ip=8.8.8.8
# Associate the rule with your VPC
aws route53resolver associate-resolver-rule \
--resolver-rule-id rslvr-rr-xxxxxxxxx \
--vpc-id vpc-xxxxxxxxx
This gives STS just enough DNS resolution to validate OIDC tokens while keeping everything else private. You still need a NAT gateway or AWS PrivateLink path for the actual HTTPS fetch of the OIDC discovery document -- the DNS forwarding alone resolves the name but does not route the traffic. In most cases, upgrading to Pod Identity is the cleaner long-term solution.
4.5 The Octopus Kubernetes Agent: Bypassing OIDC Entirely
For fully private clusters where neither Pod Identity nor the Route 53 workaround is viable, there is a fundamentally different architecture: install the Octopus Kubernetes Agent directly inside the cluster.
How it works:
- Install the agent via Helm into the target EKS cluster:
helm upgrade --install --atomic octopus-agent \
oci://registry-1.docker.io/octopusdeploy/kubernetes-agent \
--namespace octopus-agent \
--create-namespace \
--set agent.serverUrl="https://your-octopus-server" \
--set agent.serverCommsAddress="https://your-octopus-server:10943" \
--set agent.space="Default" \
--set agent.targetName="private-eks-cluster" \
--set agent.bearerToken="API-XXXXXXXXXXXX"
The agent runs in poll mode -- it dials outbound to Octopus Server over HTTPS (port 10943), asking "do you have work for me?" This means:
- No inbound connections to the cluster required
- No OIDC discovery URL validation needed
- No IRSA or Pod Identity configuration required for the Octopus→Kubernetes communication path
The agent already has in-cluster RBAC -- because it runs as a pod inside the cluster, it uses a Kubernetes service account with the RBAC permissions you grant it. Octopus Server sends deployment instructions; the agent executes them using its native Kubernetes access.
For AWS API calls, the agent pod can still use Pod Identity or IRSA to get a launcher role, then assume deployment roles per step -- the same two-layer pattern described in section 5. The difference is that the Octopus→cluster connectivity problem is eliminated.
When to use the Kubernetes Agent:
- Fully air-gapped clusters where no public DNS or internet path exists
- Multiple private clusters across accounts -- install one agent per cluster, all poll back to a central Octopus Server
- Simplified networking -- outbound-only connectivity from the cluster to Octopus Server
- Hybrid scenarios -- Octopus Server runs outside AWS (on-prem or different cloud) and deploys into private EKS clusters
Trade-off: You now manage an agent per cluster instead of having a single centralized Octopus-in-EKS installation. For organizations with many private clusters, this is often preferable to complex networking workarounds.
5. The Two-Layer Role Pattern: "Launcher" + "Deployment Roles"
Here's where AWS diverges significantly from the Azure mental model.
You want to be able to:
- Run Step A with CloudFormation access to deploy infrastructure in the current environment
- Run Step B with ECR access to push/pull container images in the current environment
- Run Step C with EKS access to apply Kubernetes manifests in the current environment
- All steps in one deployment run against the same environment — the environment selection (Dev, Staging, Prod) determines which AWS account is targeted
In Octopus Deploy, all steps in a single deployment execute against the same environment. When you deploy Release 1.0 to Dev, every step runs against Dev. When you promote Release 1.0 to Staging, every step runs against Staging. Per-step role ARNs are for different permission scopes within the same account (e.g., Step 1 needs CloudFormation, Step 2 needs ECR, Step 3 needs EKS — all in the same AWS account for that environment). Octopus variable scoping is the mechanism that changes which AWS account is targeted when you promote across environments.
Instead of giving your Octopus pod's identity all those permissions combined (which would be a security nightmare), you use role assumption chains.
5.1 Layer 1: The Launcher Role
This is the role attached to your Octopus pod via IRSA or Pod Identity. Octopus runs in the Dev account — the same AWS account as your Dev environment. When deploying to Dev, the launcher and deployment resources are in the same account. When promoting to Staging or Prod, those are separate AWS accounts requiring cross-account AssumeRole.
Example:
Account: Dev Account (111111111111) - where Octopus EKS cluster runs
Role: OctoLauncherRole
ARN: arn:aws:iam::111111111111:role/OctoLauncherRole
Trust Policy (Pod Identity):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}
Permission Policy (minimal):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": [
"arn:aws:iam::111111111111:role/DevDeployRole",
"arn:aws:iam::222222222222:role/StagingDeployRole",
"arn:aws:iam::333333333333:role/ProdDeployRole"
]
}]
}
This role:
- Can't touch any actual AWS resources (no EKS, S3, CloudFormation permissions)
- Can only assume other specific roles
- Acts as the "bootstrap identity" for Calamari
Think of it like a "service principal that can only impersonate other service principals"
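The launcher's permission policy above reduces to an allow-list of assumable role ARNs. A sketch of the check IAM effectively performs (the function name is illustrative; real policy evaluation happens inside AWS and also consults the target role's trust policy):

```python
# Mirrors the Resource list of OctoLauncherRole's sts:AssumeRole permission
ALLOWED_TARGETS = {
    "arn:aws:iam::111111111111:role/DevDeployRole",
    "arn:aws:iam::222222222222:role/StagingDeployRole",
    "arn:aws:iam::333333333333:role/ProdDeployRole",
}

def can_assume(target_arn):
    """True if the launcher's sts:AssumeRole policy covers this target ARN."""
    return target_arn in ALLOWED_TARGETS

print(can_assume("arn:aws:iam::222222222222:role/StagingDeployRole"))  # True
print(can_assume("arn:aws:iam::222222222222:role/AdminRole"))          # False
```

Anything outside the list -- an admin role, a role in an unrelated account -- is an `AccessDenied` at the first hop, which is exactly the blast-radius property you want from a compromised pod.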
5.2 Layer 2: Deployment Roles (Per Account/Environment)
Now you create deployment roles in each target AWS account. Because Octopus runs in the Dev account, the Dev deployment role is in the same account as the launcher — this is a same-account AssumeRole. Staging and Prod are separate accounts and require cross-account role trust.
How Octopus selects the right role: Octopus variable scoping drives this. You define a variable AWS.DeployRoleArn (or similar) with different values scoped to each environment. When you deploy to Dev, Octopus resolves the Dev role ARN. When you promote the same release to Staging, Octopus resolves the Staging role ARN. The deployment process definition is identical — the environment selection is what changes the target account.
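Octopus variable scoping, reduced to its essence: one variable name, a different value per environment. The dict below is an illustrative stand-in for the scoped values of a variable like `AWS.DeployRoleArn` in the Octopus UI:

```python
# Stand-in for an Octopus variable with environment-scoped values
DEPLOY_ROLE_ARN = {
    "Dev":     "arn:aws:iam::111111111111:role/DevDeployRole",
    "Staging": "arn:aws:iam::222222222222:role/StagingDeployRole",
    "Prod":    "arn:aws:iam::333333333333:role/ProdDeployRole",
}

def resolve_role(environment):
    """Same release, same process -- only the environment changes the ARN."""
    return DEPLOY_ROLE_ARN[environment]

print(resolve_role("Staging"))
# arn:aws:iam::222222222222:role/StagingDeployRole
```

Promoting a release from Dev to Staging changes nothing in the deployment process definition; it only changes which value this lookup resolves.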
Dev Account (111111111111) — Same Account as Launcher
Role: DevDeployRole
ARN: arn:aws:iam::111111111111:role/DevDeployRole
Trust Policy:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
This says: "Allow the OctoLauncherRole from the same Dev account (111111111111) to assume me"
Permission Policy (what Dev steps can actually do):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:111111111111:cluster/*"
},
{
"Sid": "ECRPushPull",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:111111111111:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DeleteStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:111111111111:stack/dev-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::dev-artifacts-*",
"arn:aws:s3:::dev-artifacts-*/*"
]
}
]
}
Note on eks:DescribeCluster: This is the only IAM permission needed for kubectl operations. When Calamari runs kubectl apply, it calls eks:DescribeCluster to get the cluster's API endpoint and CA certificate, then authenticates to the Kubernetes API server using the IAM role. But IAM permissions alone are not sufficient -- you must also grant the role Kubernetes RBAC access (see section 5.4 below).
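What `eks:DescribeCluster` buys you is the cluster endpoint and CA bundle needed to build a kubeconfig. A sketch of the structure that `aws eks update-kubeconfig` writes, with fabricated endpoint/CA values; authentication itself is delegated to an exec plugin that presents the IAM role's STS identity:

```python
def kubeconfig_entry(name, endpoint, ca_data, role_arn):
    """Assemble the cluster/user halves of a kubeconfig entry (simplified)."""
    return {
        "cluster": {"server": endpoint, "certificate-authority-data": ca_data},
        "user": {
            "exec": {  # kubectl shells out to mint a token tied to the IAM role
                "command": "aws",
                "args": ["eks", "get-token", "--cluster-name", name,
                         "--role-arn", role_arn],
            }
        },
    }

cfg = kubeconfig_entry(
    "dev-cluster",
    "https://ABC123.gr7.us-east-1.eks.amazonaws.com",  # from DescribeCluster
    "LS0tLS1CRUdJTi...",                                # base64 CA bundle (fabricated)
    "arn:aws:iam::111111111111:role/DevDeployRole",
)
print(cfg["user"]["exec"]["command"], cfg["user"]["exec"]["args"][:2])
```

The Kubernetes API server then maps that STS identity to RBAC permissions via access entries or `aws-auth` -- the second layer covered in section 5.4.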
Staging Account (222222222222)
Role: StagingDeployRole
ARN: arn:aws:iam::222222222222:role/StagingDeployRole
Trust Policy (cross-account):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
Permission Policy: Same structure as Dev, scoped to this account's resources:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:222222222222:cluster/*"
},
{
"Sid": "ECRPushPull",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:222222222222:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DeleteStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:222222222222:stack/staging-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::staging-artifacts-*",
"arn:aws:s3:::staging-artifacts-*/*"
]
}
]
}
The Staging deployment role has the same permission actions as Dev -- because the deployment process is the same. The difference is resource scoping (account 222222222222 resources) and the trust policy (cross-account from the Dev account where Octopus runs).
Production Account (333333333333)
Role: ProdDeployRole
ARN: arn:aws:iam::333333333333:role/ProdDeployRole
Trust Policy (cross-account):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
Permission Policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:333333333333:cluster/*"
},
{
"Sid": "ECRPullOnly",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:333333333333:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:333333333333:stack/prod-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::prod-artifacts-*",
"arn:aws:s3:::prod-artifacts-*/*"
]
}
]
}
The Prod role must have write permissions for the deployment process to work. If all steps in a deployment target the same environment, and the deployment process is the same for Dev and Prod, then Prod needs the ability to actually deploy -- create/update CloudFormation stacks, apply Kubernetes manifests via kubectl, etc. The control point for Prod safety is not IAM read-only permissions (which would make automated deployment impossible). Instead, Prod safety comes from:
- Octopus manual approval gates -- require a human to approve before a release proceeds to Prod
- Kubernetes RBAC scoping -- limit the role to specific namespaces
- CloudFormation stack policies -- prevent deletion of critical resources
- ECR pull-only -- Prod doesn't push images; it pulls images that were pushed in Dev/Staging, so its ECR access is read-only unlike Dev/Staging's push access
- No CloudFormation DeleteStack -- notice Prod lacks `cloudformation:DeleteStack` compared to Dev/Staging
- S3 read-only -- Prod reads artifacts; it doesn't produce them
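Mechanically, the Prod tightening above is a set difference over IAM actions. A small sketch (action lists abbreviated from the Dev and Prod policies above):

```python
# Abbreviated write-capable actions from the Dev vs Prod permission policies
DEV_ACTIONS = {
    "cloudformation:CreateStack", "cloudformation:UpdateStack",
    "cloudformation:DeleteStack", "ecr:PutImage", "s3:PutObject",
}
PROD_ACTIONS = {
    "cloudformation:CreateStack", "cloudformation:UpdateStack",
}

removed_in_prod = sorted(DEV_ACTIONS - PROD_ACTIONS)
print(removed_in_prod)
# ['cloudformation:DeleteStack', 'ecr:PutImage', 's3:PutObject']
```

Reviewing role pairs as a diff like this makes the safety posture auditable: anything Prod can do that Dev cannot is a red flag, and anything removed from Prod should map to one of the controls listed above.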
5.3 Why This Pattern?
This is defense in depth:
- Pod compromise - if the Octopus pod is compromised, attacker only has OctoLauncherRole (in Dev account), which can only assume specific deployment roles and cannot directly touch resources
- Blast radius - each deployment role is scoped to exactly what that environment needs; Dev is same-account, Staging/Prod require explicit cross-account trust
- Audit trail - CloudTrail shows the exact role assumption chain: `OctoLauncherRole` -> `StagingDeployRole` -> `s3:PutObject`; for Dev deployments the chain stays within account 111111111111
- Environment-driven targeting - Octopus variable scoping ensures the same deployment process resolves to the right role ARN per environment without any code changes
5.4 The kubectl/RBAC Gap: IAM Is Not Enough
This is a critical gap that catches teams by surprise: IAM permissions alone do not grant Kubernetes API access. Your deployment role can have `eks:DescribeCluster` and every EKS permission in the IAM catalog, but `kubectl apply` will still fail with `error: You must be logged in to the server (Unauthorized)` unless the role is also mapped to Kubernetes RBAC.
EKS has two mechanisms for this:
Option 1: EKS Access Entries (Recommended -- newer clusters)
EKS access entries are the AWS-native approach, managed via API without touching cluster internals:
# Grant the Dev deployment role access to the Dev cluster
aws eks create-access-entry \
--cluster-name dev-cluster \
--principal-arn arn:aws:iam::111111111111:role/DevDeployRole \
--type STANDARD
# Associate a Kubernetes RBAC policy
aws eks associate-access-policy \
--cluster-name dev-cluster \
--principal-arn arn:aws:iam::111111111111:role/DevDeployRole \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy \
--access-scope type=namespace,namespaces=app-dev
For Staging/Prod (cross-account), you create access entries in those clusters pointing to the respective deployment roles:
# In Staging cluster (account 222222222222)
aws eks create-access-entry \
--cluster-name staging-cluster \
--principal-arn arn:aws:iam::222222222222:role/StagingDeployRole \
--type STANDARD
aws eks associate-access-policy \
--cluster-name staging-cluster \
--principal-arn arn:aws:iam::222222222222:role/StagingDeployRole \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy \
--access-scope type=namespace,namespaces=app-staging
Option 2: aws-auth ConfigMap (Legacy -- all clusters)
For older clusters or clusters not using access entries, you map IAM roles to Kubernetes groups via the aws-auth ConfigMap in the kube-system namespace:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Deployment role for this environment
    - rolearn: arn:aws:iam::111111111111:role/DevDeployRole
      username: octopus-deploy
      groups:
        - system:masters  # Full cluster admin -- tighten this in Staging/Prod
    # Node instance role (already present -- don't remove)
    - rolearn: arn:aws:iam::111111111111:role/eks-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
Warning: Editing `aws-auth` incorrectly can lock you out of the cluster. Always verify the existing content before modifying. Never remove the node instance role entries.
For Prod, use namespace-scoped RBAC instead of system:masters:
# Namespace-scoped Role for deployment in the prod namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app-prod
  name: octopus-deployer
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["deployments", "services", "configmaps", "secrets", "pods", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: app-prod
  name: octopus-deployer-binding
subjects:
  - kind: Group
    name: octopus-deployers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: octopus-deployer
  apiGroup: rbac.authorization.k8s.io
Then in aws-auth, map the Prod role to the octopus-deployers group instead of system:masters.
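If you manage aws-auth with eksctl rather than editing the ConfigMap by hand, the Prod mapping can be added non-destructively. A sketch, assuming a cluster named prod-cluster in us-east-1:

```shell
# Inspect current mappings first -- never blind-write aws-auth
eksctl get iamidentitymapping --cluster prod-cluster --region us-east-1

# Map ProdDeployRole to the octopus-deployers group (no system:masters)
eksctl create iamidentitymapping \
  --cluster prod-cluster \
  --region us-east-1 \
  --arn arn:aws:iam::333333333333:role/ProdDeployRole \
  --username octopus-deploy \
  --group octopus-deployers
```

eksctl appends the mapping without disturbing the node instance role entries, which is exactly the failure mode manual edits risk.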
The two-layer authorization model: IAM controls whether the role can reach the cluster (eks:DescribeCluster). Kubernetes RBAC controls what the role can do once authenticated. You need both. This is fundamentally different from Azure AKS, where Azure AD RBAC can grant both the Azure-level and Kubernetes-level permissions in one place.
6. The Big Picture: How It All Connects
These diagrams show the two deployment scenarios: deploying to the same account where Octopus runs (Dev), and deploying cross-account (Prod). In both cases, all steps in the deployment target the same environment -- the environment selection determines which account is targeted.
Deploying to Dev (Same Account)
When you deploy a release to Dev, Octopus, the launcher role, and the deployment role are all in the same AWS account. The AssumeRole call is same-account:
flowchart TB
subgraph org["AWS Organization"]
subgraph dev["Dev Account (111111111111) — Octopus runs here"]
sts["AWS STS"]
subgraph eks["EKS Cluster"]
subgraph pod["Octopus Pod"]
octopus["Octopus Server"]
calA["Calamari A\n(CloudFormation step)"]
calB["Calamari B\n(EKS deploy step)"]
end
end
irsa["IRSA or Pod Identity\n(OIDC Issuer)"]
devRole["DevDeployRole"]
devResources["Dev Resources\n(EKS, ECR, S3, CloudFormation)"]
irsa -->|"env vars:\nAWS_WEB_IDENTITY_TOKEN_FILE\nAWS_ROLE_ARN"| pod
pod -->|"AssumeRoleWithWebIdentity\n(launcher token)"| sts
sts -->|"Temp creds:\nOctoLauncherRole"| pod
octopus -->|"spawns"| calA
octopus -->|"spawns"| calB
calA -->|"sts:AssumeRole\n(DevDeployRole)"| sts
sts -->|"Temp creds"| calA
calA -.->|"operates on"| devResources
devResources --- devRole
calB -->|"sts:AssumeRole\n(DevDeployRole)"| sts
sts -->|"Temp creds"| calB
calB -.->|"operates on"| devResources
end
end
Both Calamari steps assume the same DevDeployRole because all steps in this deployment target Dev. Per-step role overrides are still possible if different steps need different permission scopes within Dev (e.g., one step needs CloudFormation + IAM, another only needs ECR read).
Promoting to Prod (Cross-Account)
When you promote the same release to Prod, Octopus variable scoping resolves the Prod role ARN instead. Calamari now makes cross-account AssumeRole calls from the Dev account into the Prod account:
flowchart TB
subgraph org["AWS Organization"]
subgraph dev["Dev Account (111111111111) — Octopus runs here"]
sts["AWS STS"]
subgraph eks["EKS Cluster"]
subgraph pod["Octopus Pod"]
octopus["Octopus Server"]
calA["Calamari A\n(CloudFormation step)"]
calB["Calamari B\n(EKS deploy step)"]
end
end
irsa["IRSA or Pod Identity"]
irsa -->|"launcher token"| pod
pod -->|"AssumeRoleWithWebIdentity"| sts
sts -->|"OctoLauncherRole creds"| pod
end
subgraph prod["Prod Account (333333333333)"]
prodRole["ProdDeployRole"]
prodResources["Prod Resources\n(EKS, ECR, S3)"]
end
octopus -->|"spawns"| calA
octopus -->|"spawns"| calB
calA -->|"sts:AssumeRole\n(ProdDeployRole)"| sts
sts -->|"Cross-account\ntemp creds"| calA
calA -.->|"operates on"| prodResources
prodResources --- prodRole
calB -->|"sts:AssumeRole\n(ProdDeployRole)"| sts
sts -->|"Cross-account\ntemp creds"| calB
calB -.->|"operates on"| prodResources
end
Important: Calamari processes run inside the Octopus pod in the Dev account -- they are subprocesses of the Octopus Server, not remote agents deployed in target accounts. They assume IAM roles in the target accounts via STS and then make API calls to those accounts, but the process itself executes locally in the Octopus pod. If you need actual execution inside a target account's VPC (e.g., for private API endpoints), you'd deploy external workers there -- but that's a different topology.
The key insight: the deployment process is identical for Dev and Prod. What changes is the environment selection, which causes Octopus to resolve different variable values -- including the target role ARN. Octopus Server spawns Calamari subprocesses, each of which independently calls STS using the pod's launcher credentials, then assumes the deployment role that the environment's variable scoping resolves to.
7. How Octopus Configuration Maps to This
In the Octopus UI under Infrastructure -> Accounts -> Add Account -> AWS Account, you configure a single account that uses ambient credentials from IRSA/Pod Identity:
AWS Account Configuration
Account name: AWS Deploy
Authentication method: Execute using the AWS service role for an EC2 instance (This tells Octopus: "Don't use stored keys; Calamari should pick up ambient credentials from IRSA/Pod Identity")
Note on the label: The "EC2 instance" wording is misleading -- this option does not require EC2. It means "use the AWS SDK's default credential chain to resolve ambient credentials," which works equally for EKS IRSA, EKS Pod Identity, ECS Task Roles, and EC2 Instance Roles. The label predates EKS and ECS Fargate support in Octopus. Functionally, selecting this just tells Calamari: "don't use stored access keys; discover credentials from the environment."
Access Key / Secret Key: (leave blank)
Assume Role (optional): (leave blank at account level)
Variable Scoping: How Environments Target Different Accounts
This is the key to understanding how Octopus handles multi-account deployments. You define a project variable for the deployment role ARN, with different values scoped to each environment:
| Variable Name | Value | Scoped To |
|---------------|-------|-----------|
| AWS.DeployRoleArn | arn:aws:iam::111111111111:role/DevDeployRole | Dev |
| AWS.DeployRoleArn | arn:aws:iam::222222222222:role/StagingDeployRole | Staging |
| AWS.DeployRoleArn | arn:aws:iam::333333333333:role/ProdDeployRole | Prod |
Then in your deployment process, each step uses:
AWS Account: Select AWS Deploy
Assume a different AWS Role: #{AWS.DeployRoleArn}
When you deploy Release 1.0 to Dev, Octopus resolves #{AWS.DeployRoleArn} to the Dev role ARN. When you promote the same release to Staging, Octopus resolves it to the Staging role ARN. The deployment process definition is identical across environments -- the environment selection is what changes the target account.
This tells Calamari:
1. Use ambient credentials from Pod Identity (OctoLauncherRole in Dev account)
2. Call sts:AssumeRole into whichever role ARN the environment resolved
3. For Dev: same-account AssumeRole (both launcher and target in 111111111111)
4. For Staging/Prod: cross-account AssumeRole (launcher in 111111111111, target in 222222222222 or 333333333333)
5. Use those scoped credentials for the step
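You can reproduce what Calamari does by hand from inside the Octopus pod, which is useful for debugging credential issues. A sketch, assuming IRSA/Pod Identity is wired up and jq is available in the container:

```shell
# 1. Ambient launcher credentials -- resolved automatically by the AWS CLI
aws sts get-caller-identity   # should show assumed-role/OctoLauncherRole/...

# 2. Assume the role that the environment's variable scoping would resolve to
DEPLOY_ROLE_ARN="arn:aws:iam::333333333333:role/ProdDeployRole"
CREDS=$(aws sts assume-role \
  --role-arn "$DEPLOY_ROLE_ARN" \
  --role-session-name octopus-debug \
  --query Credentials --output json)

# 3. Export the scoped credentials, as Calamari does for the step
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r .AccessKeyId)
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r .SecretAccessKey)
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r .SessionToken)

# 4. Verify you are now the deployment role in the target account
aws sts get-caller-identity   # should show assumed-role/ProdDeployRole/octopus-debug
```

If step 1 fails, the problem is the launcher identity (IRSA/Pod Identity). If step 2 fails, it's trust policy or launcher permissions.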
Alternative: Octopus OIDC (If You Don't Want IRSA/Pod Identity)
Instead of "Execute using service role", you can configure:
Authentication method: Use OpenID Connect
Role ARN: arn:aws:iam::111111111111:role/OctoDevRole
In this mode:
- Octopus Server acts as an OIDC issuer
- Octopus mints a JWT token scoped to the deployment
- Calamari calls sts:AssumeRoleWithWebIdentity using Octopus's JWT
- AWS STS validates the token by fetching https://your-octopus-server/.well-known/openid-configuration
Why you might not want this: Requires Octopus to have a publicly reachable OIDC discovery endpoint. If Octopus is fully private, STS can't validate the token.
When IRSA/Pod Identity is better: Your Octopus installation can be completely private. The EKS OIDC issuer (for IRSA) or Pod Identity service is AWS-managed and public, so STS can always validate.
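If you do go the Octopus OIDC route, you can sanity-check reachability the same way STS will, from any machine on the public internet (substitute your actual Octopus hostname):

```shell
# STS must be able to fetch the discovery document over public HTTPS,
# plus the jwks_uri that the document advertises
curl -s https://your-octopus-server/.well-known/openid-configuration
```

If this times out or requires VPN access, STS will fail to validate the JWT and AssumeRoleWithWebIdentity will be denied.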
8. The Complete Flow: Step Execution with Role Assumption
Let's trace a real deployment step end-to-end:
Scenario: Deploy a CloudFormation stack to Dev account (same account where Octopus runs)
Configuration:
- Octopus runs in EKS cluster in Dev account (111111111111)
- Octopus pod uses service account with Pod Identity -> OctoLauncherRole
- Step configured with AWS Account AWS Deploy, Role ARN resolved via variable scoping to arn:aws:iam::111111111111:role/DevDeployRole
Step-by-step execution:
User triggers deployment in Octopus UI
Octopus Server evaluates the step
- Identifies that it should run on built-in worker (in the Octopus pod)
- Spawns Calamari subprocess
Octopus Server passes to Calamari:
- CloudFormation template file
- Stack name, parameters
- AWS Account config: "use ambient service role"
- Per-step Role ARN:
arn:aws:iam::111111111111:role/DevDeployRole
Calamari resolves base credentials:
- Checks environment variables
- Finds AWS_ROLE_ARN=arn:aws:iam::111111111111:role/OctoLauncherRole (injected by Pod Identity)
- Finds AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/... (or Pod Identity agent endpoint)
- AWS SDK automatically calls STS to get credentials for OctoLauncherRole
- Calamari now has temp creds for the launcher role
Calamari performs role assumption:
- Using OctoLauncherRole credentials, calls:
aws sts assume-role \
--role-arn arn:aws:iam::111111111111:role/DevDeployRole \
--role-session-name octopus-deploy-12345
- STS checks: "Does DevDeployRole trust OctoLauncherRole?" -> Yes (trust policy)
- STS checks: "Can OctoLauncherRole assume DevDeployRole?" -> Yes (launcher has sts:AssumeRole permission for this ARN)
- STS returns temporary credentials for DevDeployRole (same-account)
- Calamari injects credentials:
export AWS_ACCESS_KEY_ID=ASIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_SESSION_TOKEN=AQoEXAMPLEH4aoAH0gNCAPyJxz4BlCFFxWNE1OPTgk5TthT+FvwqnKwRcOIfrRh3c/...
export AWS_DEFAULT_REGION=us-east-1
- CloudFormation step executes:
aws cloudformation deploy \
--template-file template.yaml \
--stack-name my-app-stack \
--capabilities CAPABILITY_IAM
- AWS CLI uses the injected credentials
- Operates as DevDeployRole in account 111111111111
- Can create CloudFormation stacks, EKS clusters, etc. (per DevDeployRole permissions)
- Step completes, credentials discarded
- Temporary credentials expire (default 1 hour, configurable up to 12 hours; role chaining limited to 1 hour)
- Next step goes through the same flow, potentially with different role
What Happens in CloudTrail (Audit)
When you look at CloudTrail logs:
In Dev Account (111111111111) -- where Octopus runs:
{
  "eventName": "AssumeRole",
  "requestParameters": {
    "roleArn": "arn:aws:iam::111111111111:role/DevDeployRole",
    "roleSessionName": "octopus-deploy-12345"
  },
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROACKCEVSQ6C2EXAMPLE:octopus-pod",
    "arn": "arn:aws:sts::111111111111:assumed-role/OctoLauncherRole/octopus-pod"
  }
}
In Dev Account (111111111111):
{
  "eventName": "CreateStack",
  "requestParameters": {
    "stackName": "my-app-stack",
    "templateURL": "https://..."
  },
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROACKCEVSQ6C2EXAMPLE:octopus-deploy-12345",
    "arn": "arn:aws:sts::111111111111:assumed-role/DevDeployRole/octopus-deploy-12345"
  },
  "sourceIPAddress": "10.0.5.23"
}
You can trace the entire chain: pod -> OctoLauncherRole -> DevDeployRole -> CloudFormation action.
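To pull these events without the console, CloudTrail's lookup API can filter by event name. Note that lookup-events only covers the last 90 days and only the region you query:

```shell
# Recent AssumeRole calls in this account/region
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRole \
  --max-results 10 \
  --query 'Events[].CloudTrailEvent'
```

Filtering on roleSessionName (e.g., octopus-deploy-12345) in the returned JSON lets you reconstruct the chain for a specific deployment.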
9. Multi-Account Strategy: How Many Roles?
When you have multiple microservices deploying to multiple environments, you need to decide: how many deployment roles?
Option 1: One Deployment Role Per Environment (Simplest)
Dev Account (111111111111) -- Octopus runs here
+-- OctoLauncherRole
+-- DevDeployRole (all 8 microservices use this)
Staging Account (222222222222)
+-- StagingDeployRole
Prod Account (333333333333)
+-- ProdDeployRole
Total: 4 roles (launcher + 3 deployment)
Pros:
- Simple to manage
- Fast iteration in Dev/Staging
- One Octopus AWS Account config per environment

Cons:
- Every microservice deployment has access to all resources in the account
- No per-service blast radius control
- Harder to audit "which service did what"
Option 2: One Role Per Microservice Per Environment (Maximum Isolation)
Dev Account (111111111111)
+-- UserServiceDevRole
+-- PaymentServiceDevRole
+-- NotificationServiceDevRole
+-- AuthServiceDevRole
+-- InventoryServiceDevRole
+-- OrderServiceDevRole
+-- ShippingServiceDevRole
+-- AnalyticsServiceDevRole
Staging Account (222222222222)
+-- (same 8 roles)
Prod Account (333333333333)
+-- (same 8 roles)
Total: 8 microservices x 3 environments = 24 deployment roles (plus 1 launcher in Dev = 25 total)
Pros:
- Perfect least privilege
- UserService can't touch PaymentService resources
- Compromised role only affects one service
- Clear audit trail per service

Cons:
- 25 roles to manage (permission drift risk)
- More Octopus configuration (8 AWS Accounts per environment, or 8 per-step role ARN overrides)
Option 3: Hybrid - Group by Blast Radius (Recommended)
Dev Account
+-- DevDeployRole (all services)
Staging Account
+-- StagingDeployRole (all services)
Prod Account
+-- ProdDataPlaneRole (low-risk: users, inventory, shipping, analytics, notifications)
+-- ProdControlPlaneRole (high-risk: payments, auth, orders)
Total: 5 roles
ProdControlPlaneRole has highly restricted permissions + requires manual approval in Octopus before use.
Pros:
- Balance between security and maintainability
- Production gets extra protection where it matters
- Dev/Staging stay simple for velocity

Cons:
- Still some shared blast radius in Prod data plane
Automation: AWS CloudFormation StackSets
To avoid manually creating roles in each account, use StackSets:
Template: deploy-role.yaml

Parameters:
  Environment:
    Type: String
    AllowedValues: [Dev, Staging, Prod]
  LauncherRoleArn:
    Type: String
    Description: ARN of OctoLauncherRole in account where Octopus runs
  AllowECRPush:
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'true'
    Description: Whether this role can push to ECR (false for Prod)
  AllowCFNDelete:
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'true'
    Description: Whether this role can delete CloudFormation stacks (false for Prod)

Conditions:
  CanPushECR: !Equals [!Ref AllowECRPush, 'true']
  CanDeleteCFN: !Equals [!Ref AllowCFNDelete, 'true']

Resources:
  DeployRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub '${Environment}DeployRole'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Ref LauncherRoleArn
            Action: sts:AssumeRole

  DeployPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: !Sub '${Environment}DeployPolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: EKSAccess
            Effect: Allow
            Action:
              - eks:DescribeCluster
              - eks:ListClusters
            Resource: !Sub 'arn:aws:eks:${AWS::Region}:${AWS::AccountId}:cluster/*'
          - Sid: ECRAuth
            Effect: Allow
            Action:
              - ecr:GetAuthorizationToken
            Resource: '*'
          - Sid: ECRPull
            Effect: Allow
            Action:
              - ecr:BatchCheckLayerAvailability
              - ecr:BatchGetImage
              - ecr:GetDownloadUrlForLayer
              - ecr:DescribeRepositories
              - ecr:DescribeImages
              - ecr:ListImages
            Resource: !Sub 'arn:aws:ecr:${AWS::Region}:${AWS::AccountId}:repository/*'
          - Sid: CloudFormationReadWrite
            Effect: Allow
            Action:
              - cloudformation:CreateStack
              - cloudformation:UpdateStack
              - cloudformation:DescribeStacks
              - cloudformation:DescribeStackEvents
              - cloudformation:GetTemplate
              - cloudformation:ValidateTemplate
              - cloudformation:ListStacks
            Resource: !Sub 'arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Environment}-*/*'
          - Sid: S3ArtifactRead
            Effect: Allow
            Action:
              - s3:GetObject
              - s3:ListBucket
            Resource:
              - !Sub 'arn:aws:s3:::${Environment}-artifacts-*'
              - !Sub 'arn:aws:s3:::${Environment}-artifacts-*/*'

  # Conditional: ECR push (Dev/Staging only, not Prod)
  ECRPushPolicy:
    Type: AWS::IAM::Policy
    Condition: CanPushECR
    Properties:
      PolicyName: !Sub '${Environment}ECRPushPolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: ECRPush
            Effect: Allow
            Action:
              - ecr:PutImage
              - ecr:InitiateLayerUpload
              - ecr:UploadLayerPart
              - ecr:CompleteLayerUpload
            Resource: !Sub 'arn:aws:ecr:${AWS::Region}:${AWS::AccountId}:repository/*'

  # Conditional: S3 artifact write (Dev/Staging only)
  S3WritePolicy:
    Type: AWS::IAM::Policy
    Condition: CanPushECR
    Properties:
      PolicyName: !Sub '${Environment}S3WritePolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: S3ArtifactWrite
            Effect: Allow
            Action:
              - s3:PutObject
            Resource: !Sub 'arn:aws:s3:::${Environment}-artifacts-*/*'

  # Conditional: CloudFormation delete (Dev/Staging only)
  CFNDeletePolicy:
    Type: AWS::IAM::Policy
    Condition: CanDeleteCFN
    Properties:
      PolicyName: !Sub '${Environment}CFNDeletePolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: CFNDelete
            Effect: Allow
            Action:
              - cloudformation:DeleteStack
            Resource: !Sub 'arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Environment}-*/*'

Outputs:
  DeployRoleArn:
    Value: !GetAtt DeployRole.Arn
    Description: ARN for Octopus variable scoping
Deploy to all accounts with environment-appropriate permissions:
# Create the StackSet (Environment must be supplied here since it has no
# default; each stack instance overrides it below)
aws cloudformation create-stack-set \
  --stack-set-name deployment-roles \
  --template-body file://deploy-role.yaml \
  --parameters \
    ParameterKey=LauncherRoleArn,ParameterValue=arn:aws:iam::111111111111:role/OctoLauncherRole \
    ParameterKey=Environment,ParameterValue=Dev \
  --capabilities CAPABILITY_NAMED_IAM
# Dev (same account as Octopus -- full permissions)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 111111111111 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Dev \
ParameterKey=AllowECRPush,ParameterValue=true \
ParameterKey=AllowCFNDelete,ParameterValue=true
# Staging (cross-account -- full permissions)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 222222222222 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Staging \
ParameterKey=AllowECRPush,ParameterValue=true \
ParameterKey=AllowCFNDelete,ParameterValue=true
# Prod (cross-account -- no ECR push, no CFN delete)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 333333333333 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Prod \
ParameterKey=AllowECRPush,ParameterValue=false \
ParameterKey=AllowCFNDelete,ParameterValue=false
The template uses CloudFormation conditions to vary permissions by environment. Prod gets the same base deploy permissions (it must be able to apply CloudFormation and reach EKS) but cannot push images, delete stacks, or write artifacts. Updates to the template propagate to all accounts automatically.
10. What Changes When Octopus Runs Elsewhere?
The pattern is the same whether Octopus runs in EKS, ECS Fargate, EC2, or even on-premises. Only the Layer 1 (launcher identity mechanism) changes.
Octopus in ECS Fargate
No IRSA/Pod Identity (those are EKS features)
Instead: ECS Task Role
Task Role vs Execution Role -- They Are Different Things
ECS task definitions have two role fields, and confusing them is a common source of "why can't my container call AWS APIs" issues:
Task Role (taskRoleArn) -- This is the IAM role that your application code (Octopus/Calamari) uses at runtime to call AWS APIs. This is your launcher role. Credentials are injected into the container via the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable, which points to the ECS agent's local metadata endpoint at http://169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI.

Execution Role (executionRoleArn) -- This is the IAM role that the ECS agent uses to pull your container image from ECR, send logs to CloudWatch, and retrieve secrets from Secrets Manager or SSM Parameter Store. Your application code never sees or uses this role. It needs ecr:GetAuthorizationToken, ecr:BatchGetImage, logs:CreateLogStream, logs:PutLogEvents, and optionally secretsmanager:GetSecretValue or ssm:GetParameters.
The key distinction: The execution role is for ECS infrastructure operations (pull image, push logs). The task role is for your application's AWS API calls. For the Octopus launcher pattern, only the task role matters -- it becomes the launcher identity that Calamari uses to assume deployment roles.
How ECS Credential Injection Works
- When your ECS task starts, the ECS agent sets AWS_CONTAINER_CREDENTIALS_RELATIVE_URI in the container environment (e.g., /v2/credentials/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
- The AWS SDK's credential chain detects this env var and makes an HTTP GET to http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}
- The ECS agent responds with temporary credentials (access key, secret key, session token) for the task role
- These credentials auto-refresh -- the SDK handles rotation transparently
Note: The ECS metadata endpoint is at 169.254.170.2 (link-local), which is different from the EC2 IMDS at 169.254.169.254. If you have code that hardcodes the EC2 metadata IP, it won't work on Fargate.
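The steps above can be verified by hand from inside a running Fargate container, which is a quick way to rule out task-role misconfiguration:

```shell
# Confirm the env var is set by the ECS agent
echo "$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"

# Fetch the task-role credentials the SDK would use; the agent returns JSON
# with RoleArn, AccessKeyId, SecretAccessKey, Token, and Expiration
curl -s "http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}"
```

If the env var is empty, the task definition is missing taskRoleArn; if the curl hangs, something is blocking the link-local endpoint.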
Setup
- Create IAM role with trust policy for ecs-tasks.amazonaws.com
- Assign as the task role in the ECS task definition
- ECS injects credentials via the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI env var
- Calamari picks this up via the AWS SDK credential chain
- Rest is identical: Calamari uses task role -> assumes deployment roles per step
Trust policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "ecs-tasks.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
ECS task definition:
{
  "family": "octopus-server",
  "taskRoleArn": "arn:aws:iam::123456789012:role/OctoLauncherRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/OctopusECSExecutionRole",
  "containerDefinitions": [...]
}
ECS-Specific Gotchas
- No 169.254.169.254 on Fargate: The EC2 instance metadata service (IMDS) is not available. If any library or script tries to hit the EC2 metadata endpoint, it will time out. Only the ECS credential endpoint at 169.254.170.2 is available.
- VPC configuration matters: Fargate tasks need network access to STS (for AssumeRole calls). Either place tasks in a subnet with a NAT gateway, or create a VPC endpoint for com.amazonaws.<region>.sts.
- Secrets in environment variables: Use the execution role + Secrets Manager/SSM integration to inject secrets into the container at launch, rather than baking them into the task definition. The execution role handles the secret retrieval before your container starts.
Octopus on EC2
Use EC2 Instance Role
- Create IAM role with trust policy for ec2.amazonaws.com
- Attach to the EC2 instance via an instance profile
- Calamari picks up credentials via IMDS (Instance Metadata Service) at http://169.254.169.254/latest/meta-data/iam/security-credentials/OctoLauncherRole
- Rest is identical
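When poking IMDS manually on EC2, prefer the IMDSv2 session-token flow -- many hardened instances disable IMDSv1 entirely:

```shell
# Get an IMDSv2 session token (TTL up to 6 hours)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Fetch the instance role's temporary credentials using that token
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/OctoLauncherRole"
```

The AWS SDK does this token dance automatically; the manual version is only for debugging.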
Octopus On-Premises (No Ambient Credentials)
Option 1: Use Octopus OIDC
- Octopus acts as OIDC issuer
- Must expose /.well-known/openid-configuration publicly
- Calamari uses Octopus-minted JWT -> AssumeRoleWithWebIdentity
Option 2: Use External Workers in AWS
- Octopus Server on-prem orchestrates
- Steps run on workers in AWS (EC2/ECS with instance/task roles)
- Workers have the launcher role; the same pattern applies
11. The Conceptual Shift from Azure
This is the hardest part to internalize if you're coming from Azure:
Azure Mindset
"This pipeline runs as this service principal / managed identity; that principal has these permissions on these resources."
The identity is relatively static for the entire pipeline run. You might use multiple service connections, but each stage/job has one identity.
AWS + Octopus Mindset
"This step, at runtime, will assume this role in that account, do its work, and then the credentials expire."
The identity is dynamic per step. The Octopus pod/task has one minimal identity (launcher) that can't do anything itself -- it can only become other identities via AssumeRole.
Why This Is Powerful
Distributed runtime orchestration:
- Octopus Server is pure orchestration -- no AWS permissions needed on the server itself
- Calamari handles credential resolution and STS calls per step
- Each step gets exactly the permissions it needs, no more
- Cross-account is native -- no special configuration needed
- Audit trail shows the exact role -> role -> action chain

Fine-grained control:
- Dev steps: broad permissions for fast iteration
- Staging steps: similar to Dev, maybe with extra validations
- Prod steps: read-only + manual approval gates
- All from one Octopus installation with one pod identity

Defense in depth:
- Pod compromise = attacker only has the launcher role (can't touch resources)
- Deployment role compromise = blast radius limited to that account/scope
- Each AWS account owner controls their deployment role permissions
- Centralized orchestration (Octopus) + distributed authorization (IAM per account)
12. Common Gotchas and How to Avoid Them
Gotcha 1: "My step says 'Access Denied' but the role has the permission"
Likely cause: You're looking at OctoLauncherRole permissions instead of the deployment role
Fix: Check CloudTrail in the target account to see which role the step actually assumed. Verify that role has the permission.
Gotcha 2: "AssumeRole fails with 'not authorized to perform sts:AssumeRole'"
Likely causes:
1. OctoLauncherRole doesn't have sts:AssumeRole permission for that target role ARN
2. Target role's trust policy doesn't allow OctoLauncherRole to assume it
3. Typo in role ARN
Fix: Check both the launcher role's permissions and the target role's trust policy.
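You can check the launcher side without actually assuming anything, using IAM's policy simulator. This validates the launcher's identity policy; the target role's trust policy still has to be inspected separately with credentials in the target account:

```shell
# Would OctoLauncherRole's identity policies allow this AssumeRole call?
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::111111111111:role/OctoLauncherRole \
  --action-names sts:AssumeRole \
  --resource-arns arn:aws:iam::333333333333:role/ProdDeployRole \
  --query 'EvaluationResults[].EvalDecision'

# And inspect the target role's trust policy (run in the Prod account)
aws iam get-role --role-name ProdDeployRole \
  --query 'Role.AssumeRolePolicyDocument'
```

An "allowed" decision plus a trust policy naming the launcher ARN means the failure is elsewhere (typo, wrong account, SCP).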
Gotcha 3: "Step works in Dev but fails in Prod with same code"
Likely cause: ProdDeployRole has more restrictive permissions than DevDeployRole
Fix: This is by design. Check Prod role permissions and adjust or use manual approval + elevated role for Prod changes.
Gotcha 4: "IRSA/Pod Identity not working - Calamari can't find credentials"
Check:
1. Is the service account annotated correctly? (kubectl describe sa octopus-server -n octopus)
2. Are env vars injected in the pod? (kubectl exec -it <pod> -- env | grep AWS)
3. Is the pod using the right service account? (kubectl get pod <pod> -o yaml | grep serviceAccountName)
4. Does the IAM role trust policy match the exact service account and OIDC issuer?
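For check 4, compare the role's trust policy against the cluster's actual OIDC issuer -- a single character of drift breaks IRSA. The cluster name here is illustrative; the role name is the launcher role used throughout this guide:

```shell
# The cluster's OIDC issuer URL
aws eks describe-cluster --name dev-cluster \
  --query 'cluster.identity.oidc.issuer' --output text

# The trust policy's Condition must reference the same issuer host/path,
# with the :sub key equal to system:serviceaccount:<namespace>:<serviceaccount>
aws iam get-role --role-name OctoLauncherRole \
  --query 'Role.AssumeRolePolicyDocument'
```

Mismatched namespace, service account name, or issuer path all produce the same symptom: AssumeRoleWithWebIdentity is denied and no credentials appear.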
Gotcha 5: "Cross-account AssumeRole works from AWS CLI but fails in Octopus"
Likely cause: External ID mismatch or session duration too long
Fix:
- Don't use external IDs for Octopus role assumptions (not needed for service-to-service)
- Check whether the target role has a max session duration configured, and ensure Octopus isn't requesting longer
Gotcha 6: "My deployment worked yesterday but fails today"
Likely cause: Temporary credentials expired and Calamari is reusing cached creds
Fix: This shouldn't happen -- Calamari calls STS per step. But check if you have any caching in custom scripts or environment variable exports that persist across steps.
13. Best Practices Summary
Security
- Launcher role has zero resource permissions - only sts:AssumeRole
- Deployment roles use least privilege - exactly what each environment needs
- Prod roles are read-only by default - write access requires manual approval or a separate role
- Use Pod Identity over IRSA - simpler, more reliable in private clusters
- Enable CloudTrail in all accounts - track the full role assumption chain
- Create an STS VPC interface endpoint - deploy a com.amazonaws.<region>.sts VPC endpoint so that all AssumeRole and AssumeRoleWithWebIdentity calls stay within the AWS private network and never traverse the public internet. This is defense-in-depth for sensitive environments and eliminates the need for a NAT gateway for STS traffic:
aws ec2 create-vpc-endpoint \
--vpc-id vpc-xxxxxxxxx \
--service-name com.amazonaws.us-east-1.sts \
--vpc-endpoint-type Interface \
--subnet-ids subnet-aaa subnet-bbb \
--security-group-ids sg-0123456789abcdef0 \
--private-dns-enabled
With --private-dns-enabled, the default sts.us-east-1.amazonaws.com hostname resolves to the private endpoint IP within your VPC. No SDK or application changes needed -- all STS calls automatically route privately.
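To confirm STS traffic is actually staying private, resolve the STS hostname from a pod or instance inside the VPC -- it should return the endpoint's private IPs (10.x/172.16-31.x/192.168.x), not public addresses:

```shell
# From inside the VPC; private DNS should win
nslookup sts.us-east-1.amazonaws.com

# Calls still use the normal hostname -- no SDK changes required
aws sts get-caller-identity --region us-east-1
```

If public IPs come back, check that --private-dns-enabled was set and that the VPC has enableDnsSupport and enableDnsHostnames turned on.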
- Use AWS Organizations Service Control Policies (SCPs) to enforce that deployment roles can only be assumed by your specific launcher role ARN. SCPs act at the Organizations level and override even account-admin IAM policies, providing an organizational security boundary that complements the per-role trust policies:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyAssumeDeployRolesExceptLauncher",
    "Effect": "Deny",
    "Action": "sts:AssumeRole",
    "Resource": [
      "arn:aws:iam::*:role/*DeployRole"
    ],
    "Condition": {
      "StringNotEquals": {
        "aws:PrincipalArn": "arn:aws:iam::111111111111:role/OctoLauncherRole"
      }
    }
  }]
}
This ensures that even if an account admin creates a permissive IAM policy, they cannot assume the deployment roles unless they are the designated launcher. Combine with trust policies for defense-in-depth.
SCP caveats: (1) SCPs do not apply to the management account in AWS Organizations -- if your launcher or deployment roles exist in the management account, this SCP has no effect there. Always place workloads in member accounts. (2) In a role chain (e.g., OctoLauncherRole assumes DevDeployRole), aws:PrincipalArn reflects the calling role at each hop, not the original initiator. If your deployment steps involve further role chaining beyond the two-layer pattern, the SCP condition behavior can be surprising -- test the exact evaluation in your role chain before relying on this SCP as a sole control.
Operational
- Use StackSets to deploy roles - consistency across accounts, easy updates
- One Octopus AWS Account per environment - Dev, Staging, Prod configs
- Document role ARN in deployment process - make it clear which role each step uses
- Use descriptive role session names - octopus-deploy-{deployment-id} helps in CloudTrail
- Set reasonable session durations - the default 1 hour is sensible for most steps; the max is 12 hours, but role chaining caps at 1 hour
Organizational
- Each AWS account owner controls their deployment role - central Octopus, distributed authorization
- Group microservices by blast radius - not every service needs its own role
- Start simple, add granularity as needed - one role per environment, split later if needed
- Use AWS Organizations - centralized billing, easier StackSet deployment
14. Quick Reference: Decision Trees
"Which launcher mechanism should I use?"
flowchart TD
A{"Where does\nOctopus run?"} -->|EKS| B{"Private cluster?"}
B -->|Yes| C{"Pod Identity\navailable?"}
C -->|Yes| D["EKS Pod Identity\n(recommended)"]
C -->|No| E{"Can add Route 53\nresolver for OIDC?"}
E -->|Yes| F["IRSA + Route 53\nresolver workaround"]
E -->|No| G["Octopus K8s Agent\n(poll mode, no OIDC needed)"]
B -->|No| H{"Existing OIDC\nprovider setup?"}
H -->|Yes| I["IRSA\n(works fine)"]
H -->|No| D
A -->|ECS Fargate| J["ECS Task Role"]
A -->|EC2| K["EC2 Instance Role\n(via Instance Profile)"]
A -->|On-Premises| L{"Can expose public\nOIDC endpoint?"}
L -->|Yes| M["Octopus OIDC"]
L -->|No| N["External Workers in AWS\nor K8s Agent (poll mode)"]
"How many deployment roles do I need?"
flowchart TD
A{"How many\nenvironments?"} --> B["Dev + Staging + Prod\n= 3 base roles"]
B --> C{"How many\nmicroservices?"}
C -->|"< 5"| D["Shared role per env\n(3 total + 1 launcher = 4)"]
C -->|"5-10"| E["Shared in Dev/Staging\nSplit Prod by risk\n(4-5 total)"]
C -->|"> 10"| F["Role per service or\ngroup by domain\n(10-20 total)"]
E --> G{"High-sensitivity\nworkloads?"}
G -->|"Payments, PII, Auth"| H["Dedicated role +\nmanual approval"]
G -->|"Analytics, Notifications"| I["Can share role"]
"My step is failing -- where do I look?"
flowchart TD
A["Step Failed"] --> B["Check Octopus\ndeployment log"]
B --> C{"Shows which role\nwas assumed?"}
C -->|Yes| D["Check CloudTrail\nin target account"]
C -->|No| E["Credential resolution\nfailed - check IRSA/\nPod Identity setup"]
D --> F{"AssumeRole\nsuccessful?"}
F -->|No| G["Check trust policy +\nlauncher permissions"]
F -->|Yes| H{"API call\nattempted?"}
H -->|Yes| I{"AccessDenied?"}
I -->|Yes| J["Check deployment\nrole permissions"]
I -->|No| K["Different error -\ncheck API params"]
H -->|No| L["Credential injection\nfailed - check Calamari logs"]
Conclusion
The AWS + Octopus + EKS pattern for multi-account deployments is more complex than Azure's managed identity model at first glance. But once you internalize the two-layer pattern -- launcher role + per-step deployment roles -- it becomes extremely powerful:
- Octopus orchestrates, but never holds deployment permissions itself
- Calamari resolves credentials dynamically per step via STS
- IRSA/Pod Identity provides the bootstrap launcher identity
- IAM roles per account encode exactly what each environment/service can do
- Cross-account is native -- no special setup, just trust policies
- Private clusters have options -- Pod Identity, Route 53 resolver workarounds, or the Octopus Kubernetes Agent in poll mode
- STS VPC endpoints and SCPs add defense-in-depth at the network and organization level
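To make the "just trust policies" point concrete, a cross-account deployment role's trust policy might look like the following sketch. The Dev account ID (111111111111) follows the article's example topology, but the launcher role name and the `ExternalId` value are assumptions -- substitute your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOctopusLauncherFromDevAccount",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:role/octopus-launcher"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "octopus-deploy" }
      }
    }
  ]
}
```

Attaching this to a deployment role in Staging or Prod is the entire cross-account setup: the target account decides who may assume the role, and the launcher needs only `sts:AssumeRole` permission on that ARN.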
The mental shift from "this pipeline runs as this identity" to "this step will assume this role at runtime" unlocks:
- Fine-grained, per-step authorization
- Defense in depth (pod compromise != resource access)
- Distributed ownership (each account controls its deployment role)
- Centralized orchestration with decentralized permissions
Whether you use IRSA, Pod Identity, ECS Task Roles, EC2 instance roles, or the Octopus Kubernetes Agent as your launcher mechanism, the pattern remains the same. Master this model and multi-account, multi-environment AWS deployments become manageable, secure, and auditable.
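That mental shift is easy to picture as a lookup plus an STS call. The sketch below fakes the STS call with a plain function so it stays runnable without AWS access; the account IDs and role names are placeholders (only the Dev account ID matches the article's example topology):

```python
# Minimal sketch of the two-layer pattern: the ambient launcher identity
# (from IRSA / Pod Identity) is used only to assume an environment-scoped
# deployment role at step run time. The real call would be boto3's
# sts.assume_role; it is faked here so the sketch runs anywhere.

# Mirrors an environment-scoped Octopus variable like #{AWS.DeployRoleArn}.
# Staging/Prod account IDs are placeholders.
DEPLOY_ROLE_ARN = {
    "Dev":     "arn:aws:iam::111111111111:role/octopus-deploy-dev",
    "Staging": "arn:aws:iam::222222222222:role/octopus-deploy-staging",
    "Prod":    "arn:aws:iam::333333333333:role/octopus-deploy-prod",
}

def fake_assume_role(role_arn: str, session_name: str) -> dict:
    """Stand-in for sts.assume_role -- returns fake short-lived credentials."""
    return {"role_arn": role_arn, "session": session_name, "token": "<short-lived>"}

def run_step(step_name: str, environment: str) -> dict:
    # All steps in one deployment target the same environment; the role
    # only changes when the release is promoted to another environment.
    return fake_assume_role(DEPLOY_ROLE_ARN[environment], f"octopus-{step_name}")

print(run_step("deploy-cfn", "Prod")["role_arn"])
# -> arn:aws:iam::333333333333:role/octopus-deploy-prod
```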
Corrections
On April 1, 2026 the following corrections were applied based on technical review:
- Section 6 diagram: Calamari processes were incorrectly shown inside the Dev and Prod account VPCs. Corrected to show them inside the Octopus pod in the Dev account (where Octopus runs), since Calamari runs as a subprocess of Octopus Server and makes remote API calls to target accounts via STS-assumed credentials.
- Section 7 "EC2 instance" label: Added clarification that the "Execute using the AWS service role for an EC2 instance" option in Octopus UI is misleadingly named -- it actually means "use the SDK default credential chain" and works for IRSA, Pod Identity, and ECS Task Roles, not just EC2.
- Section 4.2 Pod Identity mechanics: Tightened the credential delivery description. The Pod Identity Agent exposes a local endpoint at `169.254.170.23:80`, credentials are discovered via the `AWS_CONTAINER_CREDENTIALS_FULL_URI` env var, and the SDK handles everything transparently -- pods don't explicitly query the agent.
- Section 13 SCP caveats: Added footnote that SCPs don't apply to the management account and that `aws:PrincipalArn` reflects the calling role at each hop in a role chain, which can produce surprising behavior beyond the two-layer pattern.
- Section 10 ECS expansion: Added Task Role vs Execution Role distinction, explained `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` and the `169.254.170.2` metadata endpoint, clarified differences from EC2 IMDS (`169.254.169.254`), and added ECS-specific gotchas.
On March 31, 2026 the following corrections were applied:
- STS session duration: Originally stated "15 minutes to 1 hour". Corrected to default 1 hour, configurable up to 12 hours. Role chaining is capped at 1 hour regardless of the role's max session duration setting. (AWS STS AssumeRole API Reference)
- Octopus UI navigation path: Originally stated `Infrastructure -> Accounts`. Corrected to `Deploy -> Manage -> Accounts` per current Octopus Deploy documentation. (Octopus AWS Accounts docs)
- Kubernetes Agent Helm chart: Originally used `octopusdeploy/kubernetes-agent` as a traditional Helm repo reference. Corrected to `oci://registry-1.docker.io/octopusdeploy/kubernetes-agent`, which is the OCI registry path used in the official installation wizard. (Octopus Kubernetes Agent docs)
On April 2, 2026 the following corrections were applied based on technical review:
- Account topology: Article originally assumed Octopus runs in a separate "Tooling Account" (123456789012). Corrected throughout to reflect that Octopus runs in the Dev account (111111111111). When deploying to Dev, the launcher and deployment role are in the same account (same-account AssumeRole). When promoting to Staging/Prod, those are separate AWS accounts requiring cross-account AssumeRole.
- Section 5 intro (deployment process model): Original bullet list implied steps A/B/C/D each targeted different environments (Dev/Staging/Prod) sequentially within one deployment. This is incorrect. In Octopus Deploy, all steps in a single deployment execute against the same environment. Per-step role ARNs are for different permission scopes within the same environment (e.g., CloudFormation access vs ECR access), not for targeting different accounts sequentially. Environment promotion is what changes the target account.
- Section 6 diagrams: Replaced single diagram (showing Calamari A hitting Dev and Calamari B hitting Prod simultaneously) with two diagrams: one showing same-account deployment to Dev, one showing cross-account promotion to Prod. Both show all steps targeting the same environment.
- Section 7 (Octopus configuration): Rewrote to explain Octopus variable scoping as the mechanism for multi-account targeting. Instead of separate AWS Account configs per environment, a single AWS Account with an environment-scoped variable (`#{AWS.DeployRoleArn}`) resolves to the correct role ARN based on which environment the release is deployed to.
- CloudFormation StackSets: Updated launcher role ARN references from Tooling account to Dev account.
- Section 5.2 permission policies (all three environments): Replaced wildcard permissions (`eks:*`, `ecr:*`, `cloudformation:*`, `s3:*`) with specific least-privilege actions. Dev/Staging/Prod now show exact IAM actions required for each service (e.g., `ecr:PutImage` + `ecr:InitiateLayerUpload` instead of `ecr:*`), with resources scoped to the specific account.
- Prod read-only contradiction: Previous version gave Prod only read permissions, which contradicts automated deployment. Fixed: Prod now has write permissions for CloudFormation and EKS access (necessary for deployment) but tightened controls: no ECR push (Prod pulls images built in Dev/Staging), no CloudFormation DeleteStack, S3 read-only. Safety comes from Octopus manual approval gates and namespace-scoped Kubernetes RBAC, not IAM read-only.
- StackSet PowerUserAccess: Replaced `arn:aws:iam::aws:policy/PowerUserAccess` with parameterized inline policies using CloudFormation conditions (`AllowECRPush`, `AllowCFNDelete`). Prod gets restricted permissions via parameter overrides during stack instance creation.
- New section 5.4 (kubectl/RBAC gap): Added explanation that IAM permissions alone do not grant Kubernetes API access. Deployment roles must also be mapped via EKS access entries (recommended) or the `aws-auth` ConfigMap. Includes examples for both mechanisms, namespace-scoped RBAC for Prod, and the two-layer authorization model (IAM → cluster, RBAC → namespaces).