Octopus Deploy on AWS
This article builds a working mental model of Octopus Deploy, EKS, IRSA/Pod Identity, and cross-account IAM roles. If you're coming from Azure, you're used to a world where:
- Identities are centralized in Azure AD (Entra ID)
- Workloads use Managed Identities (system/user-assigned) to get tokens
- RBAC is applied to resources and evaluated at the control plane
- Your Azure DevOps pipeline agent picks up credentials automatically and you just run `az` commands
AWS is similar conceptually but wired very differently. Add Octopus Deploy running inside EKS, throw in multi-account deployments, and suddenly you're juggling:
- EKS OIDC / IRSA / Pod Identity (what even are these?)
- AWS STS and `AssumeRole` flows (chains of role assumptions?)
- Octopus Server vs Calamari (wait, which one talks to AWS?)
- Per-step AWS roles and cross-account trust policies (how is this different from a service principal?)
This guide walks through the complete mental model, explicitly mapping AWS concepts to Azure analogies, and using Octopus-in-EKS deploying to multiple AWS accounts as the concrete example.
1. The Players: Who's Who in This System
Let's define every actor in this story so there's no confusion:
AWS Components
- AWS EKS -- Managed Kubernetes, similar to AKS
- EKS OIDC / IRSA -- EKS's mechanism to bind Kubernetes service accounts to IAM roles (like Azure workload identity for AKS)
- EKS Pod Identity -- Newer, AWS-native successor to IRSA that avoids some OIDC complexity
- AWS IAM Role -- Roughly equivalent to Azure AD app registration + role assignment; represents an AWS identity with attached permissions
- AWS STS (Security Token Service) -- Issues short-lived credentials via `AssumeRole` and `AssumeRoleWithWebIdentity` calls
- AWS Organizations / Multi-Account -- Pattern where Dev, Staging, Prod, and operational tooling live in separate AWS accounts
Octopus Components
- Octopus Server -- The orchestrator/control plane. Runs in your EKS cluster (or could run in ECS, EC2, on-prem)
- Calamari -- The worker subprocess that Octopus spawns to actually execute each deployment step
- Octopus AWS Account -- Configuration in Octopus UI that tells it which AWS identity to use for steps
- Built-in Worker -- When Octopus Server itself runs the step (Calamari subprocess in same pod/container)
- Per-step Role ARN -- Optional override that tells Calamari to assume a different role for that specific step
The Critical Insight You Need First
Calamari (the worker subprocess), not Octopus Server, is what calls AWS STS at runtime.
Octopus Server is pure orchestration -- it decides what runs when, spawns Calamari, and passes configuration. Calamari is the thing that:
- Resolves AWS credentials
- Calls STS to get temporary credentials
- Injects those credentials as environment variables
- Runs your actual deployment script (CloudFormation, kubectl, Terraform, etc.)
If you don't internalize this, the rest won't make sense. Octopus Server never holds or uses AWS credentials for deployment steps. Calamari does everything.
The following diagram shows what happens inside a single Calamari step execution:
flowchart LR
subgraph Inputs
code["Code / Script"]
token["AWS Token\n(from IRSA/Pod Identity)"]
vars["Step Variables"]
end
subgraph Calamari["Calamari Step Execution"]
step["Step Process"]
end
subgraph Actions["Could Be..."]
cf["Apply CloudFormation"]
eks["List EKS Pods"]
ecr["Purge ECR Images"]
tf["Apply Terraform"]
create["Create EKS Cluster"]
end
subgraph CredResolution["Credential Resolution"]
role["var: Role ARN"]
sts["AWS STS"]
iam["IAM Roles"]
end
code --> step
token --> step
vars --> step
step --> cf
step --> eks
step --> ecr
step --> tf
step --> create
vars --> role
role -->|AssumeRole| sts
sts -->|Temp Credentials| role
sts --- iam
Calamari receives the script, the ambient AWS token, and step variables (including the target role ARN). It calls STS to exchange the launcher token for scoped temporary credentials, then executes the actual deployment action -- CloudFormation, Terraform, kubectl, whatever the step calls for.
2. The Azure Mental Model (Your Baseline)
Quick mapping so your brain has familiar anchors:
| Azure Concept | AWS Equivalent |
|---------------|----------------|
| Azure Managed Identity (system/user-assigned) | EKS IRSA / Pod Identity / EC2 instance role |
| Azure AD + OAuth2/OIDC federation | AWS IAM OIDC providers + STS AssumeRoleWithWebIdentity |
| Azure role assignment (Contributor on subscription) | IAM role with permission policy |
| Azure DevOps service connection | Octopus AWS Account |
| Azure DevOps pipeline agent | Octopus Calamari (worker process) |
| `az account set` + multiple service connections | `sts:AssumeRole` into different accounts/roles per step |
In Azure DevOps, you might:
- Create a managed identity with Contributor on a resource group
- Your pipeline uses a service connection tied to that identity
- Pipeline agent picks up credentials from metadata service automatically
- Your script just runs `az deployment group create` and it works
In AWS with Octopus and EKS, the pattern is similar -- but instead of Azure AD tokens, you have:
- STS temporary credentials
- IAM role trust policies
- Cross-account AssumeRole chains
3. How Octopus Actually Executes a Step: The Full Flow
When you trigger a deployment and a step runs (e.g., "Deploy CloudFormation template" or "Run kubectl script"), here's what happens under the hood:
Step-by-Step Execution
Octopus Server receives the deployment task
- User clicks "Deploy" or webhook fires
- Octopus evaluates which worker should run the step (built-in worker in the Octopus pod, or an external worker)
Octopus Server spawns Calamari
- Calamari is a subprocess/child process
- Octopus passes to Calamari:
  - The step script/content (e.g., CloudFormation template, kubectl commands)
  - AWS Account configuration (which role to use)
  - Any per-step "Assume Role ARN" override
  - Step variables and parameters
Calamari resolves AWS credentials (this is the key part)
- Calamari looks for credentials in this order:
  1. `AWS_WEB_IDENTITY_TOKEN_FILE` env var (IRSA/Pod Identity injected by EKS)
  2. EC2/ECS metadata service at `169.254.169.254` (instance role)
  3. Explicit access keys from Octopus AWS Account config (if configured)
- If Calamari finds `AWS_WEB_IDENTITY_TOKEN_FILE`:
  - It reads the JWT token file
  - Calls `sts:AssumeRoleWithWebIdentity` using that token
  - Gets back temporary credentials for the pod's IAM role
Calamari performs role assumption (if per-step Role ARN is configured)
- Uses the credentials from step 3 (the "launcher" role)
- Calls `sts:AssumeRole` into the target role (e.g., `DevDeployRole` in the Dev account)
- Gets back new temporary credentials scoped to that deployment role
Calamari injects credentials as environment variables
- Sets in the step's process environment:
AWS_ACCESS_KEY_ID=ASIA...
AWS_SECRET_ACCESS_KEY=...
AWS_SESSION_TOKEN=...
The actual step script runs
- Your CloudFormation/Terraform/kubectl/AWS CLI commands execute
- They automatically use the injected credentials
- The script doesn't need to call STS or handle auth -- it just works
Credentials expire after the step
- STS credentials are short-lived (default 1 hour, configurable up to 12 hours; role chaining limited to 1 hour)
- Next step goes through the same flow, potentially with different role
Why This Matters
In Azure, the pipeline agent has one identity for the entire run. In AWS with Octopus, each step can have a completely different identity because Calamari does a fresh AssumeRole call per step.
This is the key to multi-account orchestration: your Octopus pod has one minimal "launcher" identity, and every step assumes whichever role it needs in whichever account.
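The per-step flow above can be reduced to a few lines. This is an illustrative sketch, not Octopus code -- the function names are invented here, and the real resolution happens inside the AWS SDK credential chain and Calamari:

```python
def resolve_credential_source(env):
    """Mirror the order in which Calamari (via the AWS SDK) finds credentials."""
    if "AWS_WEB_IDENTITY_TOKEN_FILE" in env:
        return "web-identity"        # IRSA / Pod Identity token file
    if "AWS_CONTAINER_CREDENTIALS_FULL_URI" in env:
        return "container-endpoint"  # Pod Identity agent endpoint
    if env.get("EXPLICIT_ACCESS_KEY"):
        return "static-keys"         # keys stored on the Octopus AWS Account
    return "instance-metadata"       # EC2 instance role at 169.254.169.254

def run_step(env, step_role_arn=None):
    """Return the chain of identities a single step execution moves through."""
    chain = [resolve_credential_source(env), "launcher-role"]
    if step_role_arn:                # optional per-step override
        chain.append(step_role_arn)  # second hop: sts:AssumeRole
    return chain

env = {"AWS_WEB_IDENTITY_TOKEN_FILE":
       "/var/run/secrets/eks.amazonaws.com/serviceaccount/token"}
print(run_step(env, "arn:aws:iam::111111111111:role/DevDeployRole"))
# ['web-identity', 'launcher-role', 'arn:aws:iam::111111111111:role/DevDeployRole']
```

Two different steps can pass two different `step_role_arn` values through the same flow -- that is the whole trick.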
4. Where IRSA and Pod Identity Fit: The "Launcher" Identity
When Octopus runs inside EKS, you need to answer this question:
"What identity does Calamari have when it first tries to call AWS STS?"
In Azure terms: "Which Managed Identity does my pipeline agent use?"
In AWS EKS, you bind a pod's Kubernetes service account to an IAM role using one of two mechanisms:
4.1 IRSA (IAM Roles for Service Accounts) - The Original Approach
How it works:
AWS hosts an OIDC issuer for your cluster
- Every EKS cluster gets a public OIDC endpoint
- URL format: `https://oidc.eks.<region>.amazonaws.com/id/<cluster-unique-id>`
- This endpoint publishes the discovery document and signing keys that let AWS validate the JWT tokens identifying Kubernetes service accounts
You register that OIDC URL as an IAM Identity Provider
- In AWS IAM console -> Identity Providers -> Add Provider
- Provider type: OpenID Connect
- Provider URL: your cluster's OIDC issuer URL
- Audience: `sts.amazonaws.com`
You create an IAM role with an OIDC trust policy
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/XXXXX"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.us-east-1.amazonaws.com/id/XXXXX:sub": "system:serviceaccount:octopus:octopus-server",
"oidc.eks.us-east-1.amazonaws.com/id/XXXXX:aud": "sts.amazonaws.com"
}
}
}]
}
This says: "Trust JWTs from my EKS cluster's OIDC issuer, but only for the specific Kubernetes service account octopus/octopus-server"
- You annotate the Kubernetes service account
apiVersion: v1
kind: ServiceAccount
metadata:
  name: octopus-server
  namespace: octopus
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/OctoLauncherRole
EKS mutating webhook injects environment variables into the pod
- When your pod starts, EKS automatically injects:
  - `AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token`
  - `AWS_ROLE_ARN=arn:aws:iam::123456789012:role/OctoLauncherRole`
- Also mounts the JWT token as a file in the pod
AWS SDK automatically picks this up
- When Calamari (or any AWS SDK in the pod) tries to get credentials
- The SDK sees `AWS_WEB_IDENTITY_TOKEN_FILE` in the environment
- Reads the JWT token from that file path
- Calls `sts:AssumeRoleWithWebIdentity` with the token
- Gets back temporary credentials for `OctoLauncherRole`
Key point: Calamari doesn't know or care that IRSA is happening. The AWS SDK's credential chain automatically handles it.
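What does the trust policy actually match on? The projected service-account token is a JWT whose `sub` and `aud` claims correspond to the `StringEquals` conditions above. A minimal stdlib-only sketch with a fabricated token (no signature verification -- STS does that against the cluster's published keys):

```python
import base64
import json

def jwt_claims(token):
    """Decode the payload segment of a JWT without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Fabricated claims matching the trust-policy conditions in section 4.1
claims = {
    "sub": "system:serviceaccount:octopus:octopus-server",
    "aud": "sts.amazonaws.com",
}
body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
fake_token = f"header.{body}.signature"

decoded = jwt_claims(fake_token)
# STS compares these claims against the role's StringEquals conditions:
assert decoded["sub"] == "system:serviceaccount:octopus:octopus-server"
assert decoded["aud"] == "sts.amazonaws.com"
```

If either claim differs from the trust-policy condition -- a different namespace, a different service account -- `AssumeRoleWithWebIdentity` is denied.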
4.2 EKS Pod Identity - The Newer, Cleaner Approach
Pod Identity is AWS's answer to some complexity and edge cases with IRSA:
How it works:
- Install the EKS Pod Identity Agent add-on
aws eks create-addon --cluster-name my-cluster --addon-name eks-pod-identity-agent
- This deploys a DaemonSet on every node
The agent runs on each node and acts as a credential broker
- Create an IAM role with Pod Identity trust policy
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}
Notice: No OIDC provider mentioned at all. The trust is directly with the EKS service.
- Create a Pod Identity Association
aws eks create-pod-identity-association \
--cluster-name my-cluster \
--namespace octopus \
--service-account octopus-server \
--role-arn arn:aws:iam::123456789012:role/OctoLauncherRole
This tells EKS: "When pods in namespace octopus use service account octopus-server, give them credentials for OctoLauncherRole"
Pod Identity Agent injects credentials
- The DaemonSet exposes a node-local credential endpoint at `169.254.170.23:80` (a link-local address, similar to how ECS task roles use `169.254.170.2`)
- EKS injects the `AWS_CONTAINER_CREDENTIALS_FULL_URI` environment variable into the pod, pointing at this endpoint
- The AWS SDK's credential chain discovers this env var and fetches credentials from the agent automatically -- the pod never explicitly queries anything
- Under the hood, the agent calls the `eks-auth:AssumeRoleForPodIdentity` API (not `AssumeRoleWithWebIdentity`) to broker the credentials
AWS SDK automatically picks this up
- Same as IRSA from the application's perspective -- the SDK credential chain handles discovery transparently
- Calamari/SDK gets credentials without any explicit STS calls in application code
- This is the same `AWS_CONTAINER_CREDENTIALS_FULL_URI` mechanism that ECS Fargate uses for task roles, which is why the SDK treats EKS Pod Identity and ECS task roles identically from the application's perspective
Why Pod Identity is better:
- No OIDC provider registration -- simpler setup
- No public OIDC discovery URL fetch -- works reliably in fully private clusters. With IRSA, AWS STS must reach the cluster's OIDC discovery endpoint (`https://oidc.eks.<region>.amazonaws.com/id/<id>/.well-known/openid-configuration`) to validate the pod's JWT. That URL resolves to a public IP address. In fully private EKS clusters with air-gapped VPCs (no NAT gateway, no internet gateway), this endpoint is unreachable -- STS cannot validate the token, `AssumeRoleWithWebIdentity` fails, and IRSA breaks entirely. Pod Identity sidesteps this because the on-node agent brokers credentials via the `eks-auth:AssumeRoleForPodIdentity` API, which travels over the AWS private network, not the public OIDC path.
- Cleaner trust model -- direct EKS service principal, no OIDC federation complexity
- Same developer experience -- your code doesn't change
Private Cluster Warning: If you are running a fully private EKS cluster and cannot use Pod Identity (e.g., older EKS versions), see section 4.4 below for a Route 53 resolver workaround that allows IRSA to function by forwarding only the OIDC discovery domain to public DNS.
4.3 What This Gives You
Both IRSA and Pod Identity give Calamari a "launcher role" - an initial IAM role identity it can use.
This launcher role is like the "service principal that the agent uses" in Azure DevOps. But here's the key difference:
In Azure: Your pipeline agent's identity usually has the actual permissions it needs (Contributor, etc.)
In AWS: The launcher role typically has only one permission: sts:AssumeRole into other roles
Why? Because you want per-step, per-account, granular control over what each deployment step can do.
4.4 Route 53 Resolver Workaround: IRSA in Private Clusters
If you must use IRSA in a fully private EKS cluster (e.g., Pod Identity is unavailable on your EKS version), the core problem is that STS needs to reach the public OIDC discovery URL to validate the pod's JWT. You can solve this with Route 53 Resolver Endpoints and split-horizon DNS without opening general internet access:
- Create a Route 53 Outbound Resolver Endpoint in your VPC
- Create a forwarding rule that matches only the OIDC discovery domain (`oidc.eks.<region>.amazonaws.com`) and forwards it to public DNS resolvers (e.g., `1.1.1.1`, `8.8.8.8`)
- All other DNS traffic continues to resolve via VPC-internal DNS and VPC endpoints as normal
# Create outbound resolver endpoint
aws route53resolver create-resolver-endpoint \
--creator-request-id oidc-resolver \
--direction OUTBOUND \
--security-group-ids sg-0123456789abcdef0 \
--ip-addresses SubnetId=subnet-aaa,Ip=10.0.1.10 SubnetId=subnet-bbb,Ip=10.0.2.10
# Create forwarding rule for OIDC domain only
aws route53resolver create-resolver-rule \
--creator-request-id oidc-forward \
--rule-type FORWARD \
--domain-name "oidc.eks.us-east-1.amazonaws.com" \
--resolver-endpoint-id rslvr-out-xxxxxxxxx \
--target-ips Ip=1.1.1.1 Ip=8.8.8.8
# Associate the rule with your VPC
aws route53resolver associate-resolver-rule \
--resolver-rule-id rslvr-rr-xxxxxxxxx \
--vpc-id vpc-xxxxxxxxx
This gives STS just enough DNS resolution to validate OIDC tokens while keeping everything else private. You still need a NAT gateway or AWS PrivateLink path for the actual HTTPS fetch of the OIDC discovery document -- the DNS forwarding alone resolves the name but does not route the traffic. In most cases, upgrading to Pod Identity is the cleaner long-term solution.
4.5 The Octopus Kubernetes Agent: Bypassing OIDC Entirely
For fully private clusters where neither Pod Identity nor the Route 53 workaround is viable, there is a fundamentally different architecture: install the Octopus Kubernetes Agent directly inside the cluster.
How it works:
- Install the agent via Helm into the target EKS cluster:
helm upgrade --install --atomic octopus-agent \
oci://registry-1.docker.io/octopusdeploy/kubernetes-agent \
--namespace octopus-agent \
--create-namespace \
--set agent.serverUrl="https://your-octopus-server" \
--set agent.serverCommsAddress="https://your-octopus-server:10943" \
--set agent.space="Default" \
--set agent.targetName="private-eks-cluster" \
--set agent.bearerToken="API-XXXXXXXXXXXX"
The agent runs in poll mode -- it dials outbound to Octopus Server over HTTPS (port 10943), asking "do you have work for me?" This means:
- No inbound connections to the cluster required
- No OIDC discovery URL validation needed
- No IRSA or Pod Identity configuration required for the Octopus→Kubernetes communication path
The agent already has in-cluster RBAC -- because it runs as a pod inside the cluster, it uses a Kubernetes service account with the RBAC permissions you grant it. Octopus Server sends deployment instructions; the agent executes them using its native Kubernetes access.
For AWS API calls, the agent pod can still use Pod Identity or IRSA to get a launcher role, then assume deployment roles per step -- the same two-layer pattern described in section 5. The difference is that the Octopus→cluster connectivity problem is eliminated.
When to use the Kubernetes Agent:
- Fully air-gapped clusters where no public DNS or internet path exists
- Multiple private clusters across accounts -- install one agent per cluster, all poll back to a central Octopus Server
- Simplified networking -- outbound-only connectivity from the cluster to Octopus Server
- Hybrid scenarios -- Octopus Server runs outside AWS (on-prem or different cloud) and deploys into private EKS clusters
Trade-off: You now manage an agent per cluster instead of having a single centralized Octopus-in-EKS installation. For organizations with many private clusters, this is often preferable to complex networking workarounds.
5. The Two-Layer Role Pattern: "Launcher" + "Deployment Roles"
Here's where AWS diverges significantly from the Azure mental model.
You want to be able to:
- Run Step A with CloudFormation access to deploy infrastructure in the current environment
- Run Step B with ECR access to push/pull container images in the current environment
- Run Step C with EKS access to apply Kubernetes manifests in the current environment
- All steps in one deployment run against the same environment — the environment selection (Dev, Staging, Prod) determines which AWS account is targeted
In Octopus Deploy, all steps in a single deployment execute against the same environment. When you deploy Release 1.0 to Dev, every step runs against Dev. When you promote Release 1.0 to Staging, every step runs against Staging. Per-step role ARNs are for different permission scopes within the same account (e.g., Step 1 needs CloudFormation, Step 2 needs ECR, Step 3 needs EKS — all in the same AWS account for that environment). Octopus variable scoping is the mechanism that changes which AWS account is targeted when you promote across environments.
Instead of giving your Octopus pod's identity all those permissions combined (which would be a security nightmare), you use role assumption chains.
5.1 Layer 1: The Launcher Role
This is the role attached to your Octopus pod via IRSA or Pod Identity. Octopus runs in the Dev account — the same AWS account as your Dev environment. When deploying to Dev, the launcher and deployment resources are in the same account. When promoting to Staging or Prod, those are separate AWS accounts requiring cross-account AssumeRole.
Example:
Account: Dev Account (111111111111) - where Octopus EKS cluster runs
Role: OctoLauncherRole
ARN: arn:aws:iam::111111111111:role/OctoLauncherRole
Trust Policy (Pod Identity):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": ["sts:AssumeRole", "sts:TagSession"]
}]
}
Permission Policy (minimal):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": [
"arn:aws:iam::111111111111:role/DevDeployRole",
"arn:aws:iam::222222222222:role/StagingDeployRole",
"arn:aws:iam::333333333333:role/ProdDeployRole"
]
}]
}
This role:
- Can't touch any actual AWS resources (no EKS, S3, CloudFormation permissions)
- Can only assume other specific roles
- Acts as the "bootstrap identity" for Calamari
Think of it like a "service principal that can only impersonate other service principals"
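The launcher's permission policy above reduces to an allow-list of assumable role ARNs. A sketch of the check IAM effectively performs (the function name is illustrative; real policy evaluation happens inside AWS and also consults the target role's trust policy):

```python
# Mirrors the Resource list of OctoLauncherRole's sts:AssumeRole permission
ALLOWED_TARGETS = {
    "arn:aws:iam::111111111111:role/DevDeployRole",
    "arn:aws:iam::222222222222:role/StagingDeployRole",
    "arn:aws:iam::333333333333:role/ProdDeployRole",
}

def can_assume(target_arn):
    """True if the launcher's sts:AssumeRole policy covers this target ARN."""
    return target_arn in ALLOWED_TARGETS

print(can_assume("arn:aws:iam::222222222222:role/StagingDeployRole"))  # True
print(can_assume("arn:aws:iam::222222222222:role/AdminRole"))          # False
```

Anything outside the list -- an admin role, a role in an unrelated account -- is an `AccessDenied` at the first hop, which is exactly the blast-radius property you want from a compromised pod.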
5.2 Layer 2: Deployment Roles (Per Account/Environment)
Now you create deployment roles in each target AWS account. Because Octopus runs in the Dev account, the Dev deployment role is in the same account as the launcher — this is a same-account AssumeRole. Staging and Prod are separate accounts and require cross-account role trust.
How Octopus selects the right role: Octopus variable scoping drives this. You define a variable AWS.DeployRoleArn (or similar) with different values scoped to each environment. When you deploy to Dev, Octopus resolves the Dev role ARN. When you promote the same release to Staging, Octopus resolves the Staging role ARN. The deployment process definition is identical — the environment selection is what changes the target account.
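Octopus variable scoping, reduced to its essence: one variable name, a different value per environment. The dict below is an illustrative stand-in for the scoped values of a variable like `AWS.DeployRoleArn` in the Octopus UI:

```python
# Stand-in for an Octopus variable with environment-scoped values
DEPLOY_ROLE_ARN = {
    "Dev":     "arn:aws:iam::111111111111:role/DevDeployRole",
    "Staging": "arn:aws:iam::222222222222:role/StagingDeployRole",
    "Prod":    "arn:aws:iam::333333333333:role/ProdDeployRole",
}

def resolve_role(environment):
    """Same release, same process -- only the environment changes the ARN."""
    return DEPLOY_ROLE_ARN[environment]

print(resolve_role("Staging"))
# arn:aws:iam::222222222222:role/StagingDeployRole
```

Promoting a release from Dev to Staging changes nothing in the deployment process definition; it only changes which value this lookup resolves.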
Dev Account (111111111111) — Same Account as Launcher
Role: DevDeployRole
ARN: arn:aws:iam::111111111111:role/DevDeployRole
Trust Policy:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
This says: "Allow the OctoLauncherRole from the same Dev account (111111111111) to assume me"
Permission Policy (what Dev steps can actually do):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:111111111111:cluster/*"
},
{
"Sid": "ECRPushPull",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:111111111111:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DeleteStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:111111111111:stack/dev-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::dev-artifacts-*",
"arn:aws:s3:::dev-artifacts-*/*"
]
}
]
}
Note on eks:DescribeCluster: This is the only IAM permission needed for kubectl operations. When Calamari runs kubectl apply, it calls eks:DescribeCluster to get the cluster's API endpoint and CA certificate, then authenticates to the Kubernetes API server using the IAM role. But IAM permissions alone are not sufficient -- you must also grant the role Kubernetes RBAC access (see section 5.4 below).
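What `eks:DescribeCluster` buys you is the cluster endpoint and CA bundle needed to build a kubeconfig. A sketch of the structure that `aws eks update-kubeconfig` writes, with fabricated endpoint/CA values; authentication itself is delegated to an exec plugin that presents the IAM role's STS identity:

```python
def kubeconfig_entry(name, endpoint, ca_data, role_arn):
    """Assemble the cluster/user halves of a kubeconfig entry (simplified)."""
    return {
        "cluster": {"server": endpoint, "certificate-authority-data": ca_data},
        "user": {
            "exec": {  # kubectl shells out to mint a token tied to the IAM role
                "command": "aws",
                "args": ["eks", "get-token", "--cluster-name", name,
                         "--role-arn", role_arn],
            }
        },
    }

cfg = kubeconfig_entry(
    "dev-cluster",
    "https://ABC123.gr7.us-east-1.eks.amazonaws.com",  # from DescribeCluster
    "LS0tLS1CRUdJTi...",                                # base64 CA bundle (fabricated)
    "arn:aws:iam::111111111111:role/DevDeployRole",
)
print(cfg["user"]["exec"]["command"], cfg["user"]["exec"]["args"][:2])
```

The Kubernetes API server then maps that STS identity to RBAC permissions via access entries or `aws-auth` -- the second layer covered in section 5.4.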
Staging Account (222222222222)
Role: StagingDeployRole
ARN: arn:aws:iam::222222222222:role/StagingDeployRole
Trust Policy (cross-account):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
Permission Policy: Same structure as Dev, scoped to this account's resources:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:222222222222:cluster/*"
},
{
"Sid": "ECRPushPull",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:PutImage",
"ecr:InitiateLayerUpload",
"ecr:UploadLayerPart",
"ecr:CompleteLayerUpload",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:222222222222:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DeleteStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:222222222222:stack/staging-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::staging-artifacts-*",
"arn:aws:s3:::staging-artifacts-*/*"
]
}
]
}
The Staging deployment role has the same permission actions as Dev -- because the deployment process is the same. The difference is resource scoping (account 222222222222 resources) and the trust policy (cross-account from the Dev account where Octopus runs).
Production Account (333333333333)
Role: ProdDeployRole
ARN: arn:aws:iam::333333333333:role/ProdDeployRole
Trust Policy (cross-account):
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/OctoLauncherRole"
},
"Action": "sts:AssumeRole"
}]
}
Permission Policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSAccess",
"Effect": "Allow",
"Action": [
"eks:DescribeCluster",
"eks:ListClusters"
],
"Resource": "arn:aws:eks:us-east-1:333333333333:cluster/*"
},
{
"Sid": "ECRPullOnly",
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken"
],
"Resource": "*"
},
{
"Sid": "ECRRepoAccess",
"Effect": "Allow",
"Action": [
"ecr:BatchCheckLayerAvailability",
"ecr:BatchGetImage",
"ecr:GetDownloadUrlForLayer",
"ecr:DescribeRepositories",
"ecr:DescribeImages",
"ecr:ListImages"
],
"Resource": "arn:aws:ecr:us-east-1:333333333333:repository/*"
},
{
"Sid": "CloudFormation",
"Effect": "Allow",
"Action": [
"cloudformation:CreateStack",
"cloudformation:UpdateStack",
"cloudformation:DescribeStacks",
"cloudformation:DescribeStackEvents",
"cloudformation:GetTemplate",
"cloudformation:ValidateTemplate",
"cloudformation:ListStacks"
],
"Resource": "arn:aws:cloudformation:us-east-1:333333333333:stack/prod-*/*"
},
{
"Sid": "S3ArtifactAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::prod-artifacts-*",
"arn:aws:s3:::prod-artifacts-*/*"
]
}
]
}
The Prod role must have write permissions for the deployment process to work. If all steps in a deployment target the same environment, and the deployment process is the same for Dev and Prod, then Prod needs the ability to actually deploy -- create/update CloudFormation stacks, apply Kubernetes manifests via kubectl, etc. The control point for Prod safety is not IAM read-only permissions (which would make automated deployment impossible). Instead, Prod safety comes from:
- Octopus manual approval gates -- require a human to approve before a release proceeds to Prod
- Kubernetes RBAC scoping -- limit the role to specific namespaces
- CloudFormation stack policies -- prevent deletion of critical resources
- ECR pull-only -- Prod doesn't push images; it pulls images that were pushed in Dev/Staging, so its ECR access is read-only unlike Dev/Staging's push access
- No CloudFormation DeleteStack -- notice Prod lacks `cloudformation:DeleteStack` compared to Dev/Staging
- S3 read-only -- Prod reads artifacts; it doesn't produce them
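Mechanically, the Prod tightening above is a set difference over IAM actions. A small sketch (action lists abbreviated from the Dev and Prod policies above):

```python
# Abbreviated write-capable actions from the Dev vs Prod permission policies
DEV_ACTIONS = {
    "cloudformation:CreateStack", "cloudformation:UpdateStack",
    "cloudformation:DeleteStack", "ecr:PutImage", "s3:PutObject",
}
PROD_ACTIONS = {
    "cloudformation:CreateStack", "cloudformation:UpdateStack",
}

removed_in_prod = sorted(DEV_ACTIONS - PROD_ACTIONS)
print(removed_in_prod)
# ['cloudformation:DeleteStack', 'ecr:PutImage', 's3:PutObject']
```

Reviewing role pairs as a diff like this makes the safety posture auditable: anything Prod can do that Dev cannot is a red flag, and anything removed from Prod should map to one of the controls listed above.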
5.3 Why This Pattern?
This is defense in depth:
- Pod compromise - if the Octopus pod is compromised, attacker only has OctoLauncherRole (in Dev account), which can only assume specific deployment roles and cannot directly touch resources
- Blast radius - each deployment role is scoped to exactly what that environment needs; Dev is same-account, Staging/Prod require explicit cross-account trust
- Audit trail - CloudTrail shows the exact role assumption chain: `OctoLauncherRole` -> `StagingDeployRole` -> `s3:PutObject`; for Dev deployments the chain stays within account 111111111111
- Environment-driven targeting - Octopus variable scoping ensures the same deployment process resolves to the right role ARN per environment without any code changes
5.4 The kubectl/RBAC Gap: IAM Is Not Enough
This is a critical gap that catches teams by surprise: IAM permissions alone do not grant Kubernetes API access. Your deployment role can have `eks:DescribeCluster` and every EKS permission in the IAM catalog, but `kubectl apply` will still fail with `error: You must be logged in to the server (Unauthorized)` unless the role is also mapped to Kubernetes RBAC.
EKS has two mechanisms for this:
Option 1: EKS Access Entries (Recommended -- newer clusters)
EKS access entries are the AWS-native approach, managed via API without touching cluster internals:
# Grant the Dev deployment role access to the Dev cluster
aws eks create-access-entry \
--cluster-name dev-cluster \
--principal-arn arn:aws:iam::111111111111:role/DevDeployRole \
--type STANDARD
# Associate a Kubernetes RBAC policy
aws eks associate-access-policy \
--cluster-name dev-cluster \
--principal-arn arn:aws:iam::111111111111:role/DevDeployRole \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy \
--access-scope type=namespace,namespaces=app-dev
For Staging/Prod (cross-account), you create access entries in those clusters pointing to the respective deployment roles:
# In Staging cluster (account 222222222222)
aws eks create-access-entry \
--cluster-name staging-cluster \
--principal-arn arn:aws:iam::222222222222:role/StagingDeployRole \
--type STANDARD
aws eks associate-access-policy \
--cluster-name staging-cluster \
--principal-arn arn:aws:iam::222222222222:role/StagingDeployRole \
--policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy \
--access-scope type=namespace,namespaces=app-staging
Option 2: aws-auth ConfigMap (Legacy -- all clusters)
For older clusters or clusters not using access entries, you map IAM roles to Kubernetes groups via the aws-auth ConfigMap in the kube-system namespace:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Deployment role for this environment
    - rolearn: arn:aws:iam::111111111111:role/DevDeployRole
      username: octopus-deploy
      groups:
        - system:masters  # Full cluster admin -- tighten this in Staging/Prod
    # Node instance role (already present -- don't remove)
    - rolearn: arn:aws:iam::111111111111:role/eks-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
Warning: Editing `aws-auth` incorrectly can lock you out of the cluster. Always verify the existing content before modifying. Never remove the node instance role entries.
For Prod, use namespace-scoped RBAC instead of system:masters:
# Namespace-scoped Role for deployment in the prod namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: app-prod
  name: octopus-deployer
rules:
  - apiGroups: ["", "apps", "batch"]
    resources: ["deployments", "services", "configmaps", "secrets", "pods", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["networking.k8s.io"]
    resources: ["ingresses"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: app-prod
  name: octopus-deployer-binding
subjects:
  - kind: Group
    name: octopus-deployers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: octopus-deployer
  apiGroup: rbac.authorization.k8s.io
Then in aws-auth, map the Prod role to the octopus-deployers group instead of system:masters.
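If you manage aws-auth with eksctl rather than editing the ConfigMap by hand, the Prod mapping can be added non-destructively. A sketch, assuming a cluster named prod-cluster in us-east-1:

```shell
# Inspect current mappings first -- never blind-write aws-auth
eksctl get iamidentitymapping --cluster prod-cluster --region us-east-1

# Map ProdDeployRole to the octopus-deployers group (no system:masters)
eksctl create iamidentitymapping \
  --cluster prod-cluster \
  --region us-east-1 \
  --arn arn:aws:iam::333333333333:role/ProdDeployRole \
  --username octopus-deploy \
  --group octopus-deployers
```

eksctl appends the mapping without disturbing the node instance role entries, which is exactly the failure mode manual edits risk.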
The two-layer authorization model: IAM controls whether the role can reach the cluster (eks:DescribeCluster). Kubernetes RBAC controls what the role can do once authenticated. You need both. This is fundamentally different from Azure AKS, where Azure AD RBAC can grant both the Azure-level and Kubernetes-level permissions in one place.
6. The Big Picture: How It All Connects
These diagrams show the two deployment scenarios: deploying to the same account where Octopus runs (Dev), and deploying cross-account (Prod). In both cases, all steps in the deployment target the same environment -- the environment selection determines which account is targeted.
Deploying to Dev (Same Account)
When you deploy a release to Dev, Octopus, the launcher role, and the deployment role are all in the same AWS account. The AssumeRole call is same-account:
flowchart TB
subgraph org["AWS Organization"]
subgraph dev["Dev Account (111111111111) — Octopus runs here"]
sts["AWS STS"]
subgraph eks["EKS Cluster"]
subgraph pod["Octopus Pod"]
octopus["Octopus Server"]
calA["Calamari A\n(CloudFormation step)"]
calB["Calamari B\n(EKS deploy step)"]
end
end
irsa["IRSA or Pod Identity\n(OIDC Issuer)"]
devRole["DevDeployRole"]
devResources["Dev Resources\n(EKS, ECR, S3, CloudFormation)"]
irsa -->|"env vars:\nAWS_WEB_IDENTITY_TOKEN_FILE\nAWS_ROLE_ARN"| pod
pod -->|"AssumeRoleWithWebIdentity\n(launcher token)"| sts
sts -->|"Temp creds:\nOctoLauncherRole"| pod
octopus -->|"spawns"| calA
octopus -->|"spawns"| calB
calA -->|"sts:AssumeRole\n(DevDeployRole)"| sts
sts -->|"Temp creds"| calA
calA -.->|"operates on"| devResources
devResources --- devRole
calB -->|"sts:AssumeRole\n(DevDeployRole)"| sts
sts -->|"Temp creds"| calB
calB -.->|"operates on"| devResources
end
end
Both Calamari steps assume the same DevDeployRole because all steps in this deployment target Dev. Per-step role overrides are still possible if different steps need different permission scopes within Dev (e.g., one step needs CloudFormation + IAM, another only needs ECR read).
Promoting to Prod (Cross-Account)
When you promote the same release to Prod, Octopus variable scoping resolves the Prod role ARN instead. Calamari now makes cross-account AssumeRole calls from the Dev account into the Prod account:
flowchart TB
subgraph org["AWS Organization"]
subgraph dev["Dev Account (111111111111) — Octopus runs here"]
sts["AWS STS"]
subgraph eks["EKS Cluster"]
subgraph pod["Octopus Pod"]
octopus["Octopus Server"]
calA["Calamari A\n(CloudFormation step)"]
calB["Calamari B\n(EKS deploy step)"]
end
end
irsa["IRSA or Pod Identity"]
irsa -->|"launcher token"| pod
pod -->|"AssumeRoleWithWebIdentity"| sts
sts -->|"OctoLauncherRole creds"| pod
end
subgraph prod["Prod Account (333333333333)"]
prodRole["ProdDeployRole"]
prodResources["Prod Resources\n(EKS, ECR, S3)"]
end
octopus -->|"spawns"| calA
octopus -->|"spawns"| calB
calA -->|"sts:AssumeRole\n(ProdDeployRole)"| sts
sts -->|"Cross-account\ntemp creds"| calA
calA -.->|"operates on"| prodResources
prodResources --- prodRole
calB -->|"sts:AssumeRole\n(ProdDeployRole)"| sts
sts -->|"Cross-account\ntemp creds"| calB
calB -.->|"operates on"| prodResources
end
Important: Calamari processes run inside the Octopus pod in the Dev account -- they are subprocesses of the Octopus Server, not remote agents deployed in target accounts. They assume IAM roles in the target accounts via STS and then make API calls to those accounts, but the process itself executes locally in the Octopus pod. If you need actual execution inside a target account's VPC (e.g., for private API endpoints), you'd deploy external workers there -- but that's a different topology.
The key insight: the deployment process is identical for Dev and Prod. What changes is the environment selection, which causes Octopus to resolve different variable values -- including the target role ARN. Octopus Server spawns Calamari subprocesses, each of which independently calls STS using the pod's launcher credentials, then assumes the deployment role that the environment's variable scoping resolves to.
7. How Octopus Configuration Maps to This
In the Octopus UI under Infrastructure -> Accounts -> Add Account -> AWS Account, you configure a single account that uses ambient credentials from IRSA/Pod Identity:
AWS Account Configuration
Account name: AWS Deploy
Authentication method: Execute using the AWS service role for an EC2 instance (This tells Octopus: "Don't use stored keys; Calamari should pick up ambient credentials from IRSA/Pod Identity")
Note on the label: The "EC2 instance" wording is misleading -- this option does not require EC2. It means "use the AWS SDK's default credential chain to resolve ambient credentials," which works equally for EKS IRSA, EKS Pod Identity, ECS Task Roles, and EC2 Instance Roles. The label predates EKS and ECS Fargate support in Octopus. Functionally, selecting this just tells Calamari: "don't use stored access keys; discover credentials from the environment."
Access Key / Secret Key: (leave blank)
Assume Role (optional): (leave blank at account level)
Variable Scoping: How Environments Target Different Accounts
This is the key to understanding how Octopus handles multi-account deployments. You define a project variable for the deployment role ARN, with different values scoped to each environment:
| Variable Name | Value | Scoped To |
|---------------|-------|-----------|
| AWS.DeployRoleArn | arn:aws:iam::111111111111:role/DevDeployRole | Dev |
| AWS.DeployRoleArn | arn:aws:iam::222222222222:role/StagingDeployRole | Staging |
| AWS.DeployRoleArn | arn:aws:iam::333333333333:role/ProdDeployRole | Prod |
Then in your deployment process, each step uses:
AWS Account: Select AWS Deploy
Assume a different AWS Role: #{AWS.DeployRoleArn}
When you deploy Release 1.0 to Dev, Octopus resolves #{AWS.DeployRoleArn} to the Dev role ARN. When you promote the same release to Staging, Octopus resolves it to the Staging role ARN. The deployment process definition is identical across environments -- the environment selection is what changes the target account.
This tells Calamari:
1. Use ambient credentials from Pod Identity (OctoLauncherRole in Dev account)
2. Call sts:AssumeRole into whichever role ARN the environment resolved
3. For Dev: same-account AssumeRole (both launcher and target in 111111111111)
4. For Staging/Prod: cross-account AssumeRole (launcher in 111111111111, target in 222222222222 or 333333333333)
5. Use those scoped credentials for the step
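You can reproduce what Calamari does by hand from inside the Octopus pod, which is useful for debugging credential issues. A sketch, assuming IRSA/Pod Identity is wired up and jq is available in the container:

```shell
# 1. Ambient launcher credentials -- resolved automatically by the AWS CLI
aws sts get-caller-identity   # should show assumed-role/OctoLauncherRole/...

# 2. Assume the role that the environment's variable scoping would resolve to
DEPLOY_ROLE_ARN="arn:aws:iam::333333333333:role/ProdDeployRole"
CREDS=$(aws sts assume-role \
  --role-arn "$DEPLOY_ROLE_ARN" \
  --role-session-name octopus-debug \
  --query Credentials --output json)

# 3. Export the scoped credentials, as Calamari does for the step
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r .AccessKeyId)
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r .SecretAccessKey)
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r .SessionToken)

# 4. Verify you are now the deployment role in the target account
aws sts get-caller-identity   # should show assumed-role/ProdDeployRole/octopus-debug
```

If step 1 fails, the problem is the launcher identity (IRSA/Pod Identity). If step 2 fails, it's trust policy or launcher permissions.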
Alternative: Octopus OIDC (If You Don't Want IRSA/Pod Identity)
Instead of "Execute using service role", you can configure:
Authentication method: Use OpenID Connect
Role ARN: arn:aws:iam::111111111111:role/OctoDevRole
In this mode:
- Octopus Server acts as an OIDC issuer
- Octopus mints a JWT token scoped to the deployment
- Calamari calls sts:AssumeRoleWithWebIdentity using Octopus's JWT
- AWS STS validates the token by fetching https://your-octopus-server/.well-known/openid-configuration
Why you might not want this: Requires Octopus to have a publicly reachable OIDC discovery endpoint. If Octopus is fully private, STS can't validate the token.
When IRSA/Pod Identity is better: Your Octopus installation can be completely private. The EKS OIDC issuer (for IRSA) or Pod Identity service is AWS-managed and public, so STS can always validate.
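If you do go the Octopus OIDC route, you can sanity-check reachability the same way STS will, from any machine on the public internet (substitute your actual Octopus hostname):

```shell
# STS must be able to fetch the discovery document over public HTTPS,
# plus the jwks_uri that the document advertises
curl -s https://your-octopus-server/.well-known/openid-configuration
```

If this times out or requires VPN access, STS will fail to validate the JWT and AssumeRoleWithWebIdentity will be denied.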
8. The Complete Flow: Step Execution with Role Assumption
Let's trace a real deployment step end-to-end:
Scenario: Deploy a CloudFormation stack to Dev account (same account where Octopus runs)
Configuration:
- Octopus runs in EKS cluster in Dev account (111111111111)
- Octopus pod uses service account with Pod Identity -> OctoLauncherRole
- Step configured with AWS Account AWS Deploy, Role ARN resolved via variable scoping to arn:aws:iam::111111111111:role/DevDeployRole
Step-by-step execution:
User triggers deployment in Octopus UI
Octopus Server evaluates the step
- Identifies that it should run on built-in worker (in the Octopus pod)
- Spawns Calamari subprocess
Octopus Server passes to Calamari:
- CloudFormation template file
- Stack name, parameters
- AWS Account config: "use ambient service role"
- Per-step Role ARN:
arn:aws:iam::111111111111:role/DevDeployRole
Calamari resolves base credentials:
- Checks environment variables
- Finds AWS_ROLE_ARN=arn:aws:iam::111111111111:role/OctoLauncherRole (injected by Pod Identity)
- Finds AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/... (or Pod Identity agent endpoint)
- AWS SDK automatically calls STS to get credentials for OctoLauncherRole
- Calamari now has temp creds for the launcher role
Calamari performs role assumption:
- Using OctoLauncherRole credentials, calls:
aws sts assume-role \
--role-arn arn:aws:iam::111111111111:role/DevDeployRole \
--role-session-name octopus-deploy-12345
- STS checks: "Does DevDeployRole trust OctoLauncherRole?" -> Yes (trust policy)
- STS checks: "Can OctoLauncherRole assume DevDeployRole?" -> Yes (launcher has sts:AssumeRole permission for this ARN)
- STS returns temporary credentials for DevDeployRole (same-account)
- Calamari injects credentials:
export AWS_ACCESS_KEY_ID=ASIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export AWS_SESSION_TOKEN=AQoEXAMPLEH4aoAH0gNCAPyJxz4BlCFFxWNE1OPTgk5TthT+FvwqnKwRcOIfrRh3c/...
export AWS_DEFAULT_REGION=us-east-1
- CloudFormation step executes:
aws cloudformation deploy \
--template-file template.yaml \
--stack-name my-app-stack \
--capabilities CAPABILITY_IAM
- AWS CLI uses the injected credentials
- Operates as DevDeployRole in account 111111111111
- Can create CloudFormation stacks, EKS clusters, etc. (per DevDeployRole permissions)
- Step completes, credentials discarded
- Temporary credentials expire (default 1 hour, configurable up to 12 hours; role chaining limited to 1 hour)
- Next step goes through the same flow, potentially with different role
What Happens in CloudTrail (Audit)
When you look at CloudTrail logs:
In Dev Account (111111111111) -- where Octopus runs:
{
  "eventName": "AssumeRole",
  "requestParameters": {
    "roleArn": "arn:aws:iam::111111111111:role/DevDeployRole",
    "roleSessionName": "octopus-deploy-12345"
  },
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROACKCEVSQ6C2EXAMPLE:octopus-pod",
    "arn": "arn:aws:sts::111111111111:assumed-role/OctoLauncherRole/octopus-pod"
  }
}
In Dev Account (111111111111):
{
  "eventName": "CreateStack",
  "requestParameters": {
    "stackName": "my-app-stack",
    "templateURL": "https://..."
  },
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROACKCEVSQ6C2EXAMPLE:octopus-deploy-12345",
    "arn": "arn:aws:sts::111111111111:assumed-role/DevDeployRole/octopus-deploy-12345"
  },
  "sourceIPAddress": "10.0.5.23"
}
You can trace the entire chain: pod -> OctoLauncherRole -> DevDeployRole -> CloudFormation action.
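To pull these events without the console, CloudTrail's lookup API can filter by event name. Note that lookup-events only covers the last 90 days and only the region you query:

```shell
# Recent AssumeRole calls in this account/region
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRole \
  --max-results 10 \
  --query 'Events[].CloudTrailEvent'
```

Filtering on roleSessionName (e.g., octopus-deploy-12345) in the returned JSON lets you reconstruct the chain for a specific deployment.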
9. Multi-Account Strategy: How Many Roles?
When you have multiple microservices deploying to multiple environments, you need to decide: how many deployment roles?
Option 1: One Deployment Role Per Environment (Simplest)
Dev Account (111111111111) -- Octopus runs here
+-- OctoLauncherRole
+-- DevDeployRole (all 8 microservices use this)
Staging Account (222222222222)
+-- StagingDeployRole
Prod Account (333333333333)
+-- ProdDeployRole
Total: 4 roles (launcher + 3 deployment)
Pros:
- Simple to manage
- Fast iteration in Dev/Staging
- One Octopus AWS Account config per environment

Cons:
- Every microservice deployment has access to all resources in the account
- No per-service blast radius control
- Harder to audit "which service did what"
Option 2: One Role Per Microservice Per Environment (Maximum Isolation)
Dev Account (111111111111)
+-- UserServiceDevRole
+-- PaymentServiceDevRole
+-- NotificationServiceDevRole
+-- AuthServiceDevRole
+-- InventoryServiceDevRole
+-- OrderServiceDevRole
+-- ShippingServiceDevRole
+-- AnalyticsServiceDevRole
Staging Account (222222222222)
+-- (same 8 roles)
Prod Account (333333333333)
+-- (same 8 roles)
Total: 8 microservices x 3 environments = 24 deployment roles (plus 1 launcher in Dev = 25 total)
Pros:
- Perfect least privilege
- UserService can't touch PaymentService resources
- Compromised role only affects one service
- Clear audit trail per service

Cons:
- 25 roles to manage (permission drift risk)
- More Octopus configuration (8 AWS Accounts per environment, or 8 per-step role ARN overrides)
Option 3: Hybrid - Group by Blast Radius (Recommended)
Dev Account
+-- DevDeployRole (all services)
Staging Account
+-- StagingDeployRole (all services)
Prod Account
+-- ProdDataPlaneRole (low-risk: users, inventory, shipping, analytics, notifications)
+-- ProdControlPlaneRole (high-risk: payments, auth, orders)
Total: 5 roles
ProdControlPlaneRole has highly restricted permissions + requires manual approval in Octopus before use.
Pros:
- Balance between security and maintainability
- Production gets extra protection where it matters
- Dev/Staging stay simple for velocity

Cons:
- Still some shared blast radius in Prod data plane
Automation: AWS CloudFormation StackSets
To avoid manually creating roles in each account, use StackSets:
Template: deploy-role.yaml

Parameters:
  Environment:
    Type: String
    AllowedValues: [Dev, Staging, Prod]
  LauncherRoleArn:
    Type: String
    Description: ARN of OctoLauncherRole in account where Octopus runs
  AllowECRPush:
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'true'
    Description: Whether this role can push to ECR (false for Prod)
  AllowCFNDelete:
    Type: String
    AllowedValues: ['true', 'false']
    Default: 'true'
    Description: Whether this role can delete CloudFormation stacks (false for Prod)

Conditions:
  CanPushECR: !Equals [!Ref AllowECRPush, 'true']
  CanDeleteCFN: !Equals [!Ref AllowCFNDelete, 'true']

Resources:
  DeployRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub '${Environment}DeployRole'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              AWS: !Ref LauncherRoleArn
            Action: sts:AssumeRole

  DeployPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: !Sub '${Environment}DeployPolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: EKSAccess
            Effect: Allow
            Action:
              - eks:DescribeCluster
              - eks:ListClusters
            Resource: !Sub 'arn:aws:eks:${AWS::Region}:${AWS::AccountId}:cluster/*'
          - Sid: ECRAuth
            Effect: Allow
            Action:
              - ecr:GetAuthorizationToken
            Resource: '*'
          - Sid: ECRPull
            Effect: Allow
            Action:
              - ecr:BatchCheckLayerAvailability
              - ecr:BatchGetImage
              - ecr:GetDownloadUrlForLayer
              - ecr:DescribeRepositories
              - ecr:DescribeImages
              - ecr:ListImages
            Resource: !Sub 'arn:aws:ecr:${AWS::Region}:${AWS::AccountId}:repository/*'
          - Sid: CloudFormationReadWrite
            Effect: Allow
            Action:
              - cloudformation:CreateStack
              - cloudformation:UpdateStack
              - cloudformation:DescribeStacks
              - cloudformation:DescribeStackEvents
              - cloudformation:GetTemplate
              - cloudformation:ValidateTemplate
              - cloudformation:ListStacks
            Resource: !Sub 'arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Environment}-*/*'
          - Sid: S3ArtifactRead
            Effect: Allow
            Action:
              - s3:GetObject
              - s3:ListBucket
            Resource:
              - !Sub 'arn:aws:s3:::${Environment}-artifacts-*'
              - !Sub 'arn:aws:s3:::${Environment}-artifacts-*/*'

  # Conditional: ECR push (Dev/Staging only, not Prod)
  ECRPushPolicy:
    Type: AWS::IAM::Policy
    Condition: CanPushECR
    Properties:
      PolicyName: !Sub '${Environment}ECRPushPolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: ECRPush
            Effect: Allow
            Action:
              - ecr:PutImage
              - ecr:InitiateLayerUpload
              - ecr:UploadLayerPart
              - ecr:CompleteLayerUpload
            Resource: !Sub 'arn:aws:ecr:${AWS::Region}:${AWS::AccountId}:repository/*'

  # Conditional: S3 artifact write (Dev/Staging only)
  S3WritePolicy:
    Type: AWS::IAM::Policy
    Condition: CanPushECR
    Properties:
      PolicyName: !Sub '${Environment}S3WritePolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: S3ArtifactWrite
            Effect: Allow
            Action:
              - s3:PutObject
            Resource: !Sub 'arn:aws:s3:::${Environment}-artifacts-*/*'

  # Conditional: CloudFormation delete (Dev/Staging only)
  CFNDeletePolicy:
    Type: AWS::IAM::Policy
    Condition: CanDeleteCFN
    Properties:
      PolicyName: !Sub '${Environment}CFNDeletePolicy'
      Roles: [!Ref DeployRole]
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Sid: CFNDelete
            Effect: Allow
            Action:
              - cloudformation:DeleteStack
            Resource: !Sub 'arn:aws:cloudformation:${AWS::Region}:${AWS::AccountId}:stack/${Environment}-*/*'

Outputs:
  DeployRoleArn:
    Value: !GetAtt DeployRole.Arn
    Description: ARN for Octopus variable scoping
Deploy to all accounts with environment-appropriate permissions:
# Create the StackSet (Environment must be supplied here since it has no
# default; each stack instance overrides it below)
aws cloudformation create-stack-set \
  --stack-set-name deployment-roles \
  --template-body file://deploy-role.yaml \
  --parameters \
    ParameterKey=LauncherRoleArn,ParameterValue=arn:aws:iam::111111111111:role/OctoLauncherRole \
    ParameterKey=Environment,ParameterValue=Dev \
  --capabilities CAPABILITY_NAMED_IAM
# Dev (same account as Octopus -- full permissions)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 111111111111 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Dev \
ParameterKey=AllowECRPush,ParameterValue=true \
ParameterKey=AllowCFNDelete,ParameterValue=true
# Staging (cross-account -- full permissions)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 222222222222 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Staging \
ParameterKey=AllowECRPush,ParameterValue=true \
ParameterKey=AllowCFNDelete,ParameterValue=true
# Prod (cross-account -- no ECR push, no CFN delete)
aws cloudformation create-stack-instances \
--stack-set-name deployment-roles \
--accounts 333333333333 \
--regions us-east-1 \
--parameter-overrides \
ParameterKey=Environment,ParameterValue=Prod \
ParameterKey=AllowECRPush,ParameterValue=false \
ParameterKey=AllowCFNDelete,ParameterValue=false
The template uses CloudFormation conditions to vary permissions by environment. Prod gets the same base deploy permissions (it must be able to apply CloudFormation and reach EKS) but cannot push images, delete stacks, or write artifacts. Updates to the template propagate to all accounts automatically.
10. What Changes When Octopus Runs Elsewhere?
The pattern is the same whether Octopus runs in EKS, ECS Fargate, EC2, or even on-premises. Only the Layer 1 (launcher identity mechanism) changes.
Octopus in ECS Fargate
No IRSA/Pod Identity (those are EKS features)
Instead: ECS Task Role
Task Role vs Execution Role -- They Are Different Things
ECS task definitions have two role fields, and confusing them is a common source of "why can't my container call AWS APIs" issues:
Task Role (taskRoleArn) -- This is the IAM role that your application code (Octopus/Calamari) uses at runtime to call AWS APIs. This is your launcher role. Credentials are injected into the container via the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable, which points to the ECS agent's local metadata endpoint at http://169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI.

Execution Role (executionRoleArn) -- This is the IAM role that the ECS agent uses to pull your container image from ECR, send logs to CloudWatch, and retrieve secrets from Secrets Manager or SSM Parameter Store. Your application code never sees or uses this role. It needs ecr:GetAuthorizationToken, ecr:BatchGetImage, logs:CreateLogStream, logs:PutLogEvents, and optionally secretsmanager:GetSecretValue or ssm:GetParameters.
The key distinction: The execution role is for ECS infrastructure operations (pull image, push logs). The task role is for your application's AWS API calls. For the Octopus launcher pattern, only the task role matters -- it becomes the launcher identity that Calamari uses to assume deployment roles.
How ECS Credential Injection Works
- When your ECS task starts, the ECS agent sets AWS_CONTAINER_CREDENTIALS_RELATIVE_URI in the container environment (e.g., /v2/credentials/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
- The AWS SDK's credential chain detects this env var and makes an HTTP GET to http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}
- The ECS agent responds with temporary credentials (access key, secret key, session token) for the task role
- These credentials auto-refresh -- the SDK handles rotation transparently
Note: The ECS metadata endpoint is at 169.254.170.2 (link-local), which is different from the EC2 IMDS at 169.254.169.254. If you have code that hardcodes the EC2 metadata IP, it won't work on Fargate.
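The steps above can be verified by hand from inside a running Fargate container, which is a quick way to rule out task-role misconfiguration:

```shell
# Confirm the env var is set by the ECS agent
echo "$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"

# Fetch the task-role credentials the SDK would use; the agent returns JSON
# with RoleArn, AccessKeyId, SecretAccessKey, Token, and Expiration
curl -s "http://169.254.170.2${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI}"
```

If the env var is empty, the task definition is missing taskRoleArn; if the curl hangs, something is blocking the link-local endpoint.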
Setup
- Create IAM role with trust policy for ecs-tasks.amazonaws.com
- Assign as the task role in the ECS task definition
- ECS injects credentials via the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI env var
- Calamari picks this up via the AWS SDK credential chain
- Rest is identical: Calamari uses task role -> assumes deployment roles per step
Trust policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Service": "ecs-tasks.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
ECS task definition:
{
  "family": "octopus-server",
  "taskRoleArn": "arn:aws:iam::123456789012:role/OctoLauncherRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/OctopusECSExecutionRole",
  "containerDefinitions": [...]
}
ECS-Specific Gotchas
- No 169.254.169.254 on Fargate: The EC2 instance metadata service (IMDS) is not available. If any library or script tries to hit the EC2 metadata endpoint, it will time out. Only the ECS credential endpoint at 169.254.170.2 is available.
- VPC configuration matters: Fargate tasks need network access to STS (for AssumeRole calls). Either place tasks in a subnet with a NAT gateway, or create a VPC endpoint for com.amazonaws.<region>.sts.
- Secrets in environment variables: Use the execution role + Secrets Manager/SSM integration to inject secrets into the container at launch, rather than baking them into the task definition. The execution role handles the secret retrieval before your container starts.
Octopus on EC2
Use EC2 Instance Role
- Create IAM role with trust policy for ec2.amazonaws.com
- Attach to the EC2 instance via an instance profile
- Calamari picks up credentials via IMDS (Instance Metadata Service) at http://169.254.169.254/latest/meta-data/iam/security-credentials/OctoLauncherRole
- Rest is identical
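When poking IMDS manually on EC2, prefer the IMDSv2 session-token flow -- many hardened instances disable IMDSv1 entirely:

```shell
# Get an IMDSv2 session token (TTL up to 6 hours)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Fetch the instance role's temporary credentials using that token
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/OctoLauncherRole"
```

The AWS SDK does this token dance automatically; the manual version is only for debugging.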
Octopus On-Premises (No Ambient Credentials)
Option 1: Use Octopus OIDC
- Octopus acts as OIDC issuer
- Must expose /.well-known/openid-configuration publicly
- Calamari uses Octopus-minted JWT -> AssumeRoleWithWebIdentity
Option 2: Use External Workers in AWS
- Octopus Server on-prem orchestrates
- Steps run on workers in AWS (EC2/ECS with instance/task roles)
- Workers have the launcher role; the same pattern applies
11. The Conceptual Shift from Azure
This is the hardest part to internalize if you're coming from Azure:
Azure Mindset
"This pipeline runs as this service principal / managed identity; that principal has these permissions on these resources."
The identity is relatively static for the entire pipeline run. You might use multiple service connections, but each stage/job has one identity.
AWS + Octopus Mindset
"This step, at runtime, will assume this role in that account, do its work, and then the credentials expire."
The identity is dynamic per step. The Octopus pod/task has one minimal identity (launcher) that can't do anything itself -- it can only become other identities via AssumeRole.
Why This Is Powerful
Distributed runtime orchestration:
- Octopus Server is pure orchestration -- no AWS permissions needed on the server itself
- Calamari handles credential resolution and STS calls per step
- Each step gets exactly the permissions it needs, no more
- Cross-account is native -- no special configuration needed
- Audit trail shows the exact role -> role -> action chain

Fine-grained control:
- Dev steps: broad permissions for fast iteration
- Staging steps: similar to Dev, maybe with extra validations
- Prod steps: read-only + manual approval gates
- All from one Octopus installation with one pod identity

Defense in depth:
- Pod compromise = attacker only has the launcher role (can't touch resources)
- Deployment role compromise = blast radius limited to that account/scope
- Each AWS account owner controls their deployment role permissions
- Centralized orchestration (Octopus) + distributed authorization (IAM per account)
12. Common Gotchas and How to Avoid Them
Gotcha 1: "My step says 'Access Denied' but the role has the permission"
Likely cause: You're looking at OctoLauncherRole permissions instead of the deployment role
Fix: Check CloudTrail in the target account to see which role the step actually assumed. Verify that role has the permission.
Gotcha 2: "AssumeRole fails with 'not authorized to perform sts:AssumeRole'"
Likely causes:
1. OctoLauncherRole doesn't have sts:AssumeRole permission for that target role ARN
2. Target role's trust policy doesn't allow OctoLauncherRole to assume it
3. Typo in role ARN
Fix: Check both the launcher role's permissions and the target role's trust policy.
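You can check the launcher side without actually assuming anything, using IAM's policy simulator. This validates the launcher's identity policy; the target role's trust policy still has to be inspected separately with credentials in the target account:

```shell
# Would OctoLauncherRole's identity policies allow this AssumeRole call?
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::111111111111:role/OctoLauncherRole \
  --action-names sts:AssumeRole \
  --resource-arns arn:aws:iam::333333333333:role/ProdDeployRole \
  --query 'EvaluationResults[].EvalDecision'

# And inspect the target role's trust policy (run in the Prod account)
aws iam get-role --role-name ProdDeployRole \
  --query 'Role.AssumeRolePolicyDocument'
```

An "allowed" decision plus a trust policy naming the launcher ARN means the failure is elsewhere (typo, wrong account, SCP).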
Gotcha 3: "Step works in Dev but fails in Prod with same code"
Likely cause: ProdDeployRole has more restrictive permissions than DevDeployRole
Fix: This is by design. Check Prod role permissions and adjust or use manual approval + elevated role for Prod changes.
Gotcha 4: "IRSA/Pod Identity not working - Calamari can't find credentials"
Check:
1. Is the service account annotated correctly? (kubectl describe sa octopus-server -n octopus)
2. Are env vars injected in the pod? (kubectl exec -it <pod> -- env | grep AWS)
3. Is the pod using the right service account? (kubectl get pod <pod> -o yaml | grep serviceAccountName)
4. Does the IAM role trust policy match the exact service account and OIDC issuer?
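For check 4, compare the role's trust policy against the cluster's actual OIDC issuer -- a single character of drift breaks IRSA. The cluster name here is illustrative; the role name is the launcher role used throughout this guide:

```shell
# The cluster's OIDC issuer URL
aws eks describe-cluster --name dev-cluster \
  --query 'cluster.identity.oidc.issuer' --output text

# The trust policy's Condition must reference the same issuer host/path,
# with the :sub key equal to system:serviceaccount:<namespace>:<serviceaccount>
aws iam get-role --role-name OctoLauncherRole \
  --query 'Role.AssumeRolePolicyDocument'
```

Mismatched namespace, service account name, or issuer path all produce the same symptom: AssumeRoleWithWebIdentity is denied and no credentials appear.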
Gotcha 5: "Cross-account AssumeRole works from AWS CLI but fails in Octopus"
Likely cause: External ID mismatch or session duration too long
Fix:
- Don't use external IDs for Octopus role assumptions (not needed for service-to-service)
- Check whether the target role has a max session duration configured, and ensure Octopus isn't requesting longer
Gotcha 6: "My deployment worked yesterday but fails today"
Likely cause: Temporary credentials expired and Calamari is reusing cached creds
Fix: This shouldn't happen -- Calamari calls STS per step. But check if you have any caching in custom scripts or environment variable exports that persist across steps.
13. Best Practices Summary
Security
- Launcher role has zero resource permissions - only sts:AssumeRole
- Deployment roles use least privilege - exactly what each environment needs
- Prod roles are read-only by default - write access requires manual approval or a separate role
- Use Pod Identity over IRSA - simpler, more reliable in private clusters
- Enable CloudTrail in all accounts - track the full role assumption chain
- Create an STS VPC interface endpoint - deploy a com.amazonaws.<region>.sts VPC endpoint so that all AssumeRole and AssumeRoleWithWebIdentity calls stay within the AWS private network and never traverse the public internet. This is defense-in-depth for sensitive environments and eliminates the need for a NAT gateway for STS traffic:
aws ec2 create-vpc-endpoint \
--vpc-id vpc-xxxxxxxxx \
--service-name com.amazonaws.us-east-1.sts \
--vpc-endpoint-type Interface \
--subnet-ids subnet-aaa subnet-bbb \
--security-group-ids sg-0123456789abcdef0 \
--private-dns-enabled
With --private-dns-enabled, the default sts.us-east-1.amazonaws.com hostname resolves to the private endpoint IP within your VPC. No SDK or application changes needed -- all STS calls automatically route privately.
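To confirm STS traffic is actually staying private, resolve the STS hostname from a pod or instance inside the VPC -- it should return the endpoint's private IPs (10.x/172.16-31.x/192.168.x), not public addresses:

```shell
# From inside the VPC; private DNS should win
nslookup sts.us-east-1.amazonaws.com

# Calls still use the normal hostname -- no SDK changes required
aws sts get-caller-identity --region us-east-1
```

If public IPs come back, check that --private-dns-enabled was set and that the VPC has enableDnsSupport and enableDnsHostnames turned on.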
- Use AWS Organizations Service Control Policies (SCPs) to enforce that deployment roles can only be assumed by your specific launcher role ARN. SCPs act at the Organizations level and override even account-admin IAM policies, providing an organizational security boundary that complements the per-role trust policies:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyAssumeDeployRolesExceptLauncher",
    "Effect": "Deny",
    "Action": "sts:AssumeRole",
    "Resource": [
      "arn:aws:iam::*:role/*DeployRole"
    ],
    "Condition": {
      "StringNotEquals": {
        "aws:PrincipalArn": "arn:aws:iam::111111111111:role/OctoLauncherRole"
      }
    }
  }]
}
This ensures that even if an account admin creates a permissive IAM policy, they cannot assume the deployment roles unless they are the designated launcher. Combine with trust policies for defense-in-depth.
SCP caveats: (1) SCPs do not apply to the management account in AWS Organizations -- if your launcher or deployment roles exist in the management account, this SCP has no effect there. Always place workloads in member accounts. (2) In a role chain (e.g., OctoLauncherRole assumes DevDeployRole), aws:PrincipalArn reflects the calling role at each hop, not the original initiator. If your deployment steps involve further role chaining beyond the two-layer pattern, the SCP condition behavior can be surprising -- test the exact evaluation in your role chain before relying on this SCP as a sole control.
Operational
- Use StackSets to deploy roles - consistency across accounts, easy updates
- One Octopus AWS Account per environment - Dev, Staging, Prod configs
- Document role ARN in deployment process - make it clear which role each step uses
- Use descriptive role session names - octopus-deploy-{deployment-id} helps in CloudTrail
- Set reasonable session durations - the default 1 hour is sensible for most steps; the max is 12 hours, but role chaining caps at 1 hour
Organizational
- Each AWS account owner controls their deployment role - central Octopus, distributed authorization
- Group microservices by blast radius - not every service needs its own role
- Start simple, add granularity as needed - one role per environment, split later if needed
- Use AWS Organizations - centralized billing, easier StackSet deployment
14. Quick Reference: Decision Trees
"Which launcher mechanism should I use?"
flowchart TD
A{"Where does\nOctopus run?"} -->|EKS| B{"Private cluster?"}
B -->|Yes| C{"Pod Identity\navailable?"}
C -->|Yes| D["EKS Pod Identity\n(recommended)"]
C -->|No| E{"Can add Route 53\nresolver for OIDC?"}
E -->|Yes| F["IRSA + Route 53\nresolver workaround"]
E -->|No| G["Octopus K8s Agent\n(poll mode, no OIDC needed)"]
B -->|No| H{"Existing OIDC\nprovider setup?"}
H -->|Yes| I["IRSA\n(works fine)"]
H -->|No| D
A -->|ECS Fargate| J["ECS Task Role"]
A -->|EC2| K["EC2 Instance Role\n(via Instance Profile)"]
A -->|On-Premises| L{"Can expose public\nOIDC endpoint?"}
L -->|Yes| M["Octopus OIDC"]
L -->|No| N["External Workers in AWS\nor K8s Agent (poll mode)"]
"How many deployment roles do I need?"
flowchart TD
A{"How many\nenvironments?"} --> B["Dev + Staging + Prod\n= 3 base roles"]
B --> C{"How many\nmicroservices?"}
C -->|"< 5"| D["Shared role per env\n(3 total + 1 launcher = 4)"]
C -->|"5-10"| E["Shared in Dev/Staging\nSplit Prod by risk\n(4-5 total)"]
C -->|"> 10"| F["Role per service or\ngroup by domain\n(10-20 total)"]
E --> G{"High-sensitivity\nworkloads?"}
G -->|"Payments, PII, Auth"| H["Dedicated role +\nmanual approval"]
G -->|"Analytics, Notifications"| I["Can share role"]
"My step is failing -- where do I look?"
flowchart TD
A["Step Failed"] --> B["Check Octopus\ndeployment log"]
B --> C{"Shows which role\nwas assumed?"}
C -->|Yes| D["Check CloudTrail\nin target account"]
C -->|No| E["Credential resolution\nfailed - check IRSA/\nPod Identity setup"]
D --> F{"AssumeRole\nsuccessful?"}
F -->|No| G["Check trust policy +\nlauncher permissions"]
F -->|Yes| H{"API call\nattempted?"}
H -->|Yes| I{"AccessDenied?"}
I -->|Yes| J["Check deployment\nrole permissions"]
I -->|No| K["Different error -\ncheck API params"]
H -->|No| L["Credential injection\nfailed - check Calamari logs"]
Conclusion
The AWS + Octopus + EKS pattern for multi-account deployments is more complex than Azure's managed identity model at first glance. But once you internalize the two-layer pattern -- launcher role + per-step deployment roles -- it becomes extremely powerful:
- Octopus orchestrates, but never holds deployment permissions itself
- Calamari resolves credentials dynamically per step via STS
- IRSA/Pod Identity provides the bootstrap launcher identity
- IAM roles per account encode exactly what each environment/service can do
- Cross-account is native -- no special setup, just trust policies
- Private clusters have options -- Pod Identity, Route 53 resolver workarounds, or the Octopus Kubernetes Agent in poll mode
- STS VPC endpoints and SCPs add defense-in-depth at the network and organization level
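To make the "just trust policies" point concrete, a cross-account deployment role's trust policy might look like the following sketch. The Dev account ID (111111111111) follows the article's example topology, but the launcher role name and the `ExternalId` value are assumptions -- substitute your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOctopusLauncherFromDevAccount",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:role/octopus-launcher"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "octopus-deploy" }
      }
    }
  ]
}
```

Attaching this to a deployment role in Staging or Prod is the entire cross-account setup: the target account decides who may assume the role, and the launcher needs only `sts:AssumeRole` permission on that ARN.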
The mental shift from "this pipeline runs as this identity" to "this step will assume this role at runtime" unlocks:
- Fine-grained, per-step authorization
- Defense in depth (pod compromise != resource access)
- Distributed ownership (each account controls its deployment role)
- Centralized orchestration with decentralized permissions
Whether you use IRSA, Pod Identity, ECS Task Roles, EC2 instance roles, or the Octopus Kubernetes Agent as your launcher mechanism, the pattern remains the same. Master this model and multi-account, multi-environment AWS deployments become manageable, secure, and auditable.
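That mental shift is easy to picture as a lookup plus an STS call. The sketch below fakes the STS call with a plain function so it stays runnable without AWS access; the account IDs and role names are placeholders (only the Dev account ID matches the article's example topology):

```python
# Minimal sketch of the two-layer pattern: the ambient launcher identity
# (from IRSA / Pod Identity) is used only to assume an environment-scoped
# deployment role at step run time. The real call would be boto3's
# sts.assume_role; it is faked here so the sketch runs anywhere.

# Mirrors an environment-scoped Octopus variable like #{AWS.DeployRoleArn}.
# Staging/Prod account IDs are placeholders.
DEPLOY_ROLE_ARN = {
    "Dev":     "arn:aws:iam::111111111111:role/octopus-deploy-dev",
    "Staging": "arn:aws:iam::222222222222:role/octopus-deploy-staging",
    "Prod":    "arn:aws:iam::333333333333:role/octopus-deploy-prod",
}

def fake_assume_role(role_arn: str, session_name: str) -> dict:
    """Stand-in for sts.assume_role -- returns fake short-lived credentials."""
    return {"role_arn": role_arn, "session": session_name, "token": "<short-lived>"}

def run_step(step_name: str, environment: str) -> dict:
    # All steps in one deployment target the same environment; the role
    # only changes when the release is promoted to another environment.
    return fake_assume_role(DEPLOY_ROLE_ARN[environment], f"octopus-{step_name}")

print(run_step("deploy-cfn", "Prod")["role_arn"])
# -> arn:aws:iam::333333333333:role/octopus-deploy-prod
```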
Corrections
On April 1, 2026 the following corrections were applied based on technical review:
- Section 6 diagram: Calamari processes were incorrectly shown inside the Dev and Prod account VPCs. Corrected to show them inside the Octopus pod in the Dev account (where Octopus runs), since Calamari runs as a subprocess of Octopus Server and makes remote API calls to target accounts via STS-assumed credentials.
- Section 7 "EC2 instance" label: Added clarification that the "Execute using the AWS service role for an EC2 instance" option in Octopus UI is misleadingly named -- it actually means "use the SDK default credential chain" and works for IRSA, Pod Identity, and ECS Task Roles, not just EC2.
- Section 4.2 Pod Identity mechanics: Tightened the credential delivery description. The Pod Identity Agent exposes a local endpoint at `169.254.170.23:80`, credentials are discovered via the `AWS_CONTAINER_CREDENTIALS_FULL_URI` env var, and the SDK handles everything transparently -- pods don't explicitly query the agent.
- Section 13 SCP caveats: Added footnote that SCPs don't apply to the management account and that `aws:PrincipalArn` reflects the calling role at each hop in a role chain, which can produce surprising behavior beyond the two-layer pattern.
- Section 10 ECS expansion: Added Task Role vs Execution Role distinction, explained `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` and the `169.254.170.2` metadata endpoint, clarified differences from EC2 IMDS (`169.254.169.254`), and added ECS-specific gotchas.
On March 31, 2026 the following corrections were applied:
- STS session duration: Originally stated "15 minutes to 1 hour". Corrected to default 1 hour, configurable up to 12 hours. Role chaining is capped at 1 hour regardless of the role's max session duration setting. (AWS STS AssumeRole API Reference)
- Octopus UI navigation path: Originally stated `Infrastructure -> Accounts`. Corrected to `Deploy -> Manage -> Accounts` per current Octopus Deploy documentation. (Octopus AWS Accounts docs)
- Kubernetes Agent Helm chart: Originally used `octopusdeploy/kubernetes-agent` as a traditional Helm repo reference. Corrected to `oci://registry-1.docker.io/octopusdeploy/kubernetes-agent`, which is the OCI registry path used in the official installation wizard. (Octopus Kubernetes Agent docs)
On April 2, 2026 the following corrections were applied based on technical review:
- Account topology: Article originally assumed Octopus runs in a separate "Tooling Account" (123456789012). Corrected throughout to reflect that Octopus runs in the Dev account (111111111111). When deploying to Dev, the launcher and deployment role are in the same account (same-account AssumeRole). When promoting to Staging/Prod, those are separate AWS accounts requiring cross-account AssumeRole.
- Section 5 intro (deployment process model): Original bullet list implied steps A/B/C/D each targeted different environments (Dev/Staging/Prod) sequentially within one deployment. This is incorrect. In Octopus Deploy, all steps in a single deployment execute against the same environment. Per-step role ARNs are for different permission scopes within the same environment (e.g., CloudFormation access vs ECR access), not for targeting different accounts sequentially. Environment promotion is what changes the target account.
- Section 6 diagrams: Replaced single diagram (showing Calamari A hitting Dev and Calamari B hitting Prod simultaneously) with two diagrams: one showing same-account deployment to Dev, one showing cross-account promotion to Prod. Both show all steps targeting the same environment.
- Section 7 (Octopus configuration): Rewrote to explain Octopus variable scoping as the mechanism for multi-account targeting. Instead of separate AWS Account configs per environment, a single AWS Account with an environment-scoped variable (`#{AWS.DeployRoleArn}`) resolves to the correct role ARN based on which environment the release is deployed to.
- CloudFormation StackSets: Updated launcher role ARN references from Tooling account to Dev account.
- Section 5.2 permission policies (all three environments): Replaced wildcard permissions (`eks:*`, `ecr:*`, `cloudformation:*`, `s3:*`) with specific least-privilege actions. Dev/Staging/Prod now show exact IAM actions required for each service (e.g., `ecr:PutImage` + `ecr:InitiateLayerUpload` instead of `ecr:*`), with resources scoped to the specific account.
- Prod read-only contradiction: Previous version gave Prod only read permissions, which contradicts automated deployment. Fixed: Prod now has write permissions for CloudFormation and EKS access (necessary for deployment) but tightened controls: no ECR push (Prod pulls images built in Dev/Staging), no CloudFormation DeleteStack, S3 read-only. Safety comes from Octopus manual approval gates and namespace-scoped Kubernetes RBAC, not IAM read-only.
- StackSet PowerUserAccess: Replaced `arn:aws:iam::aws:policy/PowerUserAccess` with parameterized inline policies using CloudFormation conditions (`AllowECRPush`, `AllowCFNDelete`). Prod gets restricted permissions via parameter overrides during stack instance creation.
- New section 5.4 (kubectl/RBAC gap): Added explanation that IAM permissions alone do not grant Kubernetes API access. Deployment roles must also be mapped via EKS access entries (recommended) or the `aws-auth` ConfigMap. Includes examples for both mechanisms, namespace-scoped RBAC for Prod, and the two-layer authorization model (IAM → cluster, RBAC → namespaces).