2025-12-01, 12-minute read for Software Engineers
I carried out this brief analysis of using Actions Runner Controller compared to self-hosted runners in July 2024; this year, my current company has adopted it. This blog post gives an overview of using k8s instead of dedicated EC2 runners.
This post does not go through the setup process, the documentation is easy to follow.
Actions Runner Controller (ARC) is a Kubernetes operator that orchestrates and scales self-hosted runners for GitHub Actions.
With ARC, you can create runner scale sets that automatically scale based on the number of workflows running in your repository, organization, or enterprise. Because controlled runners can be ephemeral and based on containers, new runner instances can scale up or down rapidly and cleanly.
Our AWS EC2 GitHub runners were not efficiently utilised and had no automatic scaling.
Most jobs fall under 10 minutes, yet the runners keep running regardless.
Most of the workflows run between 9 am and 7 pm (Australian main time zones).
Controller: A single controller for the whole deployment; it watches for changes, auto-scales, and manages the lifecycle of runners.
Listener: One per RunnerSet. The listener pod connects to the GitHub Actions Service to authenticate and establishes an HTTPS long-poll connection that handles communication with GitHub. The listener stays idle until it receives a Job Available message from the GitHub Actions Service.
Runner: Represents a single Action runner.
RunnerSet: Runner scale sets are a group of homogeneous runners (that share the same configuration) that can be assigned jobs from GitHub Actions.
If you do not set resources for the runner pods (resource.request and resource.limit), workflow runs might be less predictable, potentially taking more or less time depending on the capacity of the cluster.
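As a rough illustration, resource requests and limits for the runner pods are set in the runner scale set's Helm values. The sketch below assumes the official gha-runner-scale-set chart; the org URL, secret name, and sizes are placeholders, not our actual configuration.

```yaml
# values.yaml for a runner scale set (illustrative placeholders only)
githubConfigUrl: "https://github.com/my-org"   # org (or repo/enterprise) the runners serve
githubConfigSecret: arc-github-auth            # pre-created secret holding the GitHub credentials
minRunners: 0
maxRunners: 10

template:
  spec:
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
        resources:
          requests:
            cpu: "2"
            memory: 4Gi
          limits:
            cpu: "2"
            memory: 4Gi
```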
More about ARC can be found on docs.github.com
ARC consists of several custom resource definitions (CRDs). Once deployed, you can list these custom resources with:
➜ ~ kubectl api-resources --api-group=actions.github.com
NAME SHORTNAMES APIVERSION NAMESPACED KIND
autoscalinglisteners actions.github.com/v1alpha1 true AutoscalingListener
autoscalingrunnersets actions.github.com/v1alpha1 true AutoscalingRunnerSet
ephemeralrunners actions.github.com/v1alpha1 true EphemeralRunner
ephemeralrunnersets actions.github.com/v1alpha1 true EphemeralRunnerSet
The runner maintains an outbound HTTPS connection (long polling) to GitHub. This means that the runner is constantly checking (polling) GitHub for new jobs.
Once the runner starts executing a job, it continuously sends updates back to GitHub over the same long-polling connection. These updates include the status of the job (e.g., in progress, completed, failed), any output from the job (which is displayed in the GitHub Actions logs), and any artifacts produced by the job.
ARC officially supports three modes. You are free to customise the runner image however you wish (such as having rootless DinD) as long as it meets the minimum requirements.
Default: nothing special; a single runner container in a single pod. It does not come with Docker.
Docker in Docker (DinD): Run DinD container alongside the runner container in the same pod.
The runner is responsible for executing all commands passed through the actions, while the Docker container runs the Docker daemon. The runner container mounts the Docker daemon socket, so even though the docker run command is executed from the runner container, that request is ultimately passed through to the Docker daemon running in the Docker container.
Resource requests and limits need to be carefully adjusted for both containers. You'll need to consider where the workloads are ultimately going to be run. If the majority of the workload is within the container, then you'd want to allocate the majority of resources to the DinD container.
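For reference, enabling DinD is a small change in the scale set's Helm values (again assuming the official gha-runner-scale-set chart); tuning the resource split between the runner and DinD containers means overriding the pod template, which I omit here:

```yaml
# values.yaml (DinD mode): the chart adds the Docker daemon container and
# wires its socket into the runner container.
containerMode:
  type: "dind"
```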
If security is your highest priority, Kubernetes mode allows you to achieve the same as DinD. In Kubernetes mode, a second pod is spun up in the same namespace, sharing the same network space and persistent volume as the runner, without requiring any privileges.
To use Kubernetes mode, you must meet a few additional requirements (covered in the ARC documentation), most notably making persistent volumes available for the runner pods to claim. Unfortunately, due to resource limitations, I was unable to test Kubernetes mode.
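For reference only, the Helm values for Kubernetes mode look roughly like the sketch below; field names assume the official gha-runner-scale-set chart, and the storage class is a placeholder:

```yaml
# values.yaml (Kubernetes mode): job containers run as separate pods, so the
# runner needs a persistent volume to share its workspace with them.
containerMode:
  type: "kubernetes"
  kubernetesModeWorkVolumeClaim:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "gp3"        # any dynamic provisioner available in the cluster
    resources:
      requests:
        storage: 2Gi
```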
I created some test cases that should be representative of most of the workflows we run on a daily basis (stress testing ARC was out of scope):
- Workflows using the services keyword (requires DinD)
- Workflows using the container keyword

ARC was able to handle all of these with no issues.
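Both keywords boil down to workflows shaped roughly like the sketch below; the scale set name and images are illustrative, not our actual pipelines:

```yaml
# Illustrative test workflow exercising the 'container' and 'services'
# keywords on an ARC runner scale set (requires DinD or Kubernetes mode).
name: container-and-services-test
on: workflow_dispatch

jobs:
  integration-test:
    runs-on: arc-runner-set        # name of the runner scale set (placeholder)
    container: postgres:16          # the job itself runs inside this container
    services:
      db:
        image: postgres:16          # side-car service container
        env:
          POSTGRES_PASSWORD: postgres
    steps:
      - name: Wait for the service container to accept connections
        run: |
          until pg_isready -h db; do sleep 2; done
```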
In this section, I estimate and compare the cost of the current EC2 runners and ARC.
The analysis here makes the following assumptions:
| Instance Name | vCPUs | Memory |
|---|---|---|
| c5.2xlarge | 8 | 16 GiB |
The cost of the current EC2 runners is calculated as follows:
the hourly cost of the EC2 instance type (c5.2xlarge) multiplied by the combined total number of runner-hours.
Our EC2 runners are scaled up and down on schedule regardless of daily demand. That is, the same number of runners are active every week.
The table below shows the cost breakdown for our Build runners of type c5.2xlarge running RHEL in the Asia-Pacific (Sydney) region:
| Cost per Hour (RHEL) | Total Hours per Day | Daily Cost | Annual Cost (260 days) |
|---|---|---|---|
| $0.574 | 10 nodes × 19 hrs + 5 nodes × 4 hrs = 210 hrs | $120.54 | $31,340.40 |
There are two methods that can be used to estimate the equivalent ARC cost:
The tables below show the estimated cost breakdown for equivalent Build runners of type c5.2xlarge running the Amazon EKS-optimised Amazon Linux AMI in the Asia-Pacific (Sydney) region:
With EKS, you only pay for what you use; there are no minimum fees and no upfront commitments.
Because we only pay for what we use, the EKS cost is dynamic and depends on how many nodes are running at any moment in time.
Here is the cost breakdown:
The graph below shows the CPU usage of 10 Build Runners on Thursday, July 4, 2024:
The graph below shows the same CPU usage but with stacked average area of 10 Build Runners on Thursday, July 4, 2024. Note the maximum possible value for 10 instances is 1000%:
The graph below shows the same CPU usage but with stacked maximum area of 10 Build Runners on Thursday, July 4, 2024:
Based on the graphs above and current EKS requirements, my estimates are based on these assumptions:
➜ kubectl get daemonset --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system aws-node 7 7 7 7 7 <none> 2y141d
kube-system kube-proxy 7 7 7 7 7 <none> 2y85d
splunk splunk-otel-splunk-otel-collector-agent 7 7 7 7 7 kubernetes.io/os=linux 294d
twistlock twistlock-defender-ds 7 7 7 7 7 <none> 2y129d
In our case these DaemonSets take up at most 1 core on each node.
Note:
The table below shows the estimated cost using the CloudWatch method:
| Cost per Hour | Total Hours per Day | Daily Cost | Annual Cost (260 days) |
|---|---|---|---|
| $0.444 | 60 + (60 × 40%) = 84 | $37.296 | $9,696.96 |
I used GitHub workflow statistics for our GitHub org for the month of June 2024; this data includes build, non-prod, and deploy runners. I extracted the data and plotted it as a frequency distribution graph to highlight our current workflow usage.
We used a total of 124,292 minutes; however, this includes 5 non-prod instances and 3 deploy instances (in practice, the number of instances is usually higher). In my calculation I only include the 10 build runner instances, which I take as 60% of 124,292 minutes, i.e. roughly 74,575 minutes.
That 74,575 minutes equals 1,242.92 hours, or 51.79 days of workflow time. Spread over the ~30 days of June, that is roughly 41 runner-hours per day, equivalent to running ~2 instances for 24 hours or ~4 for 12 hours. Adding 40% to account for reduced capacity and idling, we reach 6 instances for 12 hours, which aligns with my generous calculation in the CloudWatch section.
The table below shows the estimated cost using the GitHub statistics method:
| Cost per Hour | Total Hours per Day | Daily Cost | Annual Cost (260 days) |
|---|---|---|---|
| $0.444 | 6 × 12 = 72 | $31.968 | $8,311.68 |
The performance per core was roughly the same. However, keep in mind that unlike dedicated EC2 runners, ARC runner nodes have slightly reduced capacity due to the DaemonSets running on each instance, and pod overhead is larger than the process overhead of a dedicated runner.
The workflow below was used as a benchmark for both runners. It builds a buildpack image, then runs the build image to download and compile CMake. The compilation uses all available cores.
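The original workflow file isn't reproduced here; the sketch below shows its general shape. The runner label, the use of gcc:13 as a stand-in for the image the buildpack step produces, and the CMake version are assumptions, not the exact benchmark:

```yaml
# Rough shape of the benchmark workflow (illustrative stand-ins only).
name: cmake-compile-benchmark
on: workflow_dispatch

jobs:
  benchmark:
    # Swap this label between the dedicated EC2 runner and the ARC runner
    # scale set (DinD mode, since the step below uses docker run).
    runs-on: arc-runner-set
    steps:
      # Download the CMake sources and compile them inside a container;
      # make -j$(nproc) is what keeps every vCPU busy.
      - name: Compile CMake with all available cores
        run: |
          docker run --rm gcc:13 bash -c '
            curl -fsSL https://github.com/Kitware/CMake/releases/download/v3.29.3/cmake-3.29.3.tar.gz | tar xz &&
            cd cmake-3.29.3 &&
            ./bootstrap --parallel=$(nproc) &&
            make -j"$(nproc)"'
```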
The results of running the same benchmark on both runners are shown in the table below.
| vCPUs (C5 family) | GitHub Runner | ARC |
|---|---|---|
| 2 | 33m 52s | 34m 17s |
| 4 | 23m 31s | 23m 24s |
| 8 | 12m 44s | 12m 51s |
Is your workflow's CPU utilisation like this?
Most workflow jobs make little use of multi-threading, so having exclusive ownership of a dedicated EC2 runner results in poor utilisation of resources whenever your workflow is not using the cores. This can be clearly observed in the CPU utilisation graph. So a good question to ask is:
Do you really need an 8-core system to run terraform apply, when terraform apply is API-limited in the first place? (The --parallelism flag refers to concurrent operations, not actual parallelism, i.e. multi-threading or multiprocessing.)
I couldn't confirm whether a Terraform run (apply/plan/destroy) uses more than one thread; however, I found the following about the required CPU resources in the Terraform Enterprise documentation on capacity and performance:
Our rule of thumb is 10 Terraform runs per CPU core, with 2 CPU cores allocated for the base Terraform Enterprise services. So a 4-core instance with 16 GB of memory could comfortably run 20 Terraform runs, if the runs are allocated the default 512 MB each.
In other words, a single run requires roughly 0.1–0.2 CPU cores and 512 MB of memory.
Runners can be broken into runner groups so that each workflow only uses what it requires, reducing the number of nodes that need to be provisioned. For example, we could configure these runner groups:
- runner-1c-2m
- runner-2c-4m
- runner-4c-8m
- runner-8c-16m

Since we are paying by the minute, further cost savings could be achieved by optimising CI/CD, for example by using parallel runs instead of sequential ones where possible and by targeting the smallest runner group that fits each job, as sketched below.
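A minimal sketch of what this looks like from the workflow side, assuming the hypothetical group names above are exposed as runs-on labels (job names and commands are illustrative); independent jobs also run in parallel by default:

```yaml
# Right-sizing: each job targets the smallest runner group that fits it.
name: right-sized-runners
on: workflow_dispatch

jobs:
  terraform-plan:
    # API-bound: the smallest runner group is plenty.
    runs-on: runner-1c-2m
    steps:
      - uses: actions/checkout@v4
      - run: terraform init && terraform plan

  compile:
    # CPU-bound: the largest runner group, using every core it offers.
    runs-on: runner-8c-16m
    steps:
      - uses: actions/checkout@v4
      - run: make -j"$(nproc)"
```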