2025-01-11, a 7-minute read for Software Engineers
Just don’t use it; the problems are laid out below.
Docker custom caching is a feature that allows you to cache the layers of your Docker images to a custom location. This can be useful to speed up the build process of your Docker images.
This post assumes a solid understanding of Docker, docker-compose, and GitHub Actions.
Docker offers two cache modes, min and max:
In min cache mode (the default), only layers that are exported into the resulting image are cached, while in max cache mode, all layers are cached, even those of intermediate steps.
While min cache is typically smaller (which speeds up import/export times, and reduces storage costs), max cache is more likely to get more cache hits. Depending on the complexity and location of your build, you should experiment with both parameters to find the results that work best for you.
In this post, I'm using the max mode.
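For reference, this is how the mode is picked on a plain buildx build; the image name and cache path here are placeholders:

# max mode: export all layers, including those of intermediate stages
docker buildx build \
  --cache-from "type=local,src=./tmp/.buildx-cache" \
  --cache-to "type=local,dest=./tmp/.buildx-cache,mode=max" \
  -t my-app:latest .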
There are two ways to set this up: let buildx talk to the GitHub Actions cache service directly through the gha cache backend, or manage a local cache directory yourself with actions/cache. Due to environment limitations on my self-hosted runners, I had to choose the manual option.
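For completeness, the non-manual route would look roughly like this with docker/build-push-action; a sketch, not what I ended up using:

- name: Build with the gha cache backend
  uses: docker/build-push-action@v6
  with:
    # the gha backend stores cache blobs in the GitHub Actions cache service
    cache-from: type=gha
    cache-to: type=gha,mode=max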
The application that I wanted to cache consisted of multiple images that needed to be built and tested. If you want to cache every image, this can be done with the cache_to and cache_from options; the command should be invoked like so in your CI:
docker buildx bake \
  -f docker-compose.yml \
  -f docker-compose.dev.yml \
  --load \
  --set "*.cache_from=type=local,src=./tmp/.buildx-cache" \
  --set "*.cache_to=type=local,dest=./tmp/.buildx-cache"
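To sanity-check what bake resolves from the merged compose files before running it in CI, --print outputs the final build configuration as JSON without building anything:

docker buildx bake \
  -f docker-compose.yml \
  -f docker-compose.dev.yml \
  --print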
However, if you want control over which services get cached, include these options in your docker-compose file:
cache_from:
  - type=local,src=./tmp/.buildx-cache-restored
cache_to:
  - type=local,dest=./tmp/.buildx-cache,mode=max
For example:

services:
  frontend:
    build:
      target: builder
      cache_from:
        - type=local,src=./tmp/.buildx-cache-restored
      cache_to:
        - type=local,dest=./tmp/.buildx-cache,mode=max
    command: npm run dev
    ...
  api:
    build:
      args:
        - NODE_ENV=development
      target: builder
      cache_from:
        - type=local,src=./tmp/.buildx-cache-restored
      cache_to:
        - type=local,dest=./tmp/.buildx-cache,mode=max
    ...
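With the cache options declared per service, the --set flags are no longer needed; you can also build a single service by passing its name as a bake target (frontend here matches the service above):

docker buildx bake -f docker-compose.yml --load --provenance=false frontend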
Required GitHub Actions setup
The workflow needs four pieces:
- docker/setup-buildx-action@v3 to create a buildx builder
- actions/cache/restore@v4 to restore the previous layer cache
- docker buildx bake -f docker-compose.yml --load --provenance=false to build the images
- actions/cache/save@v4 to save the updated cache
Putting it together:

name: Docker Caching
on:
  pull_request:
jobs:
  tests:
    runs-on: self-hosted # your runner label here
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - name: Set up Docker build cache
        id: cache-docker-layer
        uses: actions/cache/restore@v4
        with:
          path: ./tmp/.buildx-cache
          key: ${{ runner.os }}-docker-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-docker-
      # cache-hit is 'true' only on an exact key match; a restore-keys match
      # reports 'false' even though a cache was restored, so check
      # cache-matched-key instead
      - name: move cache to buildx-cache-restored
        if: steps.cache-docker-layer.outputs.cache-matched-key != ''
        run: mv ./tmp/.buildx-cache ./tmp/.buildx-cache-restored || true
      - name: ls ./tmp/.buildx-cache-restored
        if: steps.cache-docker-layer.outputs.cache-matched-key != ''
        run: ls ./tmp/.buildx-cache-restored || true
      - name: create dirs for cache if no cache hit
        if: steps.cache-docker-layer.outputs.cache-matched-key == ''
        run: |
          mkdir -p ./tmp/.buildx-cache-restored || true
          chmod -R 777 ./tmp/.buildx-cache-restored || true
          mkdir -p ./tmp/.buildx-cache || true
          chmod -R 777 ./tmp/.buildx-cache || true
      - name: Build images
        # multiple -f files extend and override the base docker-compose;
        # --provenance=false skips attestation manifests, which can break --load
        run: docker buildx bake -f docker-compose.base.yml -f docker-compose.e2e.yml -f docker-compose.ci.yml --load --provenance=false
      - name: Run tests here
        run: echo "Running tests..."
      - name: ls ./tmp/.buildx-cache
        run: ls ./tmp/.buildx-cache
      - name: Cache Docker layers
        if: always()
        uses: actions/cache/save@v4
        with:
          path: ./tmp/.buildx-cache
          key: ${{ runner.os }}-docker-${{ github.sha }}
      - name: Cleanup cache directories
        if: always()
        run: |
          rm -rf ./tmp/.buildx-cache || true
          rm -rf ./tmp/.buildx-cache-restored || true
[Image: transfer speed on a cache hit for ~1GB of layers]
As you can see from the image above, the network performance impact is significant: cache hits on self-hosted runners transfer at roughly 1/4 the speed of GitHub-hosted runners.[1]
Despite only caching ~1GB worth of layers on GitHub, the runtime was hit and miss (pun intended).[2]
I suspect this is because the application dependencies (such as react) are not cached on the runner, so each build may download a different variation of a dependency: there is no guarantee that the react 17.1.2 downloaded 3 hours ago is byte-for-byte identical to the react 17.1.2 downloaded today unless a hash pins it. That invalidates the cache for the "npm install" layer, which in turn invalidates all subsequent layers.
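If that is the cause, the usual mitigation is to copy only the dependency manifests first and install with npm ci, so the install layer is keyed on the lockfile (whose integrity hashes pin exact artifacts). A minimal sketch; the base image and paths are assumptions:

# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
WORKDIR /app
# copy only the manifests so this layer is invalidated only
# when the dependencies actually change
COPY package.json package-lock.json ./
# npm ci installs the exact, integrity-checked versions from the lockfile
RUN npm ci
# copy the rest of the source afterwards
COPY . .
RUN npm run build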
However, seeing that my local builds were always faster, I decided to reuse a runner instead of letting it get recycled[3]; this way I would be depending on Docker's default caching. Doing so decreased run time by more than 50%, to only 2-3 minutes.
Given that this job is triggered many times a day, the mean run time approaches 3 minutes without any manual caching, which is good enough. However, runner reuse comes with security risks, so I ultimately abandoned it.
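For context on [3]: a runner is made ephemeral with the --ephemeral flag when registering it, so it de-registers after a single job; reusing a runner means configuring it without that flag. The URL and token below are placeholders:

# ephemeral (recommended): the runner takes one job, then unregisters
./config.sh --url https://github.com/ORG/REPO --token <REGISTRATION_TOKEN> --ephemeral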
I haven't looked into why Docker's default caching produced more consistent builds compared to explicit cache export/import, i.e. what guarantees that the dependency-install layer gets reused. This would be interesting to investigate in the future.
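A starting point for that investigation would be comparing what actually lands in the cache under each setup; the image name below is a placeholder:

# disk usage and contents of the local buildx build cache
docker buildx du --verbose
# layer-by-layer breakdown of a built image, including the creating commands
docker history my-app:latest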
Just don't use it. I offer two alternatives:
- Runner reuse (a warm cache) is more effective than relying on GitHub caching for this use case, if your security team allows it.
- If you still want caching on self-hosted runners, look into running a GitHub Actions Cache Server.
[1] This probably varies by region, cloud provider, whether traffic goes over the public internet, and other factors; you might get better results if you are using Azure-based runners.
[2] Sometimes it just stalls, according to this discussion on the GitHub community forum.
[3] Not a good practice; runners should be ephemeral.