Tuesday, April 13, 2021

AWS codebuild for scala, docker (ecr) CI

Problem and Audience

A continuous integration (CI) process that builds and tests our code, then publishes versioned deployable artifacts (docker images) is a prerequisite for deploying stable software services in the cloud. There is a wide variety of good, inexpensive CI services available, but we decided to build littleware's CI system on AWS codebuild, because it provides an easy-to-use serverless solution that supports the technology we build on (nodejs, java, scala, docker) and integrates well with AWS. It was straightforward for us to set up a codebuild CI process (buildspec.yml) for our little scala project, given the tools we already have in place to deploy cloudformation stacks that define the codebuild project and ecr docker repository.

Overview

There were two steps to setting up our CI build: create the infrastructure, then debug and deploy the build script. The first step was easy, since we already have cloudformation templates for codebuild projects and ecr repositories that our little stack tool can deploy. For example, we deployed the codebuild project to build the littleware github repo by running:

little stack create ./stackParams.json

with this parameters file (stackParams.json):

{
    "StackName": "build-littleware",
    "Capabilities": [
        "CAPABILITY_NAMED_IAM"
    ],
    "TimeoutInMinutes": 10,
    "EnableTerminationProtection": true,
    "Parameters" : [
        {
            "ParameterKey": "PrivilegedMode",
            "ParameterValue": "true"
        },
        {
            "ParameterKey": "ProjectName",
            "ParameterValue": "cicd-littleware"
        },
        {
            "ParameterKey": "ServiceRole",
            "ParameterValue": "arn:aws:iam::027326493842:role/littleCodeBuild"
        },
        {
            "ParameterKey": "GithubRepo",
            "ParameterValue": "https://github.com/frickjack/littleware.git"
        }
    ],
    "Tags": [
            {
                "Key": "org",
                "Value": "applications"
            },
            {
                "Key": "project",
                "Value": "cicd-littleware"
            },
            {
                "Key": "stack",
                "Value": "frickjack.com"
            },
            {
                "Key": "stage",
                "Value": "dev"
            },
            {
              "Key": "role",
              "Value": "build"
            }
    ],
    "Littleware": {
        "TemplatePath": "lib/cloudformation/cicd/nodeBuild.json"
    }
}
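
Once the stack command returns, we can sanity check that the stack reached a healthy state with the plain AWS CLI (nothing littleware specific here):

aws cloudformation describe-stacks \
    --stack-name build-littleware \
    --query 'Stacks[0].StackStatus' --output text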

With our infrastructure in place, we can add our build script to our github repository. There are a few things to notice about our build script. First, the littleware git repo holds multiple interrelated projects - java and scala libraries and the applications that build on top of them. We are currently interested in building and packaging the littleAudit/ folder (which will probably be renamed), so the build begins by moving to that folder:

  build:
    commands:
      - cd littleAudit

Next, we set up our codebuild project to run the build container in privileged mode, so our build can start a docker daemon and build docker images:

phases:
  install:
    runtime-versions:
      # see https://github.com/aws/aws-codebuild-docker-images/blob/master/ubuntu/standard/5.0/Dockerfile
      java: corretto11
    commands:
      # see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-codebuild-project-environment.html#cfn-codebuild-project-environment-privilegedmode
      - nohup /usr/local/bin/dockerd --host=unix:///var/run/docker.sock --host=tcp://0.0.0.0:2375 --storage-driver=overlay &
      - timeout 15 sh -c "until docker info; do echo .; sleep 1; done"

We use gradle to compile our code and run the unit test suite. The org.owasp.dependencycheck gradle plugin adds a dependencyCheckAnalyze task that checks our maven dependencies against public databases of known vulnerabilities:

  build:
    commands:
      - cd littleAudit
      - gradle build
      - gradle dependencyCheckAnalyze
      - docker build -t codebuild:frickjack .
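
Nothing above is codebuild specific, so the same steps can be reproduced on a laptop before pushing (assuming a local gradle and docker install):

(
  cd littleAudit
  gradle build dependencyCheckAnalyze
  docker build -t codebuild:frickjack .
)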

Finally, our post-build command tags and pushes the docker image to an ecr repository. The tagging rules align with the lifecycle rules on the repository (described here and here).

  post_build:
    commands:
      - BUILD_TYPE="$(echo "$CODEBUILD_WEBHOOK_TRIGGER" | awk -F / '{ print $1 }')"
      - echo "BUILD_TYPE is $BUILD_TYPE"
      - |
        (
          little() {
              bash "$CODEBUILD_SRC_DIR_HELPERS/AWS/little.sh" "$@"
          }

          scanresult=""
          scan_in_progress() {
            local image
            image="$1"
            if ! shift; then
                echo "invalid scan image"
                exit 1
            fi
            local tag
            local repo
            tag="$(echo "$image" | awk -F : '{ print $2 }')"
            repo="$(echo "$image" | awk -F : '{ print $1 }' | cut -d / -f 2-)"
            scanresult="$(little ecr scanreport "$repo" "$tag")"
            test "$(echo "$scanresult" | jq -e -r .imageScanStatus.status)" = IN_PROGRESS
          }

          TAGSUFFIX="$(echo "$CODEBUILD_WEBHOOK_TRIGGER" | awk -F / '{ suff=$2; gsub(/[ @/]+/, "_", suff); print suff }')"
          LITTLE_REPO_NAME=little/session_mgr
          LITTLE_DOCKER_REG="$(little ecr registry)" || exit 1
          LITTLE_DOCKER_REPO="${LITTLE_DOCKER_REG}/${LITTLE_REPO_NAME}"

          little ecr login || exit 1
          if test "$BUILD_TYPE" = pr; then
            TAGNAME="${LITTLE_DOCKER_REPO}:gitpr_${TAGSUFFIX}"
            docker tag codebuild:frickjack "$TAGNAME"
            docker push "$TAGNAME"
          elif test "$BUILD_TYPE" = branch; then
            TAGNAME="${LITTLE_DOCKER_REPO}:gitbranch_${TAGSUFFIX}"
            docker tag codebuild:frickjack "$TAGNAME"
            docker push "$TAGNAME"
          elif test "$BUILD_TYPE" = tag \
            && (echo "$TAGSUFFIX" | grep -E '^[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}$' > /dev/null); then
            # semver tag
            TAGNAME="${LITTLE_DOCKER_REPO}:gitbranch_${TAGSUFFIX}"
            if ! docker tag codebuild:frickjack "$TAGNAME"; then
              echo "ERROR: failed to tag image with $TAGNAME"
              exit 1
            fi
            ...

If the CI build was triggered by a semver git tag, then it waits for the ecr image scan to complete successfully before tagging the docker image for production use:

       ...
          elif test "$BUILD_TYPE" = tag \
            && (echo "$TAGSUFFIX" | grep -E '^[0-9]{1,}\.[0-9]{1,}\.[0-9]{1,}$' > /dev/null); then
            # semver tag
            TAGNAME="${LITTLE_DOCKER_REPO}:gitbranch_${TAGSUFFIX}"
            if ! docker tag codebuild:frickjack "$TAGNAME"; then
              echo "ERROR: failed to tag image with $TAGNAME"
              exit 1
            fi
            # see https://docs.aws.amazon.com/AmazonECR/latest/APIReference/API_ImageScanStatus.html
            docker push "$TAGNAME" || exit 1
            count=0
            sleep 10

            while scan_in_progress "$TAGNAME" && test "$count" -lt 50; do
              echo "Waiting for security scan - sleep 10"
              count=$((count + 1))
              sleep 10
            done
            echo "Got image scan result: $scanresult"
            if ! test "$(echo "$scanresult" | jq -e -r .imageScanStatus.status)" = COMPLETE \
               || ! test "$(echo "$scanresult" | jq -e -r '.imageScanFindingsSummary.findingSeverityCounts.HIGH // 0')" = 0 \
               || ! test "$(echo "$scanresult" | jq -e -r '.imageScanFindingsSummary.findingSeverityCounts.CRITICAL // 0')" = 0; then
               echo "Image $TAGNAME failed security scan - bailing out"
               exit 1
            fi
            SEMVER="${LITTLE_DOCKER_REPO}:${TAGSUFFIX}"
            docker tag "$TAGNAME" "$SEMVER"
            docker push "$SEMVER"
          else
            echo "No docker publish for build: $BUILD_TYPE $TAGSUFFIX"
          fi
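
To make the tagging rules concrete, here is roughly how the webhook trigger maps to published image tags (illustrative values):

# CODEBUILD_WEBHOOK_TRIGGER       published image tags
#   pr/123         ->  little/session_mgr:gitpr_123
#   branch/main    ->  little/session_mgr:gitbranch_main
#   tag/1.0.2      ->  little/session_mgr:gitbranch_1.0.2, then little/session_mgr:1.0.2 after a clean scan
#   tag/v1.0.2     ->  no publish (the suffix does not match the bare X.Y.Z semver pattern)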

Summary

A continuous integration (CI) process that builds and tests our code, then publishes versioned deployable artifacts (docker images) is a prerequisite for deploying stable software services in the cloud. Our codebuild CI project builds and publishes the docker images that we will use to deploy our "little session manager" service as a lambda behind an API gateway (but we're still working on that).

Friday, April 09, 2021

Setup ECR on AWS

Problem and Audience

Setting up an ECR repository for publishing docker images is a good first step toward deploying a docker-packaged application on AWS (ECS, EKS, EC2, lambda, ...). Although we use ECR like any other docker registry, there are a few optimizations we can make when setting up a repository.

Overview

We should consider the workflow around the creation and use of our Docker images to decide who should be allowed to create a new ECR repository, and who should push images to ECR. In a typical docker workflow a developer publishes a Dockerfile alongside her code, and a continuous integration (CI) process kicks in to build and publish the Docker image. When the new image passes all its tests and is ready for release, the developer (or some other process) adds a semver (or some other standard) release tag to the image. All this development, testing, and publishing takes place in an AWS account assigned to the developer team associated with the docker image, but the release-tagged images are available for use (docker pull) in production accounts.

With the above workflow in mind, we updated the cloudformation templates we use to setup our user (admin, dev, operator) and codebuild (CI) IAM roles to grant full ECR access in our developer account (currently we only have a dev account).

Next we developed a cloudformation template for creating ECR repositories in our dev account. Our template extends the standard cloudformation syntax with nunjucks tags supported by our little stack tools. We also developed a little ecr tool to simplify some common tasks.
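
The little ecr subcommands we lean on from the CI build above look something like this (arguments are illustrative):

little ecr registry                                # print the account's registry host
little ecr login                                   # docker login against that registry
little ecr scanreport little/session_mgr 1.0.0     # fetch the scan report for a repo and tag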

There are a few things to notice in the cloudformation template. First, each repository has an IAM resource policy that allows our production AWS accounts to pull images from ECR repositories in our dev accounts:

"RepositoryPolicyText" : {
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountPull",
            "Effect": "Allow",
            "Principal": {
                "AWS": { "Ref": "ReaderAccountList" }
            },
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage"
            ]
        }
    ]
},
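
With that policy in place a production account can authenticate against the dev account's registry and pull release-tagged images with the standard tools (the account id and region below are placeholders):

# authenticate docker against the dev account's registry
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# pull a release-tagged image published by the dev account's CI
docker pull 123456789012.dkr.ecr.us-east-1.amazonaws.com/little/session_mgr:1.0.0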

Second, each repository has a lifecycle policy that expires non-production images. This is especially important for ECR, because ECR storage costs ten cents per GByte/month, and Docker images can be large.

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "age out git dev tags",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": [
          "gitsha_",
          "gitbranch_",
          "gitpr_"
        ],
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 7
      },
      "action": {
        "type": "expire"
      }
    },
    {
      "rulePriority": 2,
      "description": "age out untagged images",
      "selection": {
        "tagStatus": "untagged",
        "countType": "imageCountMoreThan",
        "countNumber": 5
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}
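
The lifecycle rules attached to a repository can also be inspected and dry-run from the command line if we want to see which images a policy would expire (repository name from the CI build above):

# show the lifecycle policy attached to the repository
aws ecr get-lifecycle-policy --repository-name little/session_mgr

# preview the rules against the images currently in the repository
aws ecr start-lifecycle-policy-preview --repository-name little/session_mgr
aws ecr get-lifecycle-policy-preview --repository-name little/session_mgr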

Finally, we configure ECR to scan our images for known security vulnerabilities on push. Our little ecr scanreport tool retrieves an image's scan-results from the command line. The workflow that tags an image for production should include a step that verifies that the image is free from vulnerabilities more severe than whatever policy we want to enforce.
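
Our little ecr scanreport helper is a thin wrapper over the ECR API; similar information is available directly from the AWS CLI, for example:

# fetch the scan status and severity counts for a pushed image
aws ecr describe-image-scan-findings \
    --repository-name little/session_mgr \
    --image-id imageTag=1.0.0 \
    --query '{status: imageScanStatus.status, severities: imageScanFindings.findingSeverityCounts}'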

Summary

Although we use ECR like any other docker registry, there are a few optimizations we can make when setting up a repository. First, we update our IAM policies to give users and CICD pipelines the access they need to support our development and deployment processes. Next, we add resource policies to our ECR repositories to allow production accounts to pull docker images from repositories in developer accounts. Third, we attach lifecycle rules to each repository to avoid the expense of storing unused images. Finally, we enable image scanning on push, and check an image's vulnerability report before tagging it for production use.

Friday, April 02, 2021

Sign and Verify JWT with ES256

Problem and Audience

A developer of a system that uses json web tokens (JWT) to authenticate HTTP API requests needs to generate asymmetric cryptographic keys, load the keys into code, then use the keys to sign and validate tokens.

We are building a multi-tenant system that implements a hierarchy where each tenant (project) may enable one or more api's. An end user authenticates with the global system (OIDC authentication client) via a handshake with a Cognito identity provider, then acquires a short lived session token to interact with a particular api under a particular project (OIDC resource server).

It would be nice to simply implement the session token as a Cognito OIDC access token, but our system has a few requirements that push us to manage our own session tokens for now. First, each (project, api) pair is effectively an OIDC resource server in our model, and projects and api's are created dynamically, so managing the custom JWT claims with Cognito resource servers would be messy.

Second, we want to be able to support robot accounts at the project level, and a Cognito mechanism to easily provision robot accounts and tokens is not obvious to us. So we decided to manage our "session" tokens in the application, and rely on Cognito to federate identity providers for user authentication.

JWTS with ES256

I know very little about cryptography, authentication, and authorization; but fortunately people who know more than I do share their knowledge online. Scott Brady's blog gives a nice overview of JWT signing. We want to sign and verify JWTs in scala using the elliptic curve ES256 algorithm, which improves on RS256 in a few ways and is widely supported.

There are different ways to generate an elliptic curve key pair for ES256 (ECDSA over the P-256 curve with SHA-256), but EC keys saved to pem files are supported by multiple tools, and are easy to save to configuration stores like AWS SSM parameter store or secrets manager.

This bash function uses openssl to generate keys in pem files.

#
# Bash function to generate new ES256 key pair
#
newkey() {
    local kid=${1:-$(date +%Y%m)}
    local secretsFolder=$HOME/Secrets/littleAudit

    (
        mkdir -p "$secretsFolder"
        cd "$secretsFolder" || return 1
        if [[ ! -f ec256-key-${kid}.pem ]]; then
          openssl ecparam -genkey -name prime256v1 -noout -out ec256-key-${kid}.pem
        fi
        # convert the key to pkcs8 format
        openssl pkcs8 -topk8 -nocrypt -in ec256-key-${kid}.pem -out ec256-pkcs8-key-${kid}.pem
        # extract the public key
        openssl ec -in ec256-pkcs8-key-${kid}.pem -pubout -out ec256-pubkey-${kid}.pem
    )
}
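
Usage is just newkey <kid>. Once generated, the pem files are easy to push into a configuration store; here is a rough sketch using SSM parameter store (the parameter names are made up for illustration):

newkey 202104

# stash the pkcs8 private key and the public key under illustrative SSM paths
aws ssm put-parameter --name /littleware/audit/ec256-pkcs8-key-202104 \
    --type SecureString \
    --value "$(cat "$HOME/Secrets/littleAudit/ec256-pkcs8-key-202104.pem")"
aws ssm put-parameter --name /littleware/audit/ec256-pubkey-202104 \
    --type String \
    --value "$(cat "$HOME/Secrets/littleAudit/ec256-pubkey-202104.pem")"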

Load the keys into code

Now that we have our keys, we need to load them into our scala application.

// imports assumed by this snippet (scala 2.13 - use scala.collection.JavaConverters on 2.12)
import com.google.{gson, inject}
import java.security.interfaces.{ECPrivateKey, ECPublicKey, RSAPublicKey}
import java.security.spec.{PKCS8EncodedKeySpec, RSAPublicKeySpec, X509EncodedKeySpec}
import scala.jdk.CollectionConverters._

import littleware.cloudmgr.service.SessionMgr

class KeyHelper @inject.Inject() (
  gs: gson.Gson, 
  ecKeyFactory:KeyHelper.EcKeyFactory, 
  rsaKeyFactory:KeyHelper.RsaKeyFactory
  ) {    
    /**
     * @return pem input with pem file prefix/suffix and empty space removed
     */
    def decodePem(pem:String): String = {
      pem.replaceAll(raw"-----[\w ]+-----", "").replaceAll("\\s+", "")
    }


    def loadPublicKey(kid:String, pemStr:String):SessionMgr.PublicKeyInfo = {
      val key = ecKeyFactory.generatePublic(decodePem(pemStr))
      SessionMgr.PublicKeyInfo(kid, "ES256", key)
    }


    def loadPrivateKey(kid:String, pemStr:String):SessionMgr.PrivateKeyInfo = {
      val key = ecKeyFactory.generatePrivate(decodePem(pemStr))
      SessionMgr.PrivateKeyInfo(kid, "ES256", key)
    }

    /**
     * Load keys from a jwks url like 
     *    https://www.googleapis.com/oauth2/v3/certs
     */
    def loadJwksKeys(jwksUrl:java.net.URL): Set[SessionMgr.PublicKeyInfo] = {
      val jwksStr = {
        val connection = jwksUrl.openConnection()
        connection.setRequestProperty("Accept-Charset", KeyHelper.utf8)
        connection.setRequestProperty("Accept", "application/json")
        val response = new java.io.BufferedReader(new java.io.InputStreamReader(connection.getInputStream(), KeyHelper.utf8))
        try {
            littleware.base.Whatever.get().readAll(response)
        } finally {
            response.close()
        }
      }

      gs.fromJson(jwksStr, classOf[gson.JsonObject]).getAsJsonArray("keys").asScala.map(
          { 
            json:gson.JsonElement =>
            val jsKeyInfo = json.getAsJsonObject()
            val kid = jsKeyInfo.getAsJsonPrimitive("kid").getAsString()
            val n = jsKeyInfo.getAsJsonPrimitive("n").getAsString()
            val e = jsKeyInfo.getAsJsonPrimitive("e").getAsString()
            val pubKey = rsaKeyFactory.generatePublic(n, e)
            SessionMgr.PublicKeyInfo(kid, "RSA256", pubKey)
          }
      ).toSet 
    }
}

object KeyHelper {
    val utf8 = "UTF-8"

    /**
     * Little injectable key factory hard wired to use X509 key spec for public key
     */
    class EcKeyFactory {
        val keyFactory = java.security.KeyFactory.getInstance("EC")
        val b64Decoder = java.util.Base64.getDecoder()

        def generatePublic(base64:String):ECPublicKey = {
            val bytes = b64Decoder.decode(base64.getBytes(utf8))
            val spec = new X509EncodedKeySpec(bytes)

            keyFactory.generatePublic(spec).asInstanceOf[ECPublicKey]
        }

        def generatePrivate(base64:String):ECPrivateKey = {
            val bytes = b64Decoder.decode(base64.getBytes(utf8))
            val spec = new PKCS8EncodedKeySpec(bytes)

            keyFactory.generatePrivate(spec).asInstanceOf[ECPrivateKey]
       }
    }

    /**
     * Little injectable key factory hard wired for RSA jwks decoding
     * See: https://github.com/auth0/jwks-rsa-java/blob/master/src/main/java/com/auth0/jwk/Jwk.java
     */
    class RsaKeyFactory {
        private val keyFactory = java.security.KeyFactory.getInstance("RSA")
        private val b64Decoder = java.util.Base64.getUrlDecoder()

        def generatePublic(n:String, e:String):RSAPublicKey = {
            val modulus = new java.math.BigInteger(1, b64Decoder.decode(n))
            val exponent = new java.math.BigInteger(1, b64Decoder.decode(e))
            keyFactory.generatePublic(new RSAPublicKeySpec(modulus, exponent)).asInstanceOf[RSAPublicKey]
        }
    }
}

Sign and verify JWTs

Now that we have loaded our keys, we can use them to sign and verify JWTs. Okta has published open source code for working with JWTs, Auth0 has published open source code for working with JWK, and AWS KMS supports elliptic curve digital signing algorithms with asymmetric keys.

import com.google.inject
// see https://github.com/jwtk/jjwt#java-jwt-json-web-token-for-java-and-android
import io.{jsonwebtoken => jwt}
import java.security.{ Key, PublicKey }
import java.util.UUID
import scala.util.Try

import littleware.cloudmgr.service.SessionMgr
import littleware.cloudmgr.service.SessionMgr.InvalidTokenException
import littleware.cloudmgr.service.littleModule
import littleware.cloudutil.{ LRN, Session }

/**
 * @param signingKey for signing new session tokens
 * @param verifyKeys for verifying the signature of session tokens
 */
@inject.ProvidedBy(classOf[LocalKeySessionMgr.Provider])
@inject.Singleton()
class LocalKeySessionMgr (
    signingKey: Option[SessionMgr.PrivateKeyInfo],
    sessionKeys: Set[SessionMgr.PublicKeyInfo],
    oidcKeys: Set[SessionMgr.PublicKeyInfo],
    issuer:String,
    sessionFactory:inject.Provider[Session.Builder]
    ) extends SessionMgr {

    val resolver = new jwt.SigningKeyResolverAdapter() {
        override def resolveSigningKey(jwsHeader:jwt.JwsHeader[T] forSome { type T <: jwt.JwsHeader[T] }, claims:jwt.Claims):java.security.Key = {
            val kid = jwsHeader.getKeyId()
            (
                {
                    if (claims.getIssuer() == issuer) {
                        sessionKeys
                    } else {
                        oidcKeys
                    }
                }
            ).find(
                { it => it.kid == kid }
            ).map(
                { _.pubKey }
            ) getOrElse {
                throw new SessionMgr.InvalidTokenException(s"invalid auth kid ${kid}")
            }
        }
    }

    ...

    def jwsToClaims(jwsIdToken:String):Try[jwt.Claims] = Try(
        { 
            jwt.Jwts.parserBuilder(
            ).setSigningKeyResolver(resolver
            ).build(
            ).parseClaimsJws(jwsIdToken
            ).getBody()
        }
    ).flatMap( claims => Try( {
                    Seq("email", jwt.Claims.EXPIRATION, jwt.Claims.ISSUER, jwt.Claims.ISSUED_AT, jwt.Claims.AUDIENCE).foreach({
                        key =>
                        if(claims.get(key) == null) {
                            throw new InvalidTokenException(s"missing ${key} claim")
                        }
                    })
                    claims
                }
            )
    ).flatMap(
        claims => Try(
            {
                if (claims.getExpiration().before(new java.util.Date())) {
                    throw new InvalidTokenException(s"auth token expired: ${claims.getExpiration()}")
                }
                claims
            }
        )
    )


    def sessionToJws(session:Session):String = {
        val signingInfo = signingKey getOrElse { throw new UnsupportedOperationException("signing key not available") }
        jwt.Jwts.builder(
        ).setHeaderParam(jwt.JwsHeader.KEY_ID, signingInfo.kid
        ).setClaims(SessionMgr.sessionToClaims(session)
        ).signWith(signingInfo.privKey
        ).compact()
    }

    def jwsToSession(jws:String):Try[Session] = jwsToClaims(jws
        ) map { claims => SessionMgr.claimsToSession(claims) }

    ...
}

This code is all online under this github repo, but is in a state of flux.

Summary

To sign and verify JWTs we need to generate keys, load the keys into code, and use the keys to sign and verify tokens. We plan to add support for token signing with AWS KMS soon.
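
For reference, KMS exposes the ES256 signing primitive through its sign API; a rough sketch of what that could look like from the CLI (the key alias is hypothetical, and note that KMS returns a DER-encoded ECDSA signature that still has to be converted to the raw R||S form a JWS expects):

# the JWS signing input is base64url(header) + "." + base64url(payload)
printf '%s' "$SIGNING_INPUT" > /tmp/signing-input.txt

aws kms sign \
    --key-id alias/little-session-es256 \
    --message fileb:///tmp/signing-input.txt \
    --message-type RAW \
    --signing-algorithm ECDSA_SHA_256 \
    --query Signature --output text | base64 --decode > /tmp/sig.der
# /tmp/sig.der holds a DER-encoded (r,s) pair, not the fixed-length r||s concatenation JWTs use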