How AWS CodeArtifact streamlined our packaging and distribution

Ashish Kaushik
5 min read · Jun 27, 2021

AWS CodeArtifact is a fully managed artifact repository service for storing and sharing the packages used in application development. It works with package managers such as NuGet, Maven, Gradle, npm, yarn, pip, and twine.

How does it work?

The key point is that a Domain aggregates all of its repositories. We may consume a package from one repository, or from several, but the package itself is stored only once across the Domain.

CodeArtifact Component Hierarchy

The component hierarchy within CodeArtifact is as follows:

AWS CodeArtifact basic Hierarchy
  • Software packages are hosted in a “Private Repository”.
  • A Repository belongs to a Domain and contains a set of packages.
  • A Domain is a collection of Repositories and can span multiple AWS accounts within the same Organisation.
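The hierarchy is also visible in the resource ARNs, where each level nests under the one above it. The following is a minimal sketch, not an official helper: the class names, `example-domain`, `example-repo`, `example-package`, and the account ID are all placeholders, and the ARN layout is reproduced from the CodeArtifact documentation as I understand it, so check it against your own account before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    owner: str   # AWS account ID that owns the domain (placeholder below)
    region: str

    @property
    def arn(self) -> str:
        return f"arn:aws:codeartifact:{self.region}:{self.owner}:domain/{self.name}"


@dataclass(frozen=True)
class Repository:
    domain: Domain
    name: str

    @property
    def arn(self) -> str:
        d = self.domain
        return f"arn:aws:codeartifact:{d.region}:{d.owner}:repository/{d.name}/{self.name}"


@dataclass(frozen=True)
class Package:
    repository: Repository
    fmt: str             # package format, e.g. "npm", "pypi", "maven", "nuget"
    name: str
    namespace: str = ""  # npm scope or Maven groupId; left empty when unused

    @property
    def arn(self) -> str:
        r, d = self.repository, self.repository.domain
        return (
            f"arn:aws:codeartifact:{d.region}:{d.owner}:package/"
            f"{d.name}/{r.name}/{self.fmt}/{self.namespace}/{self.name}"
        )


# Hypothetical names, used for illustration only.
domain = Domain(name="example-domain", owner="111122223333", region="us-east-1")
repo = Repository(domain=domain, name="example-repo")
pkg = Package(repository=repo, fmt="npm", name="example-package")

for resource in (domain, repo, pkg):
    print(resource.arn)
```

Reading the three ARNs top to bottom shows the same nesting as the list above: Domain, then Repository inside the Domain, then Package inside the Repository.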

Diving Deeper

For us, the existing process was to consume code packages, in multiple languages, directly from public repositories. This had two issues:

  • No granular control over who can introduce new packages into our production code: any developer could add a package without it going through any authenticity checks.
  • No oversight of which packages can be used from the public domain: this has led to legal issues because of the open-source license under which a package is released.

To understand AWS CodeArtifact better, we did a PoC, which also helped us finalize the release workflow.

PoC Findings

  1. A CodeArtifact repository can have upstream repositories defined, so that a package is downloaded and kept only once across the Domain.
  2. Repositories in different Domains cannot be linked to each other as upstream repositories.
  3. One repository can have multiple upstream repositories and the ‘fetch from’ priority is configurable.
  4. An upstream repository may or may not have an external connection, that is, a link to the public registry of a package manager.
  5. If a package is not available in any of the private repositories, it is fetched from the public registry, provided that a repository with that external connection has been defined as an upstream.
  6. The package manager throws a “not found” error if the repository has no upstream repository connected to it and the package is not stored in the repository itself.
  7. CodeArtifact can be used only after obtaining an authorization token, which is valid for 12 hours by default.
  8. A custom package can also be published to a CodeArtifact repository and made available within the Domain.
  9. CodeArtifact is well integrated with EventBridge (formerly CloudWatch Events) which enables us to extend workflows to trigger CodeBuild, CodePipeline, and other native AWS services.
  10. Since launch, CodeArtifact has also been supported by other AWS services:
  • CloudTrail: for audit insights into CodeArtifact activity.
  • KMS: for an easy-to-use encryption mechanism via AWS managed or customer managed keys.
  • CloudFormation: to roll out CodeArtifact repositories and domains as infrastructure as code (IaC).
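To make the EventBridge finding more concrete, here is a minimal sketch of an event pattern that matches package version changes in one repository. It only builds the JSON document; creating the rule and wiring it to CodeBuild or CodePipeline is left out. The function name and the `example-domain` and `example-repo` values are assumptions for illustration, while the `source`, `detail-type`, and `detail` field names follow the CodeArtifact event format as documented by AWS.

```python
import json


def package_version_event_pattern(domain_name, repository_name, states=("Published",)):
    """Build an EventBridge event pattern for CodeArtifact package version changes.

    The helper itself is hypothetical; only the field names come from the
    documented CodeArtifact event format.
    """
    return {
        "source": ["aws.codeartifact"],
        "detail-type": ["CodeArtifact Package Version State Change"],
        "detail": {
            "domainName": [domain_name],
            "repositoryName": [repository_name],
            "packageVersionState": list(states),
        },
    }


# Placeholder domain and repository names.
pattern = package_version_event_pattern("example-domain", "example-repo")
print(json.dumps(pattern, indent=2))
```

The printed JSON can be pasted into an EventBridge rule as its event pattern; the rule's target then decides which workflow runs when a new package version is published.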

Our Current Workflow

As shown in the above image:

  1. The source code is managed in the CodeCommit Repository.
  2. At the time of deployment, the code is fetched from the CodeCommit repository, and then the build process is triggered.
  3. During the build process, all the required packages are downloaded from the Public Registry. Once they are downloaded, the build creates an archive of the build artifact and deploys it to the server or Lambda function, wherever required.

Problem Statement

The major issue we face here is that we have no control over the modules and packages that are required to build the application and are downloaded from the public registry on the fly. This poses several risks:

  1. A package may no longer be maintained, or may be removed from the Public Registry.
  2. The specific package version we depend on may be removed from the Public Registry.
  3. There is no control over the authenticity of publicly available packages.
  4. There is no process to check code quality or duplication whenever a new package from the Public Registry is used.

All of this can break the build process. Even if the application is deployed successfully, unregulated public packages might still break its functionality.

Proposed Workflow (Integrating AWS CodeArtifact)

In the diagram below, the major change is where we fetch the software packages from.

New Proposed Workflow
  1. We decided to have our own private package repository (artifact store), which keeps all the required packages in a single place.
  2. This ensures that if a package is removed from or upgraded in the Public Repository, the version we need is still available to us in the private repository.
  3. Our Private Domain in CodeArtifact will contain only verified software packages.
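The behaviour we rely on here (search the upstreams in priority order, fall back to the public registry, keep what was fetched) can be illustrated with a small in-memory model. This is a toy sketch, not how CodeArtifact is implemented: the `Repo` class, the repository names, and the package names are all invented for the example, and the per-repository dictionary is a simplification of the service storing each package once per Domain.

```python
class PackageNotFound(Exception):
    """Raised when no repository in the chain can supply the package version."""


class Repo:
    def __init__(self, name, upstreams=(), public_registry=None):
        self.name = name
        self.upstreams = list(upstreams)        # searched in this priority order
        self.public_registry = public_registry  # stand-in for an external connection
        self.store = {}                         # (package, version) -> payload

    def fetch(self, package, version):
        key = (package, version)
        if key in self.store:                   # already retained privately
            return self.store[key]
        for upstream in self.upstreams:
            try:
                payload = upstream.fetch(package, version)
            except PackageNotFound:
                continue                        # try the next upstream
            self.store[key] = payload           # retain it for later builds
            return payload
        if self.public_registry is not None and key in self.public_registry:
            self.store[key] = self.public_registry[key]
            return self.store[key]
        raise PackageNotFound(f"{package}@{version} not found via {self.name}")


# Hypothetical setup: one repo with an external connection, one for team packages.
public_npm = {("example-lib", "1.2.0"): b"tarball-bytes"}
npm_store = Repo("npm-store", public_registry=public_npm)
internal = Repo("internal")
internal.store[("team-utils", "0.1.0")] = b"team-bytes"
app_repo = Repo("app-repo", upstreams=[internal, npm_store])

first = app_repo.fetch("example-lib", "1.2.0")   # pulled through npm-store
del public_npm[("example-lib", "1.2.0")]         # version vanishes from the public registry
second = app_repo.fetch("example-lib", "1.2.0")  # still served from the private copy

try:
    app_repo.fetch("missing-lib", "9.9.9")
except PackageNotFound as err:
    not_found_message = str(err)
    print(not_found_message)
```

The second fetch succeeds even though the public copy is gone, which is the guarantee point 2 above describes, while the last call shows the “not found” case when nothing in the chain holds the package.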

Changes required in Current Process

We have introduced a new role, the Artifact Manager, who is responsible for managing all the required packages in the Private Repository with complete control. This changes how we fetch and maintain our packages, as well as the related deployment process. With CodeArtifact, the process now has two steps (described below), and a package has to go through this cycle before it can become part of our repository:

Step 1: Publish packages to the CodeArtifact repository.

Publishing to the Private Repository becomes the responsibility of the Artifact Managers, who are accountable for the authenticity, non-duplication, and availability of the software packages in the private repositories.
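One way to restrict publishing to Artifact Managers is an IAM policy attached only to their role. The sketch below builds such a policy document. It is an assumption about how the restriction could be expressed, not the policy we actually deployed: the function name, the `Sid`, and all names and IDs are placeholders, while the two `codeartifact:` actions are the ones the AWS documentation lists for publishing package versions.

```python
import json


def publisher_policy(region, account_id, domain, repository):
    """Return an IAM policy document that allows publishing to one repository.

    Hypothetical helper: the principal also needs permission to obtain an
    authorization token, which is not included here.
    """
    package_arn = (
        f"arn:aws:codeartifact:{region}:{account_id}:"
        f"package/{domain}/{repository}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublishToPrivateRepository",
                "Effect": "Allow",
                "Action": [
                    "codeartifact:PublishPackageVersion",
                    "codeartifact:PutPackageMetadata",
                ],
                "Resource": package_arn,
            }
        ],
    }


# Placeholder region, account ID, domain, and repository.
policy = publisher_policy("us-east-1", "111122223333", "example-domain", "example-repo")
print(json.dumps(policy, indent=2))
```

Roles that do not carry this statement can still read packages if they are granted read access, but they cannot add new package versions to the repository.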

Step 2: Consume private packages from CodeArtifact during the deployment process.

The changes in the deployment process are summarized below:

  • For NodeJS packages: use the CodeArtifact repository as the sole package source.
  • Update IAM roles so that the required resources can access the CodeArtifact repository.
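Both changes can be sketched in a few lines. The first helper derives the npm registry endpoint and the two `.npmrc` lines that point npm at CodeArtifact only; the second builds a read-only IAM policy for the build or deployment role. Treat this as a sketch under assumptions: the function names, domain, repository, account ID, and the `CODEARTIFACT_AUTH_TOKEN` variable name are placeholders, the token itself is assumed to come from `aws codeartifact get-authorization-token` at build time, and `"Resource": "*"` is kept broad for brevity.

```python
def repository_endpoint(domain, domain_owner, region, repository, fmt="npm"):
    """Build the repository endpoint URL in the documented CodeArtifact layout."""
    return (
        f"https://{domain}-{domain_owner}.d.codeartifact."
        f"{region}.amazonaws.com/{fmt}/{repository}/"
    )


def npmrc_lines(endpoint, token_env_var="CODEARTIFACT_AUTH_TOKEN"):
    """Return .npmrc lines that make the endpoint the only npm registry."""
    host_path = endpoint.split("https:", 1)[1]  # "//host/npm/repo/"
    return [
        f"registry={endpoint}",
        # npm expands ${VAR} from the environment when it reads .npmrc
        f"{host_path}:_authToken=${{{token_env_var}}}",
    ]


def consumer_policy():
    """Read-only access for build and deployment roles (narrow Resource in practice)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "codeartifact:GetAuthorizationToken",
                    "codeartifact:GetRepositoryEndpoint",
                    "codeartifact:ReadFromRepository",
                ],
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": "sts:GetServiceBearerToken",
                "Resource": "*",
                "Condition": {
                    "StringEquals": {"sts:AWSServiceName": "codeartifact.amazonaws.com"}
                },
            },
        ],
    }


# Placeholder values for illustration.
endpoint = repository_endpoint("example-domain", "111122223333", "us-east-1", "example-repo")
lines = npmrc_lines(endpoint)
print("\n".join(lines))
```

Because the token expires (12 hours by default), the build would typically export a fresh token into the environment variable before running `npm install`, rather than writing the token itself into `.npmrc`.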
