Visual review of MkDocs sites in GitLab MRs

This is a repost of my original article on Siemens' blog with some formatting enhancements.

Leveraging GitLab CI job artifacts and GitLab review apps enables easy visual review of MkDocs sites in merge requests with a great developer experience and no extra infrastructure.

Advanced editorial workflows involving collaborative writing, content review, and approval are first-class citizens in Docs as Code (DaC), the philosophy of creating documentation (and, more generally, static websites) with the same tools and workflows that are commonly used to develop code. Instead of requiring specialized content management systems, DaC uses plain-text markup, a version control system (VCS), and a DevOps platform. This enables review through merge requests (MRs), and validation and deployment to a static site hosting service through CI/CD.

A widely adopted documentation tool stack at Siemens is based on MkDocs, Material for MkDocs, and GitLab. MkDocs is a popular static site generator intended primarily for technical documentation. Our MkDocs theme, based on Material for MkDocs, provides the Siemens look and feel and a comprehensive feature suite for technical writing. Finally, MkDocs sites are deployed via our company-wide self-hosted GitLab instance at code.siemens.com.

While the production deployment of an MkDocs site is served via GitLab Pages, visual review [2] of MR sites would require multiple simultaneous deployments, support for which has been a long-standing GitLab Pages feature request. But using a creative combination of MkDocs build settings, GitLab CI job artifacts, and GitLab review apps, we can deploy MR review sites with a great developer experience and no extra infrastructure.

Primer

Let's assemble the necessary technical background first.

MkDocs

MkDocs converts Markdown into HTML pages that can later be served. A site can be built with one of two page URL styles:

  1. Directory URLs (default) omit the .html file suffix, use a trailing slash, and refer implicitly to the file index.html in the respective filesystem directory containing the page markup.
  2. File URLs include the .html file suffix and refer explicitly to the HTML file containing the page markup.

File URLs can be enabled by passing the --no-directory-urls CLI flag to the MkDocs build command.
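
To make the difference between the two styles concrete, here is a minimal shell sketch that maps a Markdown source path (relative to the docs directory) to the HTML file MkDocs writes for it. The function name output_path and the sample paths are hypothetical, and the sketch ignores special cases such as README.md being treated as an index page.

```shell
#!/bin/sh
# Sketch: where a page ends up in the built site under each page URL style.
# output_path and the sample paths are illustrative assumptions.

# Usage: output_path <source.md> <directory|file>
output_path() {
  src=$1
  style=$2
  stem=${src%.md}    # strip the .md suffix, e.g. guide/setup
  case "$stem" in
    index|*/index)
      # Index pages are written as index.html in both styles.
      printf '%s.html\n' "$stem"
      ;;
    *)
      if [ "$style" = directory ]; then
        # Directory URLs (default): the page becomes <stem>/index.html.
        printf '%s/index.html\n' "$stem"
      else
        # File URLs (--no-directory-urls): the page becomes <stem>.html.
        printf '%s.html\n' "$stem"
      fi
      ;;
  esac
}

output_path guide/setup.md directory   # guide/setup/index.html
output_path guide/setup.md file        # guide/setup.html
output_path index.md directory         # index.html
```

With directory URLs, a link to guide/setup/ only works if the server resolves the directory to its index.html; with file URLs, every link names an HTML file explicitly.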

MkDocs also generates internal page links as relative URL paths, so the same site build can be served under arbitrary URL path prefixes unless the site's canonical URL is configured through the site_url setting.

GitLab

GitLab Pages is GitLab's built-in static site hosting service. It supports only a single deployment per GitLab project, which means only one version of a static website (typically its main version) can be served at a time.

GitLab CI job artifacts can emulate GitLab Pages [1] with any number of simultaneous deployments. When GitLab Pages is enabled, files in a CI job artifacts archive are served under a special GitLab Pages URL, enabling web browsers to render HTML files:

{protocol}://{namespace}.{domain}/-/{project_path}/-/jobs/{job_id}/artifacts/*

| Attribute | Description |
|---|---|
| {protocol} | The configured GitLab Pages protocol (e.g. https). |
| {namespace} | The root namespace (username or top-level group) of a project. |
| {domain} | The configured GitLab Pages domain. |
| {project_path} | The path of the project relative to the root namespace. When the project belongs to a root namespace, {project_path} corresponds to the project's name. |
| {job_id} | The globally unique ID of the CI job which published the artifacts archive. |

Unlike regular GitLab Pages deployments, requests to directory URLs are not implicitly served with the file index.html, so explicit HTML file URLs are required.

GitLab review apps are mainly dynamic environments, each of which adds an MR widget including a "View app" button that links to the environment URL, providing a convenient shortcut to a generated or otherwise complex URL.

GitLab's predefined CI variables are injected into CI jobs, providing useful configuration and context information for creating parameterized and thus flexible CI job specifications. We will use the following predefined CI variables:

| CI variable | Description |
|---|---|
| $CI_COMMIT_BRANCH | The name of the commit branch. This variable is only available for branch pipelines. |
| $CI_DEFAULT_BRANCH | The name of the project's default branch, typically main or master. |
| $CI_JOB_ID | The internal, globally unique ID of the CI job. |
| $CI_MERGE_REQUEST_IID | The internal ID of the MR, unique for the project. This variable is only available for MR pipelines and open MRs. |
| $CI_PAGES_DOMAIN | The GitLab Pages domain, e.g. gitlab.example.io. |
| $CI_PAGES_URL | The complete URL of the GitLab Pages site, e.g. https://group.gitlab.example.io/project. |
| $CI_PROJECT_PATH | The full path of the project including the namespace to which it belongs without a leading or trailing slash. |
| $CI_PROJECT_ROOT_NAMESPACE | The root namespace (username or top-level group) of the project. |

Shell

Shell scripting is an important part of the tool stack for building CI pipelines. We will use the following shell variable expansions:

| Syntax | Description |
|---|---|
| ${var#pattern} | Expands to the substring of $var from which the shortest matching prefix pattern is deleted. For instance, with var=a/b/c, ${var#*/} expands to b/c. |
| ${var%%pattern} | Expands to the substring of $var from which the longest matching suffix pattern is deleted. For instance, with var=a/b/c, ${var%%/*} expands to a. |
| ${var+value} | Expands to value if $var is set (even if it is empty), and to nothing otherwise. |
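
The three expansions can be tried in any POSIX shell. The following sketch reuses the sample value from the table above; the variable name flag is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Sample value taken from the table above.
var=a/b/c

echo "${var#*/}"     # shortest prefix matching */ is removed: b/c
echo "${var%%/*}"    # longest suffix matching /* is removed: a

# ${flag+value} tests whether the variable is set, not whether it is non-empty.
unset flag
echo "unset: [${flag+value}]"    # unset: []
flag=
echo "empty: [${flag+value}]"    # empty: [value]
flag=1
echo "set:   [${flag+value}]"    # set:   [value]
```

The last form is handy for passing an optional CLI flag: the flag appears on the command line only when the controlling variable exists.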

Solution

The CI job specification for a standard MkDocs site deployment through GitLab Pages looks like this:

.gitlab-ci.yml
pages:
  stage: deploy
  image: python:slim
  before_script:
    - pip install mkdocs
  script:
    - mkdocs build --site-dir public/
  artifacts:
    paths:
      - public/
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

The pages job installs the mkdocs package in the before_script step inside a Python container and builds the site in the public/ directory, which is declared as a CI job artifacts path by convention of GitLab Pages. The CI job rule restricts this job to running only for the default Git branch, so only that version of the site gets deployed.

The pages:review job, which builds and deploys the MkDocs site separately for each MR, extends the pages job:

pages:review:
  stage: deploy
  image: python:slim
  before_script:
    - pip install mkdocs
  script:
    - mkdocs build --site-dir public/ --no-directory-urls
  after_script:
    - echo "REVIEW_URL=${CI_PAGES_URL%%://*}://${CI_PROJECT_ROOT_NAMESPACE}.${CI_PAGES_DOMAIN}/-/${CI_PROJECT_PATH#*/}/-/jobs/${CI_JOB_ID}/artifacts/public/index.html" > review.env
  artifacts:
    paths:
      - public/
    reports:
      dotenv: review.env
  environment:
    name: review/$CI_MERGE_REQUEST_IID
    url: $REVIEW_URL
  rules:
    - if: $CI_MERGE_REQUEST_IID

Expressed as a diff against the pages job:

-pages:
+pages:review:
   stage: deploy
   image: python:slim
   before_script:
     - pip install mkdocs
   script:
-    - mkdocs build --site-dir public/
+    - mkdocs build --site-dir public/ --no-directory-urls
+  after_script:
+    - echo "REVIEW_URL=${CI_PAGES_URL%%://*}://${CI_PROJECT_ROOT_NAMESPACE}.${CI_PAGES_DOMAIN}/-/${CI_PROJECT_PATH#*/}/-/jobs/${CI_JOB_ID}/artifacts/public/index.html" > review.env
   artifacts:
     paths:
       - public/
+    reports:
+      dotenv: review.env
+  environment:
+    name: review/$CI_MERGE_REQUEST_IID
+    url: $REVIEW_URL
   rules:
-    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
+    - if: $CI_MERGE_REQUEST_IID

Let's analyze it step by step.

Because files in a CI job artifacts archive must be addressed with explicit file URLs, as mentioned above, the --no-directory-urls CLI flag is added to the MkDocs build command in the script step.

The GitLab Pages URL of a CI job artifacts archive (the review URL) is unpredictable across CI runs because it contains the globally unique CI job ID. But we can construct it at runtime by using predefined CI variables and expose it in an MR widget, enabled and defined by the environment keyword.

Figure: GitLab review app widget

The environment name is the MR's internal ID, $CI_MERGE_REQUEST_IID, namespaced by review/. The environment URL, which provides the link target of the "View app" button, is the constructed review URL.

Some values for constructing the review URL are not provided by dedicated predefined CI variables and need to be derived using the shell variable expansions introduced above.

  • The protocol (http or https) is derived by deleting the longest matching suffix pattern ://* from $CI_PAGES_URL, i.e. ${CI_PAGES_URL%%://*}.
  • The root namespace is provided by $CI_PROJECT_ROOT_NAMESPACE. Note that enabling unique domains for GitLab Pages does not affect the subdomain under which the content of a CI job artifacts archive is served.
  • The GitLab Pages domain is provided by $CI_PAGES_DOMAIN.
  • The project path relative to the root namespace is derived by deleting the shortest matching prefix pattern */ from $CI_PROJECT_PATH, i.e. ${CI_PROJECT_PATH#*/}.
  • The CI job ID is provided by $CI_JOB_ID.

Since GitLab supports only basic variable expansion, not including substring expansion, the review URL is constructed in a shell of the CI container in the after_script step and registered as a runtime-created variable, $REVIEW_URL, via a dotenv report.
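
The derivation can be reproduced outside of CI. The sketch below assigns hypothetical sample values to the predefined variables (in a real pipeline GitLab injects them) for a project located in a subgroup, and prints the line that the job writes to review.env.

```shell
#!/bin/sh
# Hypothetical sample values; GitLab sets these variables in a real CI job.
CI_PAGES_URL=https://group.gitlab.example.io/subgroup/project
CI_PAGES_DOMAIN=gitlab.example.io
CI_PROJECT_ROOT_NAMESPACE=group
CI_PROJECT_PATH=group/subgroup/project
CI_JOB_ID=123456

# Derived values that have no dedicated predefined CI variable.
protocol=${CI_PAGES_URL%%://*}        # https
project_path=${CI_PROJECT_PATH#*/}    # subgroup/project

REVIEW_URL="${protocol}://${CI_PROJECT_ROOT_NAMESPACE}.${CI_PAGES_DOMAIN}/-/${project_path}/-/jobs/${CI_JOB_ID}/artifacts/public/index.html"

# The CI job redirects this line into review.env (one KEY=value pair per line).
printf 'REVIEW_URL=%s\n' "$REVIEW_URL"
```

For these sample values, the printed URL is https://group.gitlab.example.io/-/subgroup/project/-/jobs/123456/artifacts/public/index.html.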

The CI job rule under the rules keyword restricts this job to run only for MRs, where the predefined CI variable $CI_MERGE_REQUEST_IID is present.

Following the DRY principle, the commonalities of the pages and pages:review jobs are factored out into a template job, .docs, which they both extend according to their respective needs, yielding the complete solution:

.gitlab-ci.yml
.docs:
  stage: deploy
  image: python:slim
  before_script:
    - pip install mkdocs
  script:
    - mkdocs build --site-dir public/ ${CI_MERGE_REQUEST_IID+--no-directory-urls}
  artifacts:
    paths:
      - public/

pages:review:
  extends: .docs
  after_script:
    - echo "REVIEW_URL=${CI_PAGES_URL%%://*}://${CI_PROJECT_ROOT_NAMESPACE}.${CI_PAGES_DOMAIN}/-/${CI_PROJECT_PATH#*/}/-/jobs/${CI_JOB_ID}/artifacts/public/index.html" > review.env
  artifacts:
    reports:
      dotenv: review.env
  environment:
    name: review/$CI_MERGE_REQUEST_IID
    url: $REVIEW_URL
  rules:
    - if: $CI_MERGE_REQUEST_IID

pages:
  extends: .docs
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

This is a clean refactor except for the conditional use of the --no-directory-urls CLI flag, which leaks the need for different page URL styles into the template job. Alternatively, the use_directory_urls setting could be added to mkdocs.yml and exposed as an environment variable, which the pages and pages:review jobs could set through CI job variables with their respective values. However, this approach mainly adds another level of indirection without much benefit.

Limitations

Serving review sites through CI job artifacts has a few limitations compared to using GitLab Pages:

  • GitLab Pages redirects are not supported. But even if they were, a hardcoded _redirects file would not work because (a) absolute URL paths must be specified, which differ between production deployments through GitLab Pages and review deployments using CI job artifacts, and (b) production deployments typically use directory URLs whereas review deployments must use file URLs. A partial alternative is MkDocs' official mkdocs-redirects plugin, which supports only client-side HTML redirects, applicable to pages, whereas GitLab Pages uses more flexible server-side HTTP redirects.
  • Custom domains are not supported, which is likely not a problem because review sites typically don't need to be served under a pretty domain.
  • Unique domains are not supported, which means all review sites within a root namespace share cookies. Whether or not this is a problem may need to be assessed case-by-case, although typical MkDocs sites serve only static content without managing cookies.

Bonus: Route Maps

GitLab Route Maps add MR widgets that enable navigating from changed source files, e.g. Markdown files, to their published counterparts, e.g. HTML files, enhancing the MR review experience. The mappings, defined in .gitlab/route-map.yml, look like this for a basic MkDocs site:

.gitlab/route-map.yml
- source: /docs\/(.+)\.md/
  public: '../\1.html'

- source: /docs\/(.+)/
  public: '../\1'

They are evaluated in order of their definition, so specific mappings must be defined before generic ones. The mapping from Markdown to HTML files assumes the usage of file URLs, which is necessary anyway for serving a site in a CI job artifacts archive as discussed above. Importantly, the mapping targets must begin with ../ because GitLab appends them to the review URL specified by environment:url, which ends with index.html.
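
The effect of the two mappings can be approximated with sed. This is only a sketch under stated assumptions: the source patterns are treated as matching the whole path (hence the ^ and $ anchors), the first matching mapping wins, the mapped path is joined to the environment URL with a slash, and the function name public_path and the review URL are hypothetical sample values.

```shell
#!/bin/sh
# Hypothetical review URL as it would be set through environment:url.
REVIEW_URL=https://group.gitlab.example.io/-/project/-/jobs/123456/artifacts/public/index.html

# Usage: public_path <changed source file>
# Emulates the route map: the specific Markdown mapping is tried first,
# the generic mapping for all other files under docs/ second.
public_path() {
  case "$1" in
    docs/*.md) printf '%s\n' "$1" | sed -E 's#^docs/(.+)\.md$#../\1.html#' ;;
    docs/*)    printf '%s\n' "$1" | sed -E 's#^docs/(.+)$#../\1#' ;;
  esac
}

for src in docs/guide/setup.md docs/assets/logo.png; do
  # The leading ../ cancels out the trailing index.html of the review URL,
  # so the link resolves to a file below .../artifacts/public/.
  echo "${REVIEW_URL}/$(public_path "$src")"
done
```

For docs/guide/setup.md, the resulting link ends in public/index.html/../guide/setup.html, which a browser normalizes to public/guide/setup.html.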

Advanced sites comprising generated pages or files, e.g. through MkDocs hooks or plugins such as mkdocs-gen-files, require additional mappings.

Conclusion

Docs as Code is a modern approach to authoring and publishing technical documentation, and static websites in general, by relying on the processes and tools used to develop software. Among the many technologies available, MkDocs and GitLab have become a popular tech stack at Siemens.

Although hosting an MkDocs site via GitLab Pages is straightforward, visual review of MR sites can be challenging, as GitLab Pages supports only a single deployment per GitLab project. But a creative combination of MkDocs build settings, GitLab CI job artifacts, and GitLab review apps makes it possible with a great developer experience and full integration into GitLab.

If you find this article useful, please consider sharing your feedback and experience via the comment box below. 🙏


  1. Only simple static content serving is supported. See the § Limitations section for more details.

  2. Visual review here refers to viewing the rendered website in a web browser and does not refer to the deprecated GitLab review apps feature "Visual Reviews". 
