Creating an Engineering Blog with Astro and Notion

2024-05-31

I Want to Create an Engineer Blog!

Hello, I'm Inoue, the CTO at Eukarya Inc. We at Eukarya have decided to start an engineer blog!

The first thing to consider when launching an engineer blog is what tools to use for writing articles and how to publish the blog.

At Eukarya, after considering various tools, including cloud services, we decided to write articles using Notion and convert the content into HTML to distribute as static pages.

We chose Notion because it is already widely used within the company, offers a great writing experience for engineers, and keeps information from spreading across multiple external services.

When we researched how to publish Notion content as a static blog, we found a tool called astro-notion-blog. As the name suggests, it uses Astro, a static site generator, to create a blog from Notion. However, we decided not to adopt it directly for the following reasons:

Design: We wanted to create a blog with a simpler and more stylish design, like Medium.
ESLint and TypeScript Errors: When trying to customize, there were many ESLint and TypeScript errors, resulting in a poor development experience.
Complex Rendering Implementation: Much of the rendering from Notion to HTML was custom implemented, requiring a lot of work to add new features.

Therefore, while referring to astro-notion-blog, I decided to implement a new tool myself. The result is Astrotion (https://github.com/rot1024/astrotion). Like astro-notion-blog, the repository is meant to be forked and then customized.

Features of Astrotion

Focusing on the differences from astro-notion-blog, Astrotion has the following features:

Based on the Astro theme Creek (uses Tailwind)
Implemented in 100% TypeScript
Customizable settings and Astro files are separated for easy customization
Converts Notion content to Markdown before rendering, making it easier to customize rendering
Notion cache works even with Cloudflare Pages builds
Automatic OGP generation feature

Astrotion is a newly created tool and has not yet been used in production. We plan to run it in production with this blog and make adjustments as we go, but it currently meets our initial requirements.

If you have any requests, please open issues or PRs, and I'll review them when I have time!

Below, I'll explain how we implemented the key features.

Running Custom Processes at Build Time with Astro

To run custom processes during build time in Astro, such as fetching information from Notion and reflecting it in the page content, the simplest way is to implement the process directly in the .astro files.

Blogs generally have two types of pages: a list of multiple articles and individual article pages. In Astro, you can create files like index.astro and [slug].astro to build each of these pages.

Here is an excerpt from the index.astro file for the article list page. It does the following:

Fetch the database metadata and all pages within the database from Notion. This includes data like the blog title, each article's title, cover images, creation dates, etc.
Use the paginate helper function to generate pagination.
Finally, use downloadImages to download the images required for this page and place them in the static directory (the image URLs returned by the Notion API expire after a while, so they cannot be used permanently).

---
import Pagination from "../components/Pagination.astro";
import PostList from "../components/PostList.astro";
import config from "../config";
import { downloadImages } from "../download";
import Layout from "../layouts/Layout.astro";
import client from "../notion";
import { paginate } from "../utils";

const [database, posts] = await Promise.all([
  client.getDatabase(),
  client.getAllPosts(),
]);

const { pageCount, pageInt, pagePosts } = paginate(posts, "1");
await downloadImages(database?.images, ...posts.map((p) => p.images));
---

<Layout>
  <main class="py-12 lg:py-20">
    <article class="max-w-6xl mx-auto px-3">
      <header class="text-center mb-12">
        <h1 class:list={["mb-12 text-6xl title", config.index?.titleClasses]}>
          {database.title}
        </h1>
        <p class="mx-auto max-w-xl">
          {database.description}
        </p>
      </header>
      <PostList posts={pagePosts} />
      <Pagination page={pageInt} pageCount={pageCount} />
    </article>
  </main>
</Layout>
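
The paginate helper itself is not shown in the excerpt. As a rough illustration only, a helper with the same call shape could look like the sketch below. The pageSize parameter and its default are assumptions made for this example, and Astrotion's actual implementation may differ.

```typescript
// Hypothetical sketch of a pagination helper; not Astrotion's actual code.
function paginate<T>(
  posts: T[],
  page: string | undefined,
  pageSize = 10, // assumed default page size
): { pageCount: number; pageInt: number; pagePosts: T[] } {
  // At least one page exists even when there are no posts.
  const pageCount = Math.max(1, Math.ceil(posts.length / pageSize));
  // Parse the route parameter and clamp it into the range [1, pageCount].
  const parsed = Number.parseInt(page ?? "1", 10);
  const pageInt = Number.isNaN(parsed)
    ? 1
    : Math.min(Math.max(parsed, 1), pageCount);
  const start = (pageInt - 1) * pageSize;
  return { pageCount, pageInt, pagePosts: posts.slice(start, start + pageSize) };
}
```

Passing the page as a string matches how route parameters arrive in Astro, which is presumably why the excerpt calls it with "1" for the first page.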

Next is an excerpt from the [slug].astro file, which is an individual article page.

Astro uses an exported function called getStaticPaths to dynamically compute the list of slugs. Astro then runs [slug].astro once for each slug to build the individual pages.

Get database metadata from Notion with client.getDatabase (used for the blog title, etc.).
Retrieve the post's metadata from Notion based on Astro.params.slug with client.getPostBySlug. This information includes the page ID, title, cover image, creation date and time, etc.
Now that the page ID is known, retrieve the page content from Notion with client.getPostContent. The conversion to Markdown happens in the same step, so the content is returned as Markdown.
Convert the Markdown to HTML with markdownToHTML. The result is embedded in Astro's HTML.
Finally, use downloadImages to download all the images that have appeared so far and place them in the static directory (the image URLs returned by the Notion API expire after a while, so they cannot be used permanently).

---
import config from "../../config";
import PostFooter from "../../customization/PostFooter.astro";
import { downloadImages } from "../../download";
import Layout from "../../layouts/Layout.astro";
import { markdownToHTML } from "../../markdown";
import client from "../../notion";
import { postUrl, formatPostDate } from "../../utils";

const { slug } = Astro.params;

export async function getStaticPaths() {
  const posts = await client.getAllPosts();
  return posts.map((p) => ({ params: { slug: p.slug } })) ?? [];
}

const database = await client.getDatabase();
const post = slug ? await client.getPostBySlug(slug) : undefined;
if (!post) throw new Error(`Post not found: ${slug}`);

const content = post ? await client.getPostContent(post.id) : undefined;
const html = content ? await markdownToHTML(content.markdown) : undefined;
await downloadImages(content?.images, post?.images, database?.images);
---

<Layout
  title={post.title}
  description={post.excerpt}
  path={postUrl(post.slug)}
  ogImage={postUrl(post.slug + ".webp", Astro.site)}
>
  <main class="py-12 lg:py-20">
    <article class="max-w-5xl mx-auto px-3">
      <header class="mx-auto pb-12 lg:pb-20 max-w-3xl text-center">
        <h1 class:list={["text-5xl mb-6 title", config.post?.titleClasses]}>
          {post.title}
        </h1>
        <p class="text-center">
          {formatPostDate(post.date)}
        </p>
      </header>
      {
        post.featuredImage && (
          <img
            class="rounded-xl mx-auto aspect-video object-cover"
            style="min-width: 80%;"
            loading="lazy"
            src={post.featuredImage}
            alt={post.title}
          />
        )
      }
      <section
        class:list={[
          "max-w-3xl mx-auto py-6 lg:py-12 markdown post",
          config.post?.classes,
        ]}
        set:html={html}
      />
      <PostFooter post={post} database={database} />
    </article>
  </main>
</Layout>
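
Both pages end with downloadImages because the image URLs returned by Notion are temporary. The sketch below shows one way a stable local path could be derived from such a URL, so that the same image maps to the same file on every build. The function name localImagePath and the "/static" prefix are assumptions for illustration, not Astrotion's actual code.

```typescript
import * as path from "node:path";

// Hypothetical helper: map a temporary, signed image URL to a stable local path.
// The signature and expiry live in the query string, so only the pathname is used.
function localImagePath(imageUrl: string, baseDir = "/static"): string {
  const { pathname } = new URL(imageUrl);
  // Keep the last two path segments so that files with the same name
  // under different parent IDs do not collide.
  const segments = pathname.split("/").filter(Boolean).slice(-2);
  // Decode percent-escapes and replace any slash that decoding may reveal.
  const name = decodeURIComponent(segments.join("-")).replace(/[\\/]/g, "_");
  return path.posix.join(baseDir, name);
}
```

Because the query string is ignored, a URL that has been re-signed with a new expiry still resolves to the same local file.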

By the way, you might have noticed that Notion API calls such as client.getDatabase and client.getPostContent are executed once for every page during the Astro build. This means that with 100 articles and three calls per article, the Notion API would be called at least 300 times. That could hit the Notion API rate limit and make the build slow, which is not ideal.

Caching Articles Retrieved from Notion

To address this, a major point in Astrotion is that it caches responses from the Notion API to speed up the build process.

The client mentioned above can be called in the same way as a typical Notion library, but behind the scenes, it is optimized so that if a cache already exists, it returns the cached content without calling the Notion API. In case of a cache miss, it calls the Notion API, saves the response content to a file, and then returns the result.

This approach helps limit the number of Notion API calls even if client is called extensively!
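
The read-through behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name cached, the one-JSON-file-per-key layout, and the absence of any invalidation are choices made for this example, not a description of Astrotion's internals.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical read-through cache: return the JSON file for `key` if it exists,
// otherwise call `fetcher` (for example a Notion API request), store the result,
// and return it.
async function cached<T>(
  cacheDir: string,
  key: string,
  fetcher: () => Promise<T>,
): Promise<T> {
  const file = path.join(cacheDir, `${key}.json`);
  try {
    // Cache hit: no API call is made.
    return JSON.parse(await fs.promises.readFile(file, "utf8")) as T;
  } catch {
    // Cache miss (or unreadable file): fall through to the fetcher.
  }
  const value = await fetcher();
  await fs.promises.mkdir(cacheDir, { recursive: true });
  await fs.promises.writeFile(file, JSON.stringify(value));
  return value;
}
```

With a wrapper like this, calling the client from every page costs one API request per key at most, and later calls are served from disk.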

When you actually run the Astro build, the following cache is generated:

tree node_modules/.astro/.astrotion
node_modules/.astro/.astrotion
├── notion-cache
│   ├── blocks-003fbd8d-14b3-4e8d-b189-d0425b63ce20.json
│   ├── blocks-00a7c94f-2115-48e6-9a28-f8d021eb4313.json
│   ├── blocks-04d758a4-0cf8-4485-b0de-a32c1238fae0.json
│   ├── blocks-09b21335-9d1a-4389-82cc-e776c7e70269.json
│   ├── blocks-0ceafd41-7530-4b67-ba95-91b304da459d.json
│   ├── blocks-1606a454-2ab8-4ae7-8f61-814e7ab66aee.json
│   ├── blocks-1c2a51a8-afa2-4f7f-a794-216b7d377ac2.json
│   ├── blocks-2cc76487-bca9-4aac-9596-2462ffb5058b.json
│   ├── blocks-2da23309-61bd-4c80-8de3-94149b638b45.json
│   ├── blocks-366522c9-e5d4-4e3c-af9e-4e0b4b16bb65.json
│   └── meta.json
└── static
    ├── CE8F71C3-DBE1-40AB-805C-8F17B3F5D37B.webp
    ├── D9E44F39-ADD7-4315-9A2B-9717C4B116F4.webp
    └── EC946DBE-9A4C-47BA-827D-C0181EB54B0E.webp

Looking at the listing above, you can see that files whose names start with "blocks" are cached. Each one holds the response for the "blocks" of a given ID as retrieved from the Notion API.

The content of a Notion page is essentially a collection of blocks. The page itself is one large block, and sometimes the content of one block depends on another block. In such cases, whenever a new block ID appears, the Notion API is called again, and the corresponding JSON file for each block ID is saved.

The next time the content of a Notion page is retrieved, Astrotion checks whether a cache exists for the relevant block ID. If the last updated date shows that nothing has changed since the cache was written, it uses the cached content instead of calling the Notion API.
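
The freshness check can be pictured as a simple timestamp comparison. In the sketch below, the CacheMeta shape and the isCacheFresh function are assumptions made for illustration; the real meta.json may be structured differently. The only fact taken from the Notion API is that page objects carry a last_edited_time in ISO 8601 format.

```typescript
// Assumed shape of the cache metadata; the real meta.json may differ.
interface CacheMeta {
  // Page ID -> last_edited_time (ISO 8601) observed when the page was cached.
  updatedAt: Record<string, string>;
}

// Returns true when the cached blocks for `pageId` can be reused.
function isCacheFresh(
  meta: CacheMeta,
  pageId: string,
  lastEditedTime: string,
): boolean {
  const cachedAt: string | undefined = meta.updatedAt[pageId];
  if (cachedAt === undefined) return false; // never cached
  // Reuse the cache only if the page has not been edited since it was cached.
  return Date.parse(lastEditedTime) <= Date.parse(cachedAt);
}
```

Only the lightweight post list has to be fetched on every build to obtain the timestamps; the much larger block responses are re-fetched only for pages that fail this check.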

Additionally, there is a file called meta.json, which is a cache file that retains the update dates of each article and the parent-child relationships of blocks. The webp files are the rendered results of OGP images, which will be discussed later.

These are stored in node_modules, allowing the cache to work with services like Netlify and Cloudflare Pages, speeding up subsequent builds.

Rendering Notion as HTML

In Astrotion, as mentioned earlier, rendering happens in two steps:

Convert Notion blocks to Markdown first
Then convert Markdown to HTML

This two-step rendering process allows for a clear separation of concerns and makes it possible to use the rich set of tools and libraries available on npm for each process.

For example, we use notion-to-md to convert Notion content to Markdown. Since it does not support all block types, we implemented some additional custom transformers. The ease of customization is a strong point of this library.

However, this library depends on the official Notion client implementation and makes API calls on its own, which bypasses the cache described earlier. To solve this, we implemented a wrapper around the Notion client that has a compatible type and transparently handles caching, and forcibly injected it when initializing notion-to-md.
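
The idea of a type-compatible caching wrapper can be sketched as below. To keep the example self-contained, it declares a minimal structural type covering only blocks.children.list, which is the method of the official client used here, and it memoizes in memory instead of writing files. The name withBlockCache and the cache key format are assumptions, not Astrotion's actual code.

```typescript
// Minimal structural types for the single client method used in this sketch.
interface ListArgs {
  block_id: string;
  start_cursor?: string;
}
interface ListResponse {
  results: unknown[];
  next_cursor: string | null;
  has_more: boolean;
}
interface BlocksClient {
  blocks: { children: { list(args: ListArgs): Promise<ListResponse> } };
}

// Hypothetical wrapper: same shape as the wrapped client, but responses are
// memoized per block ID and cursor, so repeated requests never reach the API.
function withBlockCache(
  client: BlocksClient,
  cache: Map<string, ListResponse> = new Map(),
): BlocksClient {
  return {
    blocks: {
      children: {
        async list(args: ListArgs): Promise<ListResponse> {
          const key = `${args.block_id}:${args.start_cursor ?? ""}`;
          const hit = cache.get(key);
          if (hit) return hit;
          const res = await client.blocks.children.list(args);
          cache.set(key, res);
          return res;
        },
      },
    },
  };
}
```

notion-to-md receives its client through the notionClient constructor option, so a wrapper of this kind can stand in for the official client there, with a type cast if the shapes do not match exactly.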

For converting Markdown to HTML, we use unified, running remark (for processing the Markdown AST) and rehype (for processing the HTML AST) on top of it. With numerous remark and rehype plugins already available, we can achieve highly functional conversions just by combining them.

For instance, syntax highlighting, math rendering, and Mermaid rendering can all be accomplished within the unified ecosystem. Adding features like bookmark embeds or callouts is also easy, which makes it extremely convenient. The pipeline is assembled as follows:

export const md2html = unified()
  .use(remarkParse)
  .use(remarkGfm)
  .use(remarkMath)
  .use(remarkRehype, { allowDangerousHtml: true })
  .use(rehypeRaw)
  .use(rehypeKatex)
  .use(rehypeMermaid, { strategy: "pre-mermaid" })
  .use(rehypePrism) // put after mermaid
  .use(rehypeStringify);

export async function markdownToHTML(md: string): Promise<string> {
  return String(await md2html.process(md));
}
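
To give a feel for how small a custom step in this pipeline can be, here is a sketch of a rehype-style plugin that adds loading="lazy" to every image. The plugin name rehypeLazyImages is hypothetical and is not part of Astrotion. The node type is a locally declared subset of the hast tree so that the example stands alone; in a real project the types from the hast package would normally be used instead.

```typescript
// Minimal subset of the hast node shape, declared locally for this sketch.
interface HastNode {
  type: string;
  tagName?: string;
  properties?: Record<string, unknown>;
  children?: HastNode[];
}

// Hypothetical rehype plugin: adds loading="lazy" to every <img> element.
// A unified plugin is a function that returns a transformer over the tree.
function rehypeLazyImages() {
  return (tree: HastNode): void => {
    const visit = (node: HastNode): void => {
      if (node.type === "element" && node.tagName === "img") {
        // Keep existing attributes such as src and alt.
        node.properties = { ...node.properties, loading: "lazy" };
      }
      node.children?.forEach(visit);
    };
    visit(tree);
  };
}
```

A plugin like this would be registered with .use(rehypeLazyImages) somewhere before rehypeStringify, since it needs to run while the document is still a tree.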

Rendering OGP Images

When sharing blog articles on social media, it would be great if an OGP image with the article title automatically appears, like on Qiita or Zenn. With Astro, this can be done quite easily!

Just as [slug].astro builds one page per article, creating a [slug].webp.ts file lets you build one image per article in the same way!

In the file, we load a base image for the OGP and use a library called ezog to render the OGP. Since the output is in PNG binary format, we use sharp to convert it to a more compressed WebP format.

import fs from "node:fs";

import type { APIRoute, GetStaticPaths } from "astro";
import { defaultFonts, generate } from "ezog";
import sharp from "sharp";

import config from "../../config";
import client from "../../notion";

const fonts = defaultFonts(700);
const ogBaseBuffer = await fs.promises
  .readFile(config.og?.baseImagePath || "public/og-base.png")
  .catch(() => null);

export interface Props {
  slug: string;
}

export const getStaticPaths: GetStaticPaths = async () => {
  const posts = await client.getAllPosts();
  return posts.map((p) => ({ params: { slug: p.slug } })) ?? [];
};

export const GET: APIRoute<Props> = async ({ params }) => {
  const post = await client.getPostBySlug(params.slug || "");
  if (!post) throw new Error("Post not found");

  const image = await generate(
    [
      ...(ogBaseBuffer
        ? [
            {
              type: "image" as const,
              buffer: ogBaseBuffer,
              x: 0,
              y: 0,
              width: 1200,
              height: 630,
            },
          ]
        : []),
      {
        type: "textBox",
        text: post.title,
        x: 60,
        y: 60,
        width: 1080,
        fontFamily: [...fonts.map((font) => font.name)],
        fontSize: 60,
        lineHeight: 80,
        align: "center",
        color: "#000",
        ...config.og?.titleStyle,
      },
    ],
    {
      width: 1200,
      height: 630,
      fonts: [...fonts],
      background: config.og?.backgroundColor || "#fff",
    },
  );

  const webp = await sharp(image).webp().toBuffer();
  return new Response(webp.buffer);
};

By embedding this URL in the head of each article page, every article automatically gets its own OGP image!
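
The [slug].astro excerpt earlier passes postUrl(post.slug + ".webp", Astro.site) to the layout as ogImage. The sketch below shows how such a helper and the resulting tags could look. The "/posts/" path prefix, the body of postUrl, and the ogImageTags function are all assumptions for illustration; only the call shape comes from the excerpt.

```typescript
// Hypothetical sketch of a URL helper with the same call shape as `postUrl`.
// The "/posts/" prefix is an assumption.
function postUrl(slug: string, base?: URL | string): string {
  const pathname = `/posts/${encodeURIComponent(slug)}`;
  // Without a base the path is site-relative. With a base it becomes an
  // absolute URL, which crawlers need in order to fetch og:image.
  return base ? new URL(pathname, base).toString() : pathname;
}

// Builds the <meta> tags a layout could place in <head>.
function ogImageTags(slug: string, site: URL | string): string {
  const url = postUrl(`${slug}.webp`, site);
  return [
    `<meta property="og:image" content="${url}" />`,
    `<meta name="twitter:card" content="summary_large_image" />`,
  ].join("\n");
}
```

The image URL has to be absolute, which is why the site origin is passed in rather than relying on a relative path.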

Conclusion

In Astrotion, we have implemented various measures to efficiently achieve the features typically desired in an engineer blog. If you are interested, please try using it for your own blog. Improvement suggestions or pull requests are also welcome! (Though I might not respond immediately...)

In the future, I would like to reorganize Astrotion into a standalone, headless library so that it can be easily integrated into any Astro project without depending on a theme.

Astro can be a bit daunting with its unique .astro format, but it offers a development experience similar to JSX markup with React and TypeScript, and it allows for various processes to be run at build time, making it very convenient. The ecosystem is also rich, and since it uses Vite, the builds are fast, so I highly recommend it.

Eukarya is hiring for various positions! We are looking forward to your application from everyone who can contribute to OSS!

➔ Eukarya Careers

Eukarya is developing and operating a WebGIS SaaS called Re:Earth. We aim to complete all GIS-related tasks including 3D (such as publishing map applications, data management, and data conversion) on the web. Most of the source code is published on GitHub as OSS.

➔ Eukarya Official Page / ➔ Medium / ➔ GitHub