Dynamically Generate a Sitemap for your Elixir Markdown Blog

Posted on Jun 15, 2022 by Michael

ELIXIR MARKDOWN

A sitemap of the top level structure of the Cassava website

This article is a quick addendum to our inaugural post announcing the Cassava blog - Roll Your Own Blog in Elixir and Phoenix. Our blog does not currently have a Sitemap to instruct search engines on how best to index our articles.

Let’s quickly address this oversight.

Several Hex packages are readily available to generate a Sitemap on your behalf. A quick inventory of the more popular options shows they are well suited to render a Sitemap to disk or to an S3 bucket.

However, I need to generate the Sitemap dynamically, because our articles are side-loaded into the running instance of Cassava rather than shipped with a deploy.

Rather than fighting against the grain, let’s roll our own solution; the process is straightforward.

First, add a new route to your Router.

get("/sitemap.xml", BlogController, :sitemap)

Add an action to your web controller, taking note of the content type required by the response.

def sitemap(conn, _params),
  do:
    conn
    |> put_resp_content_type("application/xml")
    |> send_resp(
      :ok,
      Sitemap.generate()
    )

Define a module to generate the XML. This task is accomplished by combining the publicly accessible static portions of the site with the dynamic blog content.

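defmodule CassavaWeb.Sitemap do
  # Timestamp reported for the static pages; bump it when those pages change.
  @last_modified "2022-06-15T09:00:00-04:00"

  def generate do
    host = "https://gocassava.com"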

    """
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    #{public_nodes(host)}
    #{blog_nodes(host)}
    </urlset>
    """
  end

  defp public_nodes(base_url) do
    public_pages = [
      "",
      "slack",
      "pricing",
      "privacy",
      "terms",
      "about",
      "admin/register",
      "admin/sign-in",
      "admin/contact"
    ]

    public_pages
    |> Enum.map(
      &"""
      <url>
      <loc>#{base_url}/#{&1}</loc>
      <lastmod>#{@last_modified}</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.6</priority>
      </url>
      """
    )
    |> Enum.join("")
  end

  defp blog_nodes(base_url),
    do:
      fetch_sitemap_details()
      |> Enum.map(
        &"""
        <url>
        <loc>#{base_url}/blog/#{&1.slug}</loc>
        <lastmod>#{to_iso(&1.updated_at)}</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.6</priority>
        </url>
        """
      )
      |> Enum.join("")

  defp to_iso(dte),
    do:
      dte
      |> Timex.Timezone.convert("America/New_York")
      |> Timex.format("%Y-%m-%dT%H:%M:%S%:z", :strftime)
      |> elem(1)
end
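One detail the module above glosses over: the Sitemap protocol requires URLs to be entity-escaped, so a slug containing an ampersand or an angle bracket would yield an invalid document. Here is a minimal sketch of an escaping helper. The `SitemapEscape` module and its `escape/1` function are hypothetical names, not part of Cassava; the idea is to wrap each value interpolated into `<loc>`.

```elixir
defmodule SitemapEscape do
  # Hypothetical helper, not part of the original Cassava code.
  # Replaces the five characters the Sitemap protocol requires to be escaped.
  # The ampersand goes first so entities added by later steps are not escaped twice.
  def escape(value) when is_binary(value) do
    value
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
    |> String.replace("'", "&apos;")
  end
end

IO.puts(SitemapEscape.escape("tips-&-tricks"))
# prints tips-&amp;-tricks
```

Slugs built only from letters, digits, and hyphens pass through unchanged, so the helper is cheap insurance rather than a behavior change.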

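The fetch_sitemap_details/0 function lives in the same module and executes an Ecto query, which returns the slug and last modified timestamp of every published article. It relies on import Ecto.Query and on aliases for your Repo and Post modules.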

defp fetch_sitemap_details() do
  threshold = DateTime.utc_now() |> DateTime.truncate(:second)

  Repo.all(
    Post
    |> where([p], p.published_date < ^threshold)
    |> order_by(desc: :updated_at)
    |> select([p], map(p, [:slug, :updated_at]))
  )
end
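If you would rather not pull in Timex just to format the `updated_at` values this query returns, the standard library can produce a valid W3C timestamp in UTC. The sketch below assumes `updated_at` arrives as a `NaiveDateTime` holding a UTC instant, which is Ecto's default for `timestamps()` but remains an assumption about the Cassava schema. The `SitemapTime` module name is hypothetical. Note that it emits a `Z` suffix rather than the New York offset used elsewhere in this article.

```elixir
defmodule SitemapTime do
  # Hypothetical alternative to to_iso/1 that uses only the standard library.
  # Assumes the NaiveDateTime represents a UTC instant.
  def to_iso_utc(%NaiveDateTime{} = naive) do
    naive
    |> NaiveDateTime.truncate(:second)
    |> DateTime.from_naive!("Etc/UTC")
    |> DateTime.to_iso8601()
  end
end

IO.puts(SitemapTime.to_iso_utc(~N[2022-06-15 13:23:10]))
# prints 2022-06-15T13:23:10Z
```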

With these steps in place, all that remains is to request the /sitemap.xml route, which returns a well-formed document.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://gocassava.com/</loc>
    <lastmod>2022-06-15T09:00:00-04:00</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.6</priority>
  </url>
  <url>
    <loc>https://gocassava.com/slack</loc>
    <lastmod>2022-06-15T09:00:00-04:00</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.6</priority>
  </url>
  ...
  <url>
    <loc>https://gocassava.com/blog/roll-your-own-blog-in-elixir-phoenix</loc>
    <lastmod>2022-06-15T09:23:10-04:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.6</priority>
  </url>
  ...
</urlset>
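Before submitting, it is worth confirming that the output really is well-formed. A quick sketch using `:xmerl_scan`, which ships with Erlang/OTP, is shown below. The inline `xml` string is a trimmed stand-in for the real response body, and in a Mix project you may need to add `:xmerl` to `extra_applications`.

```elixir
# Trimmed stand-in for the body returned by /sitemap.xml.
xml = """
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://gocassava.com/</loc>
<lastmod>2022-06-15T09:00:00-04:00</lastmod>
</url>
</urlset>
"""

# :xmerl_scan.string/1 expects a charlist and fails if the input is not well-formed.
{root, _rest} = :xmerl_scan.string(String.to_charlist(xml))

# The root is an xmlElement record; its second field holds the tag name.
IO.inspect(elem(root, 1))
```

This only checks that the XML parses; it does not validate the document against the Sitemap schema.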

Take your freshly minted Sitemap over to Google and Bing for registration, and you are done.

Google Search Console

Bing Submission

Easy peasy.

With the crisis averted, we can return to our regularly scheduled programming.

Your feedback is welcomed as always.

Michael

