Sitemap

The Sitemaps protocol lets you inform search engines about the URLs on a website that are available for crawling. A sitemap is an XML file that lists the URLs of a site. To retrieve the sitemap, you need your customer identifier, the language, and the instance. Then call https://{customer}.makaira.io/{lang}/sitemap.xml?instance={instance}

To exclude alternate links from your sitemap, you can add the parameter ignoreAlternateLinks=true to the URL. The complete URL format will be:
https://{customer}.makaira.io/{lang}/sitemap.xml?instance={instance}&ignoreAlternateLinks=true
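
As a rough illustration of the URL format above, the following Python sketch assembles the sitemap URL from its parts. The function name build_sitemap_url and its parameter names are hypothetical and not part of any Makaira client; only the URL template and the ignoreAlternateLinks parameter come from this page.

```python
from urllib.parse import quote, urlencode


def build_sitemap_url(customer: str, lang: str, instance: str,
                      ignore_alternate_links: bool = False) -> str:
    """Assemble the sitemap URL from the documented template.

    The helper itself is illustrative; only the URL layout is documented.
    """
    # Query parameters in the order shown in the documentation.
    params = {"instance": instance}
    if ignore_alternate_links:
        params["ignoreAlternateLinks"] = "true"
    # The language is a path segment, so escape it defensively.
    path_lang = quote(lang, safe="")
    return f"https://{customer}.makaira.io/{path_lang}/sitemap.xml?{urlencode(params)}"


if __name__ == "__main__":
    # "acme", "de" and "live" are placeholder values.
    print(build_sitemap_url("acme", "de", "live"))
    print(build_sitemap_url("acme", "de", "live", ignore_alternate_links=True))
```

The resulting string can be passed to any HTTP client to download the sitemap.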

Considered documents

A document is listed in the sitemap only if it meets all of the following rules.

the document must have a URL
the document must be active
the document type must not be one of
- makaira-product (variants),
- link (searchable links),
- searchredirect (search redirect),
- menu
- menu_entry (menu)
metadata:{robotIndex: noindex} must not be set (setting it hides a document from the sitemap)
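
The rules above can be sketched as a single predicate. This is a minimal illustration, not Makaira's actual implementation: the dictionary keys url, active and type are assumed field names chosen for the example, while the excluded type names and the metadata.robotIndex flag are taken from the list above.

```python
# Document types that are never listed in the sitemap (from the list above).
EXCLUDED_TYPES = {"makaira-product", "link", "searchredirect", "menu", "menu_entry"}


def is_listed(doc: dict) -> bool:
    """Return True if a document would qualify for the sitemap.

    The keys "url", "active" and "type" are assumptions for this sketch.
    """
    if not doc.get("url"):
        return False  # rule: the document must have a URL
    if not doc.get("active"):
        return False  # rule: the document must be active
    if doc.get("type") in EXCLUDED_TYPES:
        return False  # rule: excluded document types
    metadata = doc.get("metadata") or {}
    # rule: robotIndex set to noindex hides the document
    return metadata.get("robotIndex") != "noindex"


if __name__ == "__main__":
    page = {"url": "/de/kunden", "active": True, "type": "page"}
    hidden = {**page, "metadata": {"robotIndex": "noindex"}}
    print(is_listed(page), is_listed(hidden))
```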

The XML data contains

URL: taken from the document's URL; if canonical_url is set, it takes priority
alternative language links (href+hreflang): taken from attribute selfLinks
images: taken from picture_url_main
last modified: taken from timestamp

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="
    http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
    http://www.google.com/schemas/sitemap-image/1.1 http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd">    
    <url>
        <loc>https://www.makaira.io/de/kunden</loc>
        <xhtml:link rel="alternate" hreflang="de" href="https://www.makaira.io/de/kunden" />
        <xhtml:link rel="alternate" hreflang="en" href="https://www.makaira.io/en/customer" />
        <image:image>
            <image:loc>https://www.makaira.io/picture/kunden.jpg</image:loc>
        </image:image>
        <lastmod>2022-04-27</lastmod>
    </url>
    <url>
        ....
    </url>
</urlset>
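
To read such a sitemap programmatically, a sketch like the following may help. It uses Python's standard xml.etree.ElementTree module and the three namespaces from the example above; the function name parse_sitemap, the shape of the returned dictionaries and the shortened SAMPLE document are choices made for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used for lookups; the URIs match the sample above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

# Shortened copy of the example sitemap, with one <url> entry.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
    <url>
        <loc>https://www.makaira.io/de/kunden</loc>
        <xhtml:link rel="alternate" hreflang="de" href="https://www.makaira.io/de/kunden" />
        <xhtml:link rel="alternate" hreflang="en" href="https://www.makaira.io/en/customer" />
        <image:image>
            <image:loc>https://www.makaira.io/picture/kunden.jpg</image:loc>
        </image:image>
        <lastmod>2022-04-27</lastmod>
    </url>
</urlset>"""


def parse_sitemap(xml_text: str) -> list:
    """Extract loc, alternate links, images and lastmod for each <url>."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=NS),
            # Map hreflang -> href for every alternate link.
            "alternates": {
                link.get("hreflang"): link.get("href")
                for link in url.findall("xhtml:link", NS)
            },
            "images": [img.text for img in url.findall("image:image/image:loc", NS)],
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in parse_sitemap(SAMPLE):
        print(entry)
```

When the sitemap is requested with ignoreAlternateLinks=true, the alternates mapping in this sketch would simply be empty.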

Note: Duplicate URLs are ignored. Each URL is output only once in the sitemap; when the same URL occurs more than once, the first occurrence is kept (automatic deduplication).
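
The deduplication behaviour can be pictured with a small first-occurrence-wins filter. How Makaira implements this internally is not documented here; the function dedupe_first below is a hypothetical sketch of the described outcome only.

```python
def dedupe_first(urls):
    """Keep the first occurrence of each URL and preserve the input order."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


if __name__ == "__main__":
    candidates = [
        "https://www.makaira.io/de/kunden",
        "https://www.makaira.io/en/customer",
        "https://www.makaira.io/de/kunden",  # duplicate, dropped
    ]
    print(dedupe_first(candidates))
```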