Sitemap
The Sitemaps protocol lets you inform search engines about the URLs on a website that are available for crawling. A sitemap is an XML file that lists the URLs of a site. To retrieve the sitemap, you need your customer identifier, the language, and the instance. Then call https://{customer}.makaira.io/{lang}/sitemap.xml?instance={instance}
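As a minimal sketch, the URL pattern above can be filled in and requested with Python's standard library. The function names and the values acme, de, and live are placeholders chosen for illustration, not real Makaira identifiers.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_sitemap_url(customer: str, lang: str, instance: str) -> str:
    """Fill the documented URL pattern with the three required values."""
    query = urlencode({"instance": instance})  # URL-encodes the instance name
    return f"https://{customer}.makaira.io/{quote(lang, safe='')}/sitemap.xml?{query}"


def fetch_sitemap(customer: str, lang: str, instance: str, timeout: float = 10.0) -> bytes:
    """Download the sitemap XML as raw bytes (performs a network request)."""
    url = build_sitemap_url(customer, lang, instance)
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

Calling build_sitemap_url("acme", "de", "live") only builds the address. fetch_sitemap performs the actual HTTP request and needs a valid customer and instance.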
Considered documents
Several rules determine whether a document is listed in the sitemap:
- the document must have a URL
- the document must be active
- the document type is not makaira-product (variants), link (searchable links), searchredirect (search redirect), menu, or menu_entry (menu)
- metadata:{robotIndex: noindex} is not set (setting it hides an element from the sitemap)
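The rules above can be read as a single predicate over a document. The following is a sketch only: the dictionary keys url, active, type, and metadata are assumed names for illustration, not a confirmed Makaira document schema, and the function is not Makaira's implementation.

```python
from typing import Any, Mapping

# Document types that the rules above exclude from the sitemap.
EXCLUDED_TYPES = frozenset(
    {"makaira-product", "link", "searchredirect", "menu", "menu_entry"}
)


def is_listed(doc: Mapping[str, Any]) -> bool:
    """Return True if a document passes every sitemap rule.

    The keys used here ("url", "active", "type", "metadata") are
    assumptions made for this sketch.
    """
    if not doc.get("url"):  # the document must have a URL
        return False
    if not doc.get("active"):  # the document must be active
        return False
    if doc.get("type") in EXCLUDED_TYPES:  # the type must not be excluded
        return False
    metadata = doc.get("metadata") or {}
    # robotIndex: noindex hides the document from the sitemap
    return metadata.get("robotIndex") != "noindex"
```

Under these assumptions, an active page with a URL and no robotIndex setting is listed, while the same page with metadata {"robotIndex": "noindex"} is not.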
The XML data contains:
- URL: taken from URL, or from canonical_url (prioritized) if set
- alternative language links (href+hreflang): taken from attribute selfLinks
- images: taken from picture_url_main
- last modified: taken from timestamp
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="
            http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd
            http://www.google.com/schemas/sitemap-image/1.1 http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd">
    <url>
        <loc>https://www.makaira.io/de/kunden</loc>
        <xhtml:link rel="alternate" hreflang="de" href="https://www.makaira.io/de/kunden" />
        <xhtml:link rel="alternate" hreflang="en" href="https://www.makaira.io/en/customer" />
        <image:image>
            <image:loc>https://www.makaira.io/picture/kunden.jpg</image:loc>
        </image:image>
        <lastmod>2022-04-27</lastmod>
    </url>
    <url>
    ....
</urlset>
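To read such a response, a namespace-aware XML parser is needed, because the loc, link, and image elements live in three different namespaces. The sketch below uses Python's xml.etree.ElementTree; the function name parse_sitemap and the shape of the returned dictionaries are choices made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared on the <urlset> element in the sample above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}


def parse_sitemap(xml_bytes: bytes) -> list[dict]:
    """Extract loc, language alternates, images, and lastmod per <url>."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default="", namespaces=NS),
            # hreflang -> href for every alternate language link
            "alternates": {
                link.get("hreflang"): link.get("href")
                for link in url.findall("xhtml:link", NS)
            },
            "images": [img.text for img in url.findall("image:image/image:loc", NS)],
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
        })
    return entries
```

The input must be a complete, well-formed document. The sample above is abbreviated with "....", so it would have to be completed before parsing.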
Duplicate URLs are ignored: each URL is output only once in the sitemap (automatic deduplication; first come, first served).
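The first-come, first-served behaviour can be pictured as a single pass that remembers which URLs it has already emitted. This sketch assumes entries are dictionaries with a loc key and that duplicates are detected by exact string comparison; the source does not say whether Makaira normalizes URLs before comparing them.

```python
from typing import Any, Iterable


def dedupe_first_wins(entries: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the first entry seen for each URL, preserving order."""
    seen: set[str] = set()
    unique = []
    for entry in entries:
        loc = entry["loc"]  # assumed key name for the entry's URL
        if loc in seen:
            continue  # a later duplicate is dropped
        seen.add(loc)
        unique.append(entry)
    return unique
```

If two entries share the same loc but differ in lastmod, only the one that appears first survives.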