Sitemap
What Is a Sitemap?
A sitemap is a file that helps search engines such as Google and Naver index a site comprehensively. In essence, it is a file that lists the site's URLs, and bots use that list as a guide when crawling.
Content types and update frequency can also be specified, but the most important consideration is where sitemap.xml is placed. A sitemap may only list URLs on the same host that sit at or below its own directory, so a sitemap at https://www.devkuma.com/docs/sitemap.xml cannot cover https://www.devkuma.com/blog/. Choose the location carefully. In general, it is best to place the file at the site root.
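The location rule can be sketched as a small check. This is only an illustration of the rule, not how any particular crawler implements it, and the function name `in_sitemap_scope` and the example URLs are assumptions made for this sketch.

```python
import posixpath
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """Return True if page_url is on the same host as sitemap_url and
    sits at or below the directory that contains the sitemap."""
    sm = urlsplit(sitemap_url)
    pg = urlsplit(page_url)
    # Scheme and host (including any port) must match.
    if (sm.scheme, sm.netloc.lower()) != (pg.scheme, pg.netloc.lower()):
        return False
    # Directory that holds the sitemap, normalized to end with "/".
    base_dir = posixpath.dirname(sm.path).rstrip("/") + "/"
    return (pg.path or "/").startswith(base_dir)


# A sitemap at the root covers the whole host.
print(in_sitemap_scope("https://www.devkuma.com/sitemap.xml",
                       "https://www.devkuma.com/docs/java/static/"))  # True
# A sitemap under /docs/ does not cover /blog/ (hypothetical path).
print(in_sitemap_scope("https://www.devkuma.com/docs/sitemap.xml",
                       "https://www.devkuma.com/blog/"))  # False
```

Placing the file at the root makes `base_dir` equal to `/`, which is why that location covers every URL on the host.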
Sitemap XML Format
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.devkuma.com/docs/java/static/</loc>
<lastmod>2022-04-03T20:41:00+09:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.5</priority>
</url>
</urlset>
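| Tag | Required/Optional | Description |
|---|---|---|
| `<urlset>` | Required | Wraps the whole document and declares the protocol standard (namespace) in use. |
| `<url>` | Required | Parent tag for each URL entry. The remaining tags are its children. |
| `<loc>` | Required | Page URL. It must begin with the protocol (such as https), end with a trailing slash if the web server requires one, and be shorter than 2,048 characters. |
| `<lastmod>` | Optional | Date the page was last modified, in W3C Datetime format (for example, 2022-04-03 or 2022-04-03T20:41:00+09:00). |
| `<changefreq>` | Optional | How often the page is likely to change. Crawlers treat it as a hint, not a command. |
| `<priority>` | Optional | Priority of this URL relative to other URLs on the same site, from 0.0 to 1.0. The default is 0.5. Do not set high priority for every URL on the site. |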
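As a minimal sketch of how these tags fit together, the following builds a sitemap with Python's standard `xml.etree.ElementTree` module. The function name `build_urlset` and the dictionary-based input format are assumptions made for this example, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build sitemap XML from a list of dicts.

    Each dict needs a 'loc' key; 'lastmod', 'changefreq', and 'priority'
    are optional and are written only when present.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        loc = entry.get("loc")
        if not loc:
            raise ValueError("every entry needs a 'loc' value")
        if len(loc) >= 2048:
            raise ValueError("'loc' must be shorter than 2,048 characters")
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_urlset([{
    "loc": "https://www.devkuma.com/docs/java/static/",
    "lastmod": "2022-04-03T20:41:00+09:00",
    "changefreq": "monthly",
    "priority": 0.5,
}]))
```

Using an XML library rather than string concatenation means characters such as `&` in a URL are escaped automatically.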
Page update frequency (changefreq) values:
- always: the content changes every time the page is accessed
- hourly: changes about once per hour
- daily: changes about once per day
- weekly: changes about once per week
- monthly: changes about once per month
- yearly: changes about once per year
- never: for archived pages that are not expected to change; crawlers may still revisit them periodically
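Because changefreq accepts only the seven values above and priority is limited to the range 0.0 to 1.0, both can be checked before a sitemap is written. The helper below is a hypothetical sketch of such a check; the name `check_hints` and the message wording are assumptions of this example.

```python
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}


def check_hints(changefreq=None, priority=None):
    """Return a list of problems found in the optional hint values.

    An empty list means both values are acceptable (or were omitted).
    """
    problems = []
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"unknown changefreq: {changefreq!r}")
    if priority is not None:
        try:
            in_range = 0.0 <= float(priority) <= 1.0
        except (TypeError, ValueError):
            # Not a number at all, so it cannot be a valid priority.
            in_range = False
        if not in_range:
            problems.append(f"priority must be a number from 0.0 to 1.0: {priority!r}")
    return problems


print(check_hints("monthly", 0.5))    # []
print(check_hints("sometimes", 1.5))  # two problems reported
```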
When Using Multiple Sitemap Files
A single sitemap file can hold at most 50,000 URLs and must be no larger than 50 MB uncompressed. If the site exceeds either limit, split the URLs across multiple sitemaps, then create a sitemap index file that tells crawlers where each sitemap is.
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.devkuma.com/sitemap1.xml.gz</loc>
<lastmod>2022-12-06T01:57:17+09:00</lastmod>
</sitemap>
<sitemap>
<loc>https://www.devkuma.com/sitemap2.xml.gz</loc>
<lastmod>2021-01-01</lastmod>
</sitemap>
</sitemapindex>
| Tag | Required/Optional | Description |
|---|---|---|
| `<loc>` | Required | URL of the sitemap file. |
| `<lastmod>` | Optional | Date the sitemap file was last modified. |
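The splitting step can be sketched as follows, assuming the full URL list is already in memory. The function names, the generated page URLs, and the `sitemapN.xml` naming are assumptions made for this example, and the sketch leaves out writing and compressing the individual sitemap files.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split urls into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build sitemap index XML that points at each sitemap URL."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical site with 120,000 pages, which needs three sitemap files.
urls = [f"https://www.devkuma.com/page/{i}/" for i in range(120_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index(
    [f"https://www.devkuma.com/sitemap{n}.xml" for n in range(1, len(chunks) + 1)]
)
print(len(chunks))  # 3
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be submitted to the search engine.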