WordPress Archive Pages
WordPress automatically creates archive pages, typically based on taxonomies (such as categories and tags), authors, and dates. The archive pages that may be created on your site are listed below:
- Category archives
- Tag archives
- Author archives
- Date archives
- Post format archives (if your WordPress theme supports it)
- Search archives (search results page)
By default, search engines can access every one of these pages. Archive pages can cause SEO issues and harm your rankings for several reasons:
- Duplicate Content – if every category archive shows an excerpt (or the full text) of posts that already exist elsewhere on the site, that is duplicate content
- Thin Content – since the Panda updates, Google has targeted sites that are full of low quality content, and archive pages with little unique content of their own can fall into that bucket
- Keyword Ranking – archive pages can rank for the keywords you are targeting, competing with the pages you actually want to rank for those terms
- Link Juice – links pointing to archive pages reduce the potential link juice that reaches more important pages
- Topic Indication – if you’re using date based archives, then the h1 on the page is the date, e.g. “June 2011”, and therefore the “topic indicator” that the archive page passes down to the single posts is also “June 2011”. The same goes for author archives: by default, their page title is the author’s name
Duplicate content means that the same or very similar content appears at multiple locations (URLs) on the web. For example, your article about ‘keyword x’ appears at http://www.example.com/keyword-x/ and the same content also appears at http://www.example.com/article-category/keyword-x/.
Do you need them?
- Are they truly valuable to users?
- Do you actively link to them internally?
- Are they part of your users’ navigation?
- Do they contain unique, customised content?
Solving
Do not block these pages in robots.txt. If you block them, Google cannot crawl them, so it won’t see when you update or change them (including any noindex directive you add), but the URLs can still remain in search results with an ugly listing that has no description.
Either:
- If the pages add no value, delete them entirely and serve a 404 error status
- If the pages are important for users to navigate and are a “necessary evil” of having a blog, then they should be noindexed. This means that your archives stay accessible to visitors, but you tell Google not to index those pages. Note: you should noindex via robots meta tags, not in robots.txt
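As a rough illustration of the noindex option, the sketch below uses WordPress core's wp_robots filter (available since WordPress 5.7) to add a noindex directive to date and author archives. It assumes the code lives in your theme's functions.php or a small plugin, and that no SEO plugin is managing the robots meta tag itself; if a plugin such as Yoast is active, its own settings are usually the better place to do this. The choice of date and author archives is only an example.

```php
<?php
// Sketch only: mark date and author archives as noindex via the core
// wp_robots filter. Assumes WordPress 5.7+ and that no SEO plugin is
// overriding the robots meta tag.
add_filter( 'wp_robots', function ( array $robots ) {
    // is_date() and is_author() are core conditional tags that are true
    // on date-based and author archive pages respectively.
    if ( is_date() || is_author() ) {
        $robots['noindex'] = true; // ask search engines not to index the page
        $robots['follow']  = true; // but still let them follow its links
    }

    return $robots;
} );
```

The pages stay reachable for visitors and crawlers; only the robots meta tag in the page head changes. Other conditional tags such as is_tag() or is_category() could be added to the condition in the same way.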
Recommendation
Whilst there are exceptions, the following recommendations suit the vast majority of (non e-commerce) WordPress sites:
- Choose either category archives or tag archives as your main archive from an SEO and navigation perspective, and allow only this ONE archive type to be indexed
- Name your categories based on the words you’d like them to rank for
- Limit your primary taxonomy to a maximum of 8-10 categories, so that each archive holds enough posts to avoid thin content
- Any other categories or tags should be noindexed
- Do not link to date based archives
- Create custom content for the top of all your category based archives
- If none of your category or tag archive pages add value, go into Yoast or your SEO plugin and set the entire taxonomy type to noindex
Custom Post Types
An archive is created for a custom post type (CPT) if its arguments include 'has_archive' => true when you register it with register_post_type(): https://codex.wordpress.org/Function_Reference/register_post_type
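A minimal sketch of such a registration is shown below. The post type name book and the label Books are hypothetical and used only for illustration; the code is assumed to live in a plugin or the theme's functions.php.

```php
<?php
// Sketch only: register a hypothetical "book" post type with an archive.
// The name "book" and label "Books" are illustrative assumptions.
add_action( 'init', function () {
    register_post_type( 'book', array(
        'label'       => 'Books',
        'public'      => true,
        'has_archive' => true, // enables the post type archive page
    ) );
} );
```

With pretty permalinks enabled, the archive would normally be served under the post type's slug (here /book/). After adding a new post type you may need to re-save Settings > Permalinks so the rewrite rules are refreshed. If the archive adds no value, set 'has_archive' to false instead.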
When 'has_archive' => true is set, go to SEO > Titles & Metas > Post Types > Custom Post Type Archives > your Custom Post Type in Yoast (in newer versions: SEO > Search Appearance > Content Types), where you can enter the Title and Meta Description for that custom post type’s archive.
Sometimes you don’t want search engines indexing your category and tag archive pages, especially those with thin or duplicate content. Yoast and other multifunction SEO plugins make it simple to add noindex meta directives to these pages.
More information at https://www.searchenginejournal.com/noindex-category-other-listing-pages/255811/#close