What's a recrawl?
In a recrawl, we dispatch our web crawler to article URLs it’s already visited in order to update the metadata associated with those records. This is useful if, for example, your metadata was configured incorrectly, or if you’ve retroactively added richer metadata that you’d like to see reflected in your dashboard and API results.
Note that a recrawl on its own only corrects data going forward. To correct historical data, a rebuild of that data is necessary. More information on rebuilds is available here: https://www.parse.ly/help/post/4339/whats-a-rebuild/
Administrators can recrawl individual pages:
If you’re an admin for your site, you can submit individual URLs to be recrawled. Do so from dash.parsely.com/(your API key)/settings/api/
Trigger a new article crawl from the Parsely API
You are also able to trigger an article recrawl through the Parsely API. If your CMS supports webhooks, you may be able to automate this process as well. More details and endpoint documentation are available here: https://www.parse.ly/help/integration/trigger-crawl.
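As a rough illustration of what an automated trigger might look like, the Python sketch below builds (but does not send) a recrawl request. Everything specific in it is an assumption rather than the documented API: the endpoint is a placeholder you would replace with the one given in the endpoint documentation linked above, and the parameter names (apikey, secret, url) and the use of POST are hypothetical stand-ins.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_recrawl_request(endpoint, api_key, secret, article_url):
    """Build an HTTP request asking for one article URL to be recrawled.

    NOTE: the parameter names and the POST method are assumptions made
    for illustration; check the endpoint documentation for the real ones.
    """
    # Form-encode the (hypothetical) parameters into the request body.
    body = urlencode(
        {"apikey": api_key, "secret": secret, "url": article_url}
    ).encode("utf-8")
    return Request(endpoint, data=body, method="POST")


# Placeholder endpoint: replace with the documented crawl endpoint.
request = build_recrawl_request(
    "https://example.invalid/crawl",
    "example.com",
    "s3cret",
    "https://example.com/a?b=1",
)
```

Once the endpoint and parameters match the documentation, the request could be sent with urllib.request.urlopen(request), for instance from a CMS webhook handler that fires whenever an article is updated.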
Contact firstname.lastname@example.org if you need to set up a bulk recrawl.
Note: If the article you are recrawling is less than 5 days old, its historical data will be fixed automatically by a process called a ‘rebuild’. If the article is more than 5 days old, contact email@example.com to have the data rebuilt manually.
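If you are scripting recrawls, the 5-day rule in the note above can be checked before deciding whether to request a manual rebuild. The sketch below is only an illustration: it assumes that an article's age is measured from its publication time, and it treats an article that is exactly 5 days old as needing a manual rebuild, since the note does not say which side that case falls on. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Window within which a recrawled article is rebuilt automatically.
AUTO_REBUILD_WINDOW = timedelta(days=5)


def needs_manual_rebuild(published_at, now=None):
    """Return True if a recrawled article falls outside the automatic window.

    Assumption: age is measured from the publication time. An article
    exactly 5 days old is conservatively treated as needing a manual rebuild.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - published_at >= AUTO_REBUILD_WINDOW
```

For example, an article published two days ago would return False (the rebuild happens automatically), while one published nine days ago would return True.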