Providing metadata

Specifying metadata on your webpages is the second step of integrating with Parse.ly. (If you don't have the tracker running yet, jump to the documentation about installing the tracking code.)

You can provide metadata in two different ways:

  • JSON-LD (recommended) - following open standards and schemas, metadata is included in a script tag. This metadata can also be used by other services (such as Google, for enhanced display in search listings).
  • Repeated meta tags (alternative) - if your CMS can already provide page information as meta tags in the page header, this may be the more convenient option. Metadata specified this way is used only by Parse.ly.

Note:

Parse.ly's crawlers don't execute JavaScript, so regardless of which metadata format you choose, the information must be accessible in the actual source of the page. For more, check out our detailed crawler information.
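To see what a crawler that doesn't run JavaScript can find, you can parse the raw HTML yourself. The following is a minimal sketch using only the Python standard library. It is not the Parse.ly crawler's implementation; the `JsonLdExtractor` class and `extract_jsonld` function are hypothetical names introduced for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every application/ld+json script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_jsonld(raw_html):
    """Return the JSON-LD objects found in raw, unrendered HTML."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    return parser.blocks


# Metadata present in the page source is found.
static_page = """<html><head>
<script type="application/ld+json">
  {"@context": "http://schema.org", "@type": "WebPage", "headline": "Home"}
</script>
</head><body></body></html>"""

# Metadata that JavaScript would add after load is absent from the source.
js_only_page = """<html><head>
<script>/* a script that would add the JSON-LD tag after load */</script>
</head><body></body></html>"""

print(extract_jsonld(static_page))
print(extract_jsonld(js_only_page))
```

Running the same function against the response body of a plain GET request for one of your own pages is a quick way to confirm that the metadata is in the actual source.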

About the JSON-LD tag

A json-ld script tag uses the JSON format to provide structured, standardized, and machine-readable information about a webpage, such as its author, publication date, title, and the section it belongs to. You may already have existing json-ld tags on your pages that you can modify to include the additional properties that Parse.ly requires.

If not, add a tag like the following example to the <head> element of tracked pages. The body of the tag should be properly formatted JSON. To understand how to customize the values for your site, continue to the detailed descriptions of each property below.

Example

<script type="application/ld+json">
  {
    "@context": "http://schema.org",
    "@type": "NewsArticle",
    "headline": "Zipf's Law of the Internet: Explaining Online Behavior",
    "url": "https://blog.parse.ly/post/57821746552",
    "thumbnailUrl": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
    "datePublished": "2013-08-15T13:00:00Z",
    "articleSection": "Programming",
    "creator": ["Alan Alexander Milne"],
    "keywords": ["statistics","zipf","internet","behavior"]
  }
</script>

Explanation of required properties

  • @context - The collection where the schema is defined. Always http://schema.org.
  • @type - The specific schema being used. For posts, we generally recommend NewsArticle. For non-post pages, use WebPage. For an explanation of the difference between the two, and additional alternatives, see the section on distinguishing between "posts" and "pages" below.
  • headline - Post or page title (article headline).
  • url - Canonical URL for the post or page. For page groups, like galleries, it should always point to the main page. For accurate data, canonical URLs specified in other metadata tags (such as <link rel="canonical"> and <meta property="og:url"> tags) must match or resolve to this URL. For more information, please refer to our documentation on shares integration.
  • thumbnailUrl - URL of the image associated with the post or page.
  • datePublished - Publication date, formatted as an ISO 8601 string in UTC (for example, 2013-08-15T13:00:00Z).
  • articleSection - Section the page belongs to (e.g. "Politics").
  • creator - Author of the post, provided either as a string or, for multi-author posts, as a list.
  • keywords - A list of tags associated with this post.

If some of these fields don't make sense for a particular page, consider whether it's better tracked as a page instead of a post.
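Before publishing a template, it can help to check a post's metadata against the required properties listed above. The sketch below is one way to do that, not an official validator. The names `REQUIRED_POST_PROPERTIES`, `missing_properties`, and `to_utc_iso8601` are hypothetical, and the date helper assumes you start from a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

# The required properties described in this section.
REQUIRED_POST_PROPERTIES = (
    "@context", "@type", "headline", "url", "thumbnailUrl",
    "datePublished", "articleSection", "creator", "keywords",
)


def missing_properties(metadata):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED_POST_PROPERTIES if not metadata.get(name)]


def to_utc_iso8601(moment):
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A post published at 09:00 in a UTC-4 timezone.
published = datetime(2013, 8, 15, 9, 0, tzinfo=timezone(timedelta(hours=-4)))
print(to_utc_iso8601(published))  # prints 2013-08-15T13:00:00Z

incomplete = {"@context": "http://schema.org", "@type": "NewsArticle"}
print(missing_properties(incomplete))
```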

Technical Caveats

  • Escape double quotes in JSON item values. All double quotes in text should be escaped with a backslash symbol like this: \". For example, "headline": "Governor Claims Veto was \"Necessary\" During Summit".
  • Values in json-ld will appear literally inside Parse.ly Analytics. String values supplied here, specifically headline, creator, and articleSection, will appear in Parse.ly analytics exactly as they are specified in the tag. As a result, make sure to use proper capitalization and specify the values as you expect them to appear.
  • The json-ld script tag cannot be loaded asynchronously. The Parse.ly crawler will not execute JavaScript. It must be able to access the metadata tag from the results of a single GET request.
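If your templates are rendered server-side, one way to avoid hand-escaping quotes is to let a JSON serializer build the tag body. The following is a sketch under that assumption; `render_jsonld_tag` is a hypothetical helper, not part of any Parse.ly library.

```python
import json


def render_jsonld_tag(metadata):
    """Serialize metadata into a JSON-LD script tag for a server-side template."""
    # json.dumps escapes double quotes inside string values as \".
    body = json.dumps(metadata, indent=2, ensure_ascii=False)
    # Keep a literal "</script>" inside a value from closing the tag early.
    # "\/" is a valid JSON escape for "/", so the parsed value is unchanged.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


tag = render_jsonld_tag({
    "@context": "http://schema.org",
    "@type": "NewsArticle",
    "headline": 'Governor Claims Veto was "Necessary" During Summit',
})
print(tag)
```

Because the tag is produced as part of the HTML response, it is available from a single GET request. The values are written exactly as supplied, so capitalization still needs to be correct at the source.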

Note: standards compliance

All the properties above come from the schema.org NewsArticle schema, making the example JSON-LD tag fully standards-compliant. To keep integration as simple as possible, we've included only the properties that the Parse.ly crawler actually uses. But there are many other valid schema properties you may also choose to include, and that other services recommend or require. Scroll down to the additional examples to see a json-ld tag that also includes the additional properties Google recommends.

Distinguishing between "posts" and "pages"

When collecting metadata, Parse.ly distinguishes between webpages that contain a single piece of content (which we refer to as "posts") and those that don't (homepages, index pages, section pages, etc.), based on the @type property specified.

@type values that Parse.ly recognizes as posts

While NewsArticle is the preferred @type value for posts, Parse.ly can also accommodate other types:

If a page contains multiple json-ld blocks with these @type values, the Parse.ly crawler will preferentially choose the type that's higher on the list. For example, if both Article and Review blocks are present on a page, we will collect the values from the Article block.
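The selection rule can be sketched as a lookup against an ordered list. Note that the ordering below is an assumption made for illustration: the text here confirms only that NewsArticle is preferred and that Article outranks Review. `ASSUMED_POST_TYPE_PRIORITY` and `choose_post_block` are hypothetical names, and this is not the crawler's actual code.

```python
# Hypothetical ordering for illustration only. Substitute the full list of
# recognized post types, in order, for your own checks.
ASSUMED_POST_TYPE_PRIORITY = ["NewsArticle", "Article", "Review"]


def choose_post_block(blocks, priority=ASSUMED_POST_TYPE_PRIORITY):
    """Return the block whose @type ranks highest, or None if none match."""
    candidates = [block for block in blocks if block.get("@type") in priority]
    if not candidates:
        return None
    return min(candidates, key=lambda block: priority.index(block["@type"]))


blocks = [
    {"@type": "Review", "headline": "From the Review block"},
    {"@type": "Article", "headline": "From the Article block"},
]
print(choose_post_block(blocks)["headline"])  # prints From the Article block
```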

@type values that Parse.ly recognizes as non-post pages

While we expect posts to include all the properties in the main example above, not all properties may be relevant on non-post pages (see example below).

Non-post page example

<script type="application/ld+json">
  {
    "@context": "http://schema.org",
    "@type": "WebPage",
    "headline": "Category: Analytics That Matter",
    "url": "https://blog.parse.ly/post/category/analytics-that-matter/"
  }
</script>

Additional JSON-LD tag examples

With additional properties Google recommends for enhanced display in search listings

You can check your own implementation with the Google Structured Data Testing Tool.

<script type="application/ld+json">
  {
    "@context": "http://schema.org",
    "@type": "NewsArticle",
    "headline": "Zipf's Law of the Internet: Explaining Online Behavior",
    "url": "https://blog.parse.ly/post/57821746552",
    "thumbnailUrl": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
    "image": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
    "dateCreated": "2013-08-10T01:25:08Z",
    "datePublished": "2013-08-10T01:25:08Z",
    "dateModified": "2013-08-10T01:25:08Z",
    "articleSection": "Programming",
    "creator": ["Alan Alexander Milne"],
    "author": ["Alan Alexander Milne"],
    "keywords": ["data", "intern", "parse.ly"],
    "mainEntityOfPage": {
      "@type": "WebPage",
      "@id": "https://blog.parse.ly/post/57821746552"
    },
    "publisher": {
      "@type": "Organization",
      "name": "Parse.ly",
      "logo": {
        "@type": "ImageObject",
        "url": "https://www.parse.ly/help/wp-content/uploads/2016/04/logo-parsely-help-254.png"
      }
    }
  }
</script>

Note that some of these properties may have overlapping values. Here is how they're resolved by our crawler:

  • Parse.ly preferentially uses datePublished, rather than dateCreated, if both are present.
  • Parse.ly uses thumbnailUrl, but not image.
  • For author, creator, and contributor properties, Parse.ly will combine all the unique values into a single list.
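The three rules above can be sketched as follows. This is an illustration, not the crawler's implementation: `resolve_overlaps` and the output keys `pub_date`, `image_url`, and `authors` are hypothetical names. The sketch also assumes author values are plain strings or lists of strings, and the order in which the three author properties are combined is an assumption.

```python
def as_list(value):
    """Normalize a missing value, a single string, or a list into a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def resolve_overlaps(metadata):
    """Apply the overlap rules to one JSON-LD object."""
    authors = []
    for key in ("author", "creator", "contributor"):
        for name in as_list(metadata.get(key)):
            if name not in authors:  # keep unique values only
                authors.append(name)
    return {
        # datePublished wins over dateCreated when both are present.
        "pub_date": metadata.get("datePublished") or metadata.get("dateCreated"),
        # thumbnailUrl is used; image is ignored.
        "image_url": metadata.get("thumbnailUrl"),
        "authors": authors,
    }


resolved = resolve_overlaps({
    "thumbnailUrl": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
    "image": "https://blog.parse.ly/inline_mra670hTvL1qz4rgp.png",
    "dateCreated": "2013-08-10T01:25:08Z",
    "datePublished": "2013-08-15T13:00:00Z",
    "creator": ["Alan Alexander Milne"],
    "author": ["Alan Alexander Milne"],
})
print(resolved)
```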

We would also like to echo Google's advice on structured data:

...it is more important to supply fewer but complete and accurate recommended properties rather than trying to provide every possible recommended property with less complete, badly-formed, or inaccurate data.
