Evolving Parse.ly’s API

Our company is well-known for our real-time and historical content analytics dashboard, but Parse.ly’s data is also widely used for automatic content optimization on some of the top sites across the web, via our rich API. This post covers how we’re changing the API in early 2016.

Summary for customers

  • The API is changing in a way that will not break your existing calls.
  • The migration will happen smoothly and without service interruption.
  • The new API is already serving some production traffic via the /realtime endpoint.
  • Some unused API endpoints are being removed altogether in 2016.
  • Some API parameters and response values are being deprecated in 2016.
  • The API is being made more consistent throughout, for a better developer experience.

What does the Parse.ly API do?

Some of the common uses of Parse.ly data include:

  • trending content modules on web homepages and article sidebars
  • direct integration of Parse.ly data with a home-grown content management system (CMS)
  • nightly exports of data, perhaps to produce reports for advertisers or editorial
  • article-to-article content recommendations to optimize reader re-circulation
  • personalized content recommendations to build better relationships with loyal readers
  • integration with backend ad serving systems and predictive models

What’s the scale of the API?

We did some recent calculations and learned that about 40% of Parse.ly’s customers make use of our API. We also learned that we get nearly 2 billion API calls per month from our customers, and that in the last 18 months we’ve seen over 20 billion API calls altogether. Not bad for something that started as an experiment only 2 years ago.

With our dashboard relaunch this year, we have been moving our API servers over to our new Mage-powered analytics backend. For the most part, this transition will not affect existing API users. While moving the API over, though, we also wanted to clean up some experimental endpoints, fix consistency issues, and deprecate endpoints that are not widely used.

So, what’s changing?

Date Filtering

There are some standard query parameters that the API supports for filtering down the data you export, especially in our analytics data. For example, posts can be filtered by traffic period (period_start, period_end, days) or by publication date (pub_date_start, pub_date_end, pub_days).

However, we noticed that not every endpoint that returns data based on posts supported all of these query parameters. This was due to some quirks of our old backend, but the new backend offers consistent date-based filtering.

So, when the new Mage-powered API goes live in 2016, every endpoint that supports date filtering will accept all of these parameters.
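
To make the two kinds of date filter concrete, here is a minimal sketch of building such queries in Python. The parameter names (period_start, period_end, pub_days) come from this post; the base URL, the /analytics/posts path, the apikey parameter, and the YYYY-MM-DD date format are assumptions made for illustration, so check them against the API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base URL and credential parameter, for illustration only.
BASE_URL = "https://api.parsely.com/v2"


def build_url(endpoint, apikey, **filters):
    """Build a query URL, skipping any filter that was not provided."""
    params = {"apikey": apikey}
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}{endpoint}?{urlencode(params)}"


# Filter by traffic period: posts that received traffic in the window.
by_traffic = build_url(
    "/analytics/posts",  # hypothetical endpoint path
    apikey="example.com",
    period_start="2016-01-01",  # assumed date format
    period_end="2016-01-31",
)

# Filter by publication date: posts published in the last 7 days.
by_pub_date = build_url("/analytics/posts", apikey="example.com", pub_days=7)
```

The point of the change is that either style of filter should be usable on any endpoint that accepts date filtering, so a helper like this would not need per-endpoint special cases.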

Author/Section/Tag Filtering

Some endpoints supported only section filtering; others supported section and tag, but not author. Again, these were quirks of the old backend. The new API will support author, section, and tag filtering everywhere that post-based filtering is supported, which includes most of our real-time and historical data API endpoints.

What’s more, in our old API, the real-time endpoints were especially limited in filtering options. Customers widely requested improvements here, but we couldn’t deliver them due to limitations of the old backend. I’m happy to say that a full set of filtering options is now available for real-time data!
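
As a rough illustration of combining these filters, the sketch below assembles a query string from whichever of the three filters a caller sets. The parameter names author, section, and tag are assumed to mirror the filter names used in this post, and the values are made up.

```python
from urllib.parse import urlencode


def filter_params(author=None, section=None, tag=None):
    """Return only the filters that were set, as a query-string dict."""
    candidates = {"author": author, "section": section, "tag": tag}
    return {name: value for name, value in candidates.items() if value}


# One author within one section; the tag filter is left out entirely.
query = urlencode(filter_params(author="Jane Doe", section="Politics"))
```

Because the same three filters are meant to work wherever post-based filtering is supported, the resulting query string could be appended to a real-time or a historical request alike.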

Better Match for Dashboard Data

Our new dashboard launched in early 2015, but our API was still powered by our old backend. This meant that customers often came across confusing differences between our dashboard and our API.

The numbers reported by the new API will be a closer match to those reported in our dashboard. This will be especially true around real-time data and shares data.

That said, the API will not exactly match numbers in our dashboard. This is because the dashboard (and the reports/exports subsystem in the dashboard) always makes a strong effort to get “the most exact counts possible”, whereas the API occasionally accepts a small error rate (1-2% typically) in the name of performance/speed. If you are interested in programmatically exporting exact counts of all your data, you should explore our raw data export options by reaching out to support@parsely.com.
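
If you compare API numbers against dashboard numbers in your own tooling, it may help to compare within a tolerance rather than expecting equality. The helper below is a hypothetical sketch; its name and the 2% default are illustrative choices based on the typical error rate mentioned above, not part of the API.

```python
def within_tolerance(api_count, dashboard_count, tolerance=0.02):
    """True if the API count is within `tolerance` of the dashboard count."""
    if dashboard_count == 0:
        return api_count == 0
    relative_error = abs(api_count - dashboard_count) / dashboard_count
    return relative_error <= tolerance


ok = within_tolerance(9_850, 10_000)       # 1.5% apart
too_far = within_tolerance(9_500, 10_000)  # 5% apart
```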

Changing the behavior of all-time post/URL lookups

Our API had one endpoint which returned “all-time data” when queried without any parameters. This was the /analytics/post/detail endpoint. It didn’t just return page views in the _hits field, but it also returned an estimated visitor count over all time.

We are changing the default behavior for this API call to return data from the past 90 days, instead. This was required due to the design of our new system, which no longer stores all-time counts for posts — instead, every statistic is calculated directly from all the historical data, within your account retention limits.

The good news is, it is a rare post that has traffic for longer than 90 days. Therefore, for 95% of posts, this endpoint will return the same value for our users in the _hits field. But, for evergreen posts, you’ll have to explicitly query for more data.

Does that mean you should preemptively change all your API calls to days=365 or days=730 to get around the query limit? No — if you do that, we may rate limit your account, since these queries are very expensive for our API to calculate. If you are looking to do bulk exports of your analytics data, we ask that you reach out to support@parsely.com to get access to our new raw data export options.
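
One way to follow this advice is to widen the window only for posts you already know are evergreen, and let everything else use the default. The sketch below assumes a url query parameter identifies the post and that your own code supplies the evergreen flag; both are illustrative, not documented behavior.

```python
DEFAULT_DAYS = 90  # default window described in this post


def detail_params(url, evergreen=False, days=None):
    """Query parameters for a /analytics/post/detail lookup.

    Only evergreen posts get an explicit, wider `days` window; all other
    lookups rely on the 90-day default, which keeps queries cheap.
    """
    params = {"url": url}  # `url` as the lookup parameter is an assumption
    if evergreen and days and days > DEFAULT_DAYS:
        params["days"] = days
    return params


regular = detail_params("http://example.com/news-story")
evergreen = detail_params("http://example.com/guide", evergreen=True, days=365)
```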

Deprecating the visitors field

The other problematic part of this endpoint is that it returns data in the visitors field, whether API users ask for it or not. For similar reasons as above, our new API cannot return an all-time visitor count easily within the tight response times (e.g. <1s) needed for on-site integrations. Therefore, this value will also be limited to 90 days by default in the new release of our API.

We are also deprecating the visitors key in this release, along with sort options in our API that specify visitors as the primary sort. That is, we will remove these options from our documentation and discourage users from relying on them. We may remove the data in a future release, probably once it becomes available in a new API.
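
In practice, client code can prepare for this by ranking on page views and treating visitor data as optional. Here is a minimal sketch; the record shape is made up for illustration, and only the _hits and visitors field names come from this post.

```python
def top_posts(records, limit=3):
    """Rank post records by page views, ignoring deprecated visitor data."""
    ranked = sorted(records, key=lambda r: r.get("_hits", 0), reverse=True)
    return [r["url"] for r in ranked[:limit]]


# Made-up records; the second one carries no visitor data at all.
sample = [
    {"url": "/a", "_hits": 120, "visitors": 80},
    {"url": "/b", "_hits": 300},
    {"url": "/c", "_hits": 45, "visitors": 40},
]

top_two = top_posts(sample, limit=2)
```

Code written this way keeps working whether or not the visitor data is present in a given response.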

We are going to reserve multi-metric support and multi-metric sorting for a future release of our API, which will add support for all of our metrics — that is, not just views, but also engaged time, segmented views, and some visitor-based data, as well. But that will come later and will be driven by customer feedback, like everything we do here at Parse.ly!

Removing the /editorial_overrides endpoint

The /editorial_overrides endpoint was an experiment in providing more control of our content recommendations. It ended up not being very popular with customers because it was a bit tricky to set up and manage.

We recently completed a partnership with Optimizely for headline testing, and we believe editorial overrides can now be better satisfied through Optimizely-based controls of our recommendation API results, and through CMS tools built by our agency partners. We have decided that in the interest of focusing on what we do best, we’ll remove this endpoint soon.

Removing API click tracking (aka clicks.parsely.com)

This was an experiment that simply failed. It was not widely used, and the methodology was flawed, so we’ll simply be removing support for this in the 2016 release of the API.

This is again something we can help you instrument on your site via our agency partners. Note that though we will keep the clicks.parsely.com domain alive indefinitely, we will stop rewriting links in the API to use it.

How will I know if I’m impacted by these changes?

Parse.ly keeps all access logs for its APIs, and our testing methodology is to systematically test new builds of our API over all historical calls that have been made by customers. As a result, if we discover that one or more of the changes we are making here will impact a customer account in any way, we will have our support team reach out to our admin contact for that account to warn about the changes.
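
The general idea of replaying logged calls against two builds can be sketched as follows. This is an illustration of the technique only, not our actual test harness: the backends are stand-in functions, the logged paths are invented, and the 404 for a removed endpoint is a pretend response.

```python
def replay(calls, old_backend, new_backend):
    """Return the historical calls whose responses differ between backends."""
    mismatches = []
    for call in calls:
        old, new = old_backend(call), new_backend(call)
        if old != new:
            mismatches.append({"call": call, "old": old, "new": new})
    return mismatches


# Stand-in backends for illustration only.
def old_backend(call):
    return {"status": 200}


def new_backend(call):
    # Pretend a removed endpoint responds differently in the new build.
    removed = call.startswith("/editorial_overrides")
    return {"status": 404 if removed else 200}


logged_calls = ["/realtime/posts", "/editorial_overrides?x=1", "/analytics/posts"]
affected = [m["call"] for m in replay(logged_calls, old_backend, new_backend)]
```

Grouping the mismatches by account would then give the list of customers to contact.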

We will also start monitoring for use of deprecated endpoints and warn the customers who use them, so that they are prepared for when those endpoints go away in 2016.

Conclusion

Parse.ly runs one of the most widely used content analytics APIs in the world, and we are happy to be powering automatic content optimization for many of our customers. The changes described in this post will make the API more consistent and allow us to focus on the core of what we do best, while setting a strong foundation for us to deliver even better APIs in the future.

If you have any questions, please reach out directly to @ParselySupport or support@parsely.com!