Timeseries Decompositions with Web Activity Data (R)


Note: to complete this example you will need to install the AvenueAPI client for R.

In this guide we demonstrate how to download, plot, and decompose web traffic data using 7Park’s “Extension” API endpoint. To replicate this example you will need a valid 7Park Data API key (contact your 7Park sales representative to request a key).
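Since the example depends on a private key, it can help to keep that key out of the script itself. The sketch below reads it from an environment variable. Both the variable name AVENUE_API_KEY and the helper read_api_key are assumptions made for illustration; nothing here is part of the AvenueAPI client.

```r
# Hypothetical helper: read an API key from an environment variable.
# The variable name AVENUE_API_KEY is an assumption made for this sketch.
read_api_key <- function(var = "AVENUE_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage (set the variable in ~/.Renviron or your shell first):
# ave_key <- read_api_key()
```

Failing early with a clear message avoids sending an empty key to the API and debugging a less obvious authentication error later.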


Pulling Web Traffic Data from the Avenue API

To start, we use the fetch_traffic_series function to download data for google.com and youtube.com. We keep the raw JSON returns and later convert them into standard dataframes in both “long” and “wide” formats. The following code retrieves the raw JSON objects and saves them as structured lists in R:


# Load the AvenueAPI client
library(AvenueAPI)

# Uncomment this line and store your API key here
# ave_key <- 'YOUR_KEY_HERE' 

# This initiates an S4 Class object with associated methods for calling the API:
ave <- connect_avenue(api_key=ave_key) 

# Retrieve data for Google and YouTube, saving the JSON returns in the global environment.
goog   <- fetch_traffic_series(ave, domain = 'google.com',  platform = 'PC', start_date='2014-01-01', country_code = 'US')
yotb   <- fetch_traffic_series(ave, domain = 'youtube.com', platform = 'PC', start_date='2014-01-01', country_code = 'US')


The above commands create two structured data objects, named “goog” and “yotb” in this example, that hold the Google and YouTube web traffic estimates. These objects are nested R lists and contain, in addition to web traffic data, information about the API call, any success/failure messages, and a summary of available metrics. While this information provides a helpful orientation to the scope of the data object returned by the API, long and wide data formats are typically easier to work with for quick plotting and statistical analysis. With this in mind, the AvenueAPI package provides a generic function, transform_avenue_series, to convert the nested JSON returns into a long or wide dataframe. These transformations can be performed with the following code:


# Transform the JSON return into long and wide formats to facilitate quick analysis:
googlong <- transform_avenue_series(goog)
googwide <- transform_avenue_series(goog, wide = TRUE) 
yotbwide <- transform_avenue_series(yotb, wide = TRUE) 


The above calls return dataframes with the following format:


# Wide data


##         date visitors_total visitors_unique     domain platform
## 1 2014-01-01       1633.026         50.0885 google.com       PC
## 2 2014-01-02       1847.031         53.2880 google.com       PC
## 3 2014-01-03       1843.277         54.3325 google.com       PC
## 4 2014-01-04       1795.500         55.0882 google.com       PC
## 5 2014-01-05       1939.453         57.0284 google.com       PC
## 6 2014-01-06       2194.052         59.9700 google.com       PC
##   country_code
## 1           US
## 2           US
## 3           US
## 4           US
## 5           US
## 6           US


# Long data


##         date   value          metric     domain platform country_code
## 1 2014-01-01 50.0885 visitors_unique google.com       PC           US
## 2 2014-01-02 53.2880 visitors_unique google.com       PC           US
## 3 2014-01-03 54.3325 visitors_unique google.com       PC           US
## 4 2014-01-04 55.0882 visitors_unique google.com       PC           US
## 5 2014-01-05 57.0284 visitors_unique google.com       PC           US
## 6 2014-01-06 59.9700 visitors_unique google.com       PC           US
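To make the relationship between the two layouts concrete, the sketch below stacks the metric columns of a small wide dataframe into the long layout using base R only. The values are copied from the first three rows of the output above. This is an illustration of the reshaping idea, not a description of how transform_avenue_series works internally.

```r
# Small wide dataframe; values copied from the first rows shown above
wide <- data.frame(
  date            = as.Date(c("2014-01-01", "2014-01-02", "2014-01-03")),
  visitors_total  = c(1633.026, 1847.031, 1843.277),
  visitors_unique = c(50.0885, 53.2880, 54.3325),
  domain          = "google.com",
  platform        = "PC",
  country_code    = "US",
  stringsAsFactors = FALSE
)

metric_cols <- c("visitors_total", "visitors_unique")

# Stack each metric column into (date, value, metric, ...) rows
long <- do.call(rbind, lapply(metric_cols, function(m) {
  data.frame(
    date         = wide$date,
    value        = wide[[m]],
    metric       = m,
    domain       = wide$domain,
    platform     = wide$platform,
    country_code = wide$country_code,
    stringsAsFactors = FALSE
  )
}))

print(long)
```

The long layout has one row per date and metric, which suits grouped plotting; the wide layout has one row per date, which suits building a time series from a single column.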


Also note that, once transformed using transform_avenue_series, the data are properly typed:




## 'data.frame':    1304 obs. of  6 variables:
##  $ date           : Date, format: "2014-01-01" "2014-01-02" ...
##  $ visitors_total : num  1633 1847 1843 1796 1939 ...
##  $ visitors_unique: num  50.1 53.3 54.3 55.1 57 ...
##  $ domain         : chr  "google.com" "google.com" "google.com" "google.com" ...
##  $ platform       : chr  "PC" "PC" "PC" "PC" ...
##  $ country_code   : chr  "US" "US" "US" "US" ...
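Before building time series objects it is worth confirming two things: that the columns have the expected types, and that the dates form a regular daily sequence without gaps. The following is a minimal sketch on a hand-built stand-in dataframe (values copied from the output above); with real data, the same checks would be run against googwide.

```r
# Stand-in for a wide dataframe; values copied from the output shown earlier
wide <- data.frame(
  date            = seq(as.Date("2014-01-01"), by = "day", length.out = 6),
  visitors_unique = c(50.0885, 53.2880, 54.3325, 55.0882, 57.0284, 59.9700),
  stringsAsFactors = FALSE
)

# First class of each column, as a named character vector
col_classes <- vapply(wide, function(col) class(col)[1], character(1))
print(col_classes)

# In a regular daily series, consecutive dates differ by exactly one day
day_gaps <- as.numeric(diff(wide$date))
is_regular_daily <- all(day_gaps == 1)

# Stop with an error if any assumption is violated
stopifnot(
  inherits(wide$date, "Date"),
  is.numeric(wide$visitors_unique),
  is_regular_daily
)
```

A gap in the dates matters later because ts objects assume equally spaced observations and do not store the dates themselves.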


Building Time Series Objects

Having downloaded US data for the YouTube and Google domains, we can now convert the wide dataframes into XTS and TS objects to facilitate further analysis. Here, we build XTS objects for daily unique visitors and plot the series:



# Load xts, which provides the xts() constructor used below
library(xts)

# Google.com
goog_xts <- xts(googwide$visitors_unique, order.by = googwide$date)



# YouTube
yotb_xts <- xts(yotbwide$visitors_unique, order.by = yotbwide$date)
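The plotting calls are not shown here. As a minimal sketch, an xts object can be drawn with its default plot method. The example uses six Google values copied from the output earlier in this guide as a stand-in; the object name goog_demo and the plot title are chosen for illustration.

```r
library(xts)

# Stand-in series: six daily values copied from the wide output shown earlier
dates <- seq(as.Date("2014-01-01"), by = "day", length.out = 6)
goog_demo <- xts(
  c(50.0885, 53.2880, 54.3325, 55.0882, 57.0284, 59.9700),
  order.by = dates
)

# Inspect the first rows, then draw the series with the xts plot method
print(head(goog_demo, 3))
plot(goog_demo, main = "google.com daily unique visitors (PC, US)")
```

With the full objects, the equivalent calls would be plot(goog_xts) and plot(yotb_xts).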



Decomposition of the Series

In the code snippet that follows, we illustrate one approach to timeseries decomposition using the decompose function in base R. First, we create ts objects for Google and YouTube:


# decimal_date() comes from lubridate; index() is available once xts is loaded
library(lubridate)

goog_ts <- ts(goog_xts, start=decimal_date(min(index(goog_xts))), frequency = 365)


# Confirm the class of the new object
class(goog_ts)


## [1] "ts"


yotb_ts <- ts(yotb_xts, start=decimal_date(min(index(yotb_xts))), frequency = 365)
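The start and frequency arguments determine how ts lays observations on a yearly grid. The sketch below shows a base R way to express the same start without lubridate, on a synthetic placeholder series (not real traffic data). It treats every year as 365 days, which is an approximation in leap years.

```r
# First observation date of the series
first_day   <- as.Date("2014-01-01")
year        <- as.integer(format(first_day, "%Y"))
day_of_year <- as.integer(format(first_day, "%j"))

# Synthetic placeholder values covering two 365-day cycles
x <- seq(50, 60, length.out = 730)

# start = c(year, day_of_year) places the first value at that day of the year
demo_ts <- ts(x, start = c(year, day_of_year), frequency = 365)

print(start(demo_ts))
print(frequency(demo_ts))

# Number of seasonal cycles in the series; decompose() needs at least 2
n_cycles <- length(demo_ts) / frequency(demo_ts)
print(n_cycles)
```

With frequency = 365 the seasonal component captures a yearly pattern, so the series must span at least two years for decompose to run.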


Having saved ts objects into our session for Google and YouTube, we decompose the series into component parts (trend, seasonal, and random) with a call to the decompose function of stats.

First, we plot the Google series using the default additive decomposition method, wrapping the result in its default plot method:
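The command is not shown here; with the objects built above it would presumably be plot(decompose(goog_ts)). The following sketch runs the same two steps on a synthetic series that stands in for goog_ts, so that it can be executed without an API key. The simulated trend, yearly cycle, and noise level are arbitrary choices for illustration.

```r
set.seed(42)

# Synthetic daily series: linear trend + yearly cycle + noise
n     <- 1304
day   <- seq_len(n)
sim   <- 50 + 0.01 * day + 5 * sin(2 * pi * day / 365) + rnorm(n, sd = 1)
sim_ts <- ts(sim, start = c(2014, 1), frequency = 365)

# decompose() defaults to type = "additive": x = trend + seasonal + random
dec <- decompose(sim_ts)

# The default plot method draws observed, trend, seasonal, and random panels
plot(dec)
```

The trend is estimated with a centered moving average, so it is NA at both ends of the series, and the random component is NA at the same positions.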





Similarly, we plot the YouTube series with the following command:
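The command is missing from this copy; by analogy with the Google example it would presumably be plot(decompose(yotb_ts)). The sketch below applies that pattern to a synthetic stand-in series (not YouTube data) and also shows how the individual components can be pulled out of the result for further analysis.

```r
set.seed(7)

# Synthetic three-year daily series standing in for yotb_ts
n      <- 3 * 365
day    <- seq_len(n)
sim_ts <- ts(
  100 + 0.02 * day + 8 * cos(2 * pi * day / 365) + rnorm(n, sd = 2),
  start = c(2014, 1), frequency = 365
)

# Additive decomposition, stated explicitly this time
sim_dec <- decompose(sim_ts, type = "additive")
plot(sim_dec)

# The result is a list; its components can be used directly
print(names(sim_dec))
seasonal_range <- range(sim_dec$seasonal)           # size of the yearly swing
trend_complete <- sim_dec$trend[!is.na(sim_dec$trend)]  # trend without edge NAs
print(seasonal_range)
```

Comparing seasonal_range across domains gives a quick sense of which series has the stronger yearly cycle, provided both are measured in the same units.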