Timeseries Decompositions with Web Activity Data (R)
Note: to complete this example you will need to install the AvenueAPI client for R.
In this guide we demonstrate how to download, plot, and decompose web traffic data using 7Park’s “Extension” API endpoint. To replicate this example you will need a valid 7Park Data API key (contact your 7Park sales representative to request a key).
To start, we use the fetch_traffic_series function to quickly download and restructure data for google.com and YouTube, saving the JSON returns as well as standard dataframes in both “long” and “wide” formats. We can retrieve the raw JSON objects and save them as structured lists in R with the following code:
# Load the AvenueAPI client
library(AvenueAPI)

# Uncomment this line and store your API key here
# ave_key <- 'YOUR_KEY_HERE'

# This initiates an S4 Class object with associated methods for calling the API:
ave <- connect_avenue(api_key = ave_key)

# Retrieve data for Google and YouTube, saving the JSON returns in the global environment.
goog <- fetch_traffic_series(ave, domain = 'google.com', platform = 'PC',
                             start_date = '2014-01-01', country_code = 'US')
yotb <- fetch_traffic_series(ave, domain = 'youtube.com', platform = 'PC',
                             start_date = '2014-01-01', country_code = 'US')
The above commands create two structured data objects, named “goog” and “yotb” in this example, holding web traffic estimates for Google and YouTube. These objects are nested R lists and contain, in addition to the web traffic data, information about the API call, any success/failure messages, and a summary of available metrics. This information is a helpful orientation to the scope of the object returned by the API, but long and wide data formats are typically easier to work with for quick plotting and statistical analysis. With this in mind, the AvenueAPI package provides a generic function, transform_avenue_series, to convert the nested JSON returns into a long or wide dataframe. These transformations can be performed with the following code:
# Transform the JSON return into long and wide formats to facilitate quick analysis:
googlong <- transform_avenue_series(goog)
googwide <- transform_avenue_series(goog, wide = TRUE)
yotbwide <- transform_avenue_series(yotb, wide = TRUE)
The above calls return dataframes with the following format:
# Wide data
head(googwide)

##         date visitors_total visitors_unique     domain platform country_code
## 1 2014-01-01       1633.026         50.0885 google.com       PC           US
## 2 2014-01-02       1847.031         53.2880 google.com       PC           US
## 3 2014-01-03       1843.277         54.3325 google.com       PC           US
## 4 2014-01-04       1795.500         55.0882 google.com       PC           US
## 5 2014-01-05       1939.453         57.0284 google.com       PC           US
## 6 2014-01-06       2194.052         59.9700 google.com       PC           US

# Long data
head(googlong)

##         date   value          metric     domain platform country_code
## 1 2014-01-01 50.0885 visitors_unique google.com       PC           US
## 2 2014-01-02 53.2880 visitors_unique google.com       PC           US
## 3 2014-01-03 54.3325 visitors_unique google.com       PC           US
## 4 2014-01-04 55.0882 visitors_unique google.com       PC           US
## 5 2014-01-05 57.0284 visitors_unique google.com       PC           US
## 6 2014-01-06 59.9700 visitors_unique google.com       PC           US
Also note that, once transformed using transform_avenue_series, the data are properly typed:
## 'data.frame':    1304 obs. of  6 variables:
##  $ date           : Date, format: "2014-01-01" "2014-01-02" ...
##  $ visitors_total : num  1633 1847 1843 1796 1939 ...
##  $ visitors_unique: num  50.1 53.3 54.3 55.1 57 ...
##  $ domain         : chr  "google.com" "google.com" "google.com" "google.com" ...
##  $ platform       : chr  "PC" "PC" "PC" "PC" ...
##  $ country_code   : chr  "US" "US" "US" "US" ...
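To make the relationship between the two layouts concrete, here is a minimal sketch in base R that reshapes a small long-format table into a wide one. It does not show how transform_avenue_series works internally. The toy data frame simply reuses the first three rows printed above, and the object names long and wide are illustrative.

```r
# Toy long-format table: one row per (date, metric) pair.
long <- data.frame(
  date   = rep(as.Date("2014-01-01") + 0:2, times = 2),
  metric = rep(c("visitors_unique", "visitors_total"), each = 3),
  value  = c(50.0885, 53.2880, 54.3325, 1633.026, 1847.031, 1843.277),
  stringsAsFactors = FALSE
)

# Spread the metric column into one column per metric (base R reshape).
wide <- reshape(long, idvar = "date", timevar = "metric", direction = "wide")

# reshape() prefixes the new columns with "value."; drop that prefix.
names(wide) <- sub("^value\\.", "", names(wide))

str(wide)
```

The wide layout has one row per date, which is the shape expected by the time series constructors used later in this guide.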
Having downloaded US data for the YouTube and Google domains, we can now convert the wide dataframes into xts and ts objects to facilitate further analysis. Here, we build xts objects for daily unique visitors and plot the series:
suppressPackageStartupMessages(library(xts))

# Google.com
goog_xts <- xts(googwide$visitors_unique, order.by = googwide$date)
plot(goog_xts)

# YouTube
yotb_xts <- xts(yotbwide$visitors_unique, order.by = yotbwide$date)
plot(yotb_xts)
In the code snippet that follows, we illustrate one approach to timeseries decomposition using the decompose function in base R. First, we create ts objects for Google and YouTube:
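# decimal_date() comes from the lubridate package
suppressPackageStartupMessages(library(lubridate))

goog_ts <- ts(goog_xts, start = decimal_date(min(index(goog_xts))), frequency = 365)
class(goog_ts)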
## [1] "ts"
yotb_ts <- ts(yotb_xts, start=decimal_date(min(index(yotb_xts))), frequency = 365)
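The start argument above expresses the first observation date as a fractional year. As a rough illustration of that idea, the sketch below computes a decimal year in base R with a hypothetical helper, decimal_year, which is not part of lubridate or AvenueAPI. It is assumed, not verified here, to agree with decimal_date for plain dates.

```r
# Hypothetical helper: year plus the elapsed fraction of that year.
decimal_year <- function(d) {
  d <- as.Date(d)
  yr <- as.integer(format(d, "%Y"))
  year_start <- as.Date(paste0(yr, "-01-01"))
  next_start <- as.Date(paste0(yr + 1L, "-01-01"))
  yr + as.numeric(d - year_start) / as.numeric(next_start - year_start)
}

decimal_year("2014-01-01")   # start of the year, so no fractional part
decimal_year("2014-07-02")   # 182 days into a 365-day year

# A ts object built this way treats every "year" as 365 observations,
# so leap days make the time axis drift slightly over long spans.
toy_ts <- ts(rnorm(730), start = decimal_year("2014-01-01"), frequency = 365)
```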
Having saved ts objects into our session for Google and YouTube, we decompose the series into component parts (trend, seasonal, and random) with a call to the decompose function of stats.
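To show what those components look like, the following sketch runs decompose on a small synthetic weekly-seasonal series rather than on the API data. The series, its frequency of 7, and all object names are assumptions chosen for illustration. It checks that, in the additive case, trend, seasonal, and random sum back to the original values wherever the trend is defined.

```r
# Synthetic series: linear trend + repeating weekly pattern + noise.
set.seed(42)
n_days <- 12 * 7
day <- seq_len(n_days)
weekly_pattern <- c(0, 1, 2, 1, 0, -2, -2)
values <- 50 + 0.1 * day + rep(weekly_pattern, times = 12) +
  rnorm(n_days, sd = 0.3)
toy_ts <- ts(values, frequency = 7)

# decompose() uses type = "additive" unless told otherwise.
toy_dec <- decompose(toy_ts)
names(toy_dec)

# The trend is a centered moving average, so it is NA at both ends.
ok <- !is.na(toy_dec$trend)

# Additive model: x = trend + seasonal + random.
recomposed <- toy_dec$trend + toy_dec$seasonal + toy_dec$random
all.equal(as.numeric(recomposed[ok]), as.numeric(toy_ts[ok]))
```

The figure element of the result holds one seasonal effect per position in the cycle (seven values here), centered so that the effects sum to zero.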
First, we plot the Google series using the default additive decomposition method, wrapping the response with its default plot method:
Similarly, we plot the YouTube series with the following command:
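A minimal sketch of such plotting calls follows. So that the block runs without API access, it builds a synthetic stand-in series, demo_ts, which is an assumption for illustration. With the objects created earlier in this guide, the equivalent calls would presumably be the ones shown in the trailing comments.

```r
# Synthetic stand-in: three years of daily values with a trend and a yearly cycle.
set.seed(7)
n <- 3 * 365
t_idx <- seq_len(n)
demo_ts <- ts(50 + 0.01 * t_idx + 5 * sin(2 * pi * t_idx / 365) + rnorm(n),
              start = 2014, frequency = 365)

# Additive decomposition (the default), then the plot method for the result,
# which draws the observed, trend, seasonal, and random panels.
demo_dec <- decompose(demo_ts)
plot(demo_dec)

# With the series built above in this guide (assumed object names):
# plot(decompose(goog_ts))
# plot(decompose(yotb_ts))
```

Note that decompose needs at least two full periods of data, so with frequency = 365 the series must cover at least 730 daily observations.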