Skip to contents

The fableCount R package aims to offer counting time series models for users of the fable framework. These models work within the fable framework, which provides the tools to evaluate, visualize, and combine models in a workflow consistent with the tidyverse.

Installation

You can install the stable version from CRAN:

install.packages("fableCount")

You can install the development version from GitHub

# requires("devtools")
remotes::install_github("Gustavo039/fableCount")

Count Time Series

A count time series is a sequence of observations that record the number of events occurring at discrete time intervals. These events can be anything that can be counted, such as the number of daily sales, the number of calls received per hour, or the number of cases of a disease per week.

INGARCH and GLARMA usage

The package has 2 main functions.

  • INGARCH - (Integer Generalized Autoregressive Conditional Heteroskedasticity)

  • GLARMA - (Generalized Linear Autoregressive Moving Averages)

The usage of the model functions follows the fable and fabletools pattern

dataset |>
  fabletools::model(
    model_name1 = INGARCH(response_variable ~ pq(AR_oder, MA_order)),
    model_name2 = GLARMA(response_variable ~ pq(AR_oder, MA_order))
    )

If the pq() is ommited, the automatic parameter selection algorithm is triggered. Such algorithms are based on searching for the model that presents the lowest AIC or BIC

Example - Influeza in Germany

The following dataset was taken from the tscount package and gives the weekly number of reported influenza cases in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013.

(The cleaned tsibble object can be obtained via fableCount::influenza_rhine)

influenza_rhine |>
  autoplot() +
  labs(title = "Influenza Cases in Rhine-Westphalia, Germany",
       y="Number of Cases") +
  theme_minimal()

For models estimation, the automatic parameter selection method was used

model_influenza = influenza_rhine |>
  model(ing = INGARCH(cases),
        gla = GLARMA(cases, method = 'NR'))

The estimated models were:

  • INGARCH
model_influenza |> 
  select(ing) |>
  report()
#> Series: cases 
#> Model: INGARCH(2, 0) 
#> 
#> poisson INGARCH(2, 0) w/ identity link
#> # A tibble: 2 × 4
#>   statistic `(Intercept)`  beta_1   beta_2
#>   <chr>             <dbl>   <dbl>    <dbl>
#> 1 Estimate         0.202  0.986   1.04e-10
#> 2 Std.Error        0.0245 0.00864 7.29e- 3
#> 
#> log likelihood=-10521.42
#> AIC=21048.83
#> BIC=21062.24
#> QIC=21049.86
  • GLARMA
model_influenza |> 
  select(gla) |>
  report()
#> Series: cases 
#> Model: GLARMA(1, 0) 
#> 
#> Poisson GLARMA(1, 0)
#> # A tibble: 2 × 3
#>   statistic intercept      ar_1
#>   <chr>         <dbl>     <dbl>
#> 1 estimate    3.92    0.0154   
#> 2 std_error   0.00608 0.0000291
#> 
#> log likelihood=-59377
#> AIC=118758

With the models already estimated, it is possible to draw a prediction interval

  • INGARCH forecast
model_influenza |> 
  dplyr::select(ing) |>
  forecast(h = 5) |>
  autoplot(influenza_rhine |>
             dplyr::filter(year_week > tsibble::make_yearweek(2013, 5) )
           )

  • GLARMA forecast
model_influenza |> 
  dplyr::select(gla) |>
  forecast(h = 5) |>
  autoplot(influenza_rhine |>
             dplyr::filter(year_week > tsibble::make_yearweek(2013, 5) )
           )

Learning to forecast with fable

  • The forecasting principles and practices online textbook provides an introduction to time series forecasting using fable: https://otexts.com/fpp3/