devtools::install_github("bhelsel/agcounts")
Introduction
Neishabouri et al. released a preprint of “Quantification of Acceleration as Activity Counts in ActiGraph Wearables” on February 24, 2022, along with the Python code on GitHub. Like many others, I thought this package had the potential to be useful when analyzing accelerometer data. Not only does it make public the previously proprietary algorithm that ActiGraph uses to generate counts, it also allows the easy conversion of raw data to counts from any accelerometer file.
I started using Python again at the beginning of my postdoc at the University of Kansas Medical Center (KUMC). SAS and Stata were the languages I was taught during my PhD at Clemson University, but I lost access to the software and was looking for something that was open-source and free. I had colleagues in Bioinformatics at Clemson who used Python, and I spent about a year learning some Python basics. After advancing my knowledge of Python during my postdoc at KUMC, I decided to start learning R since I knew it was common among academics. The R programming language was very intuitive after using Python and I was able to pick it up quickly. I also discovered the power of creating functions and packages. Most of the functions and packages I’ve created haven’t been shared. However, I thought translating the agcounts package from Python to R would be a fun and useful project for me to learn more about package development and further advance my knowledge in both programming languages.
Dr. Kimberly Clevenger recently wrote a blog post comparing counts from the initial release of the agcounts R package to counts generated by ActiLife. She found that the results were similar, with differences mostly due to rounding, but suggested more testing with free-living data. She also modified the code to work with GT3X, CSV, and binary files and included her modification in the post for users to see. The second part of Dr. Clevenger’s post inspired one of the recent changes to the agcounts R package. We added the calculate_counts function to create a more flexible programming experience for users who may be working with different file types.
Purpose
The purpose of this post is to provide an overview of the calculate_counts and get_counts functions. These are the two exported functions within the agcounts package.
Install agcounts v0.3.0
Before we get started, make sure you have the latest version of the agcounts package installed.
We are creating package releases on GitHub, so you can always re-install the first version of the package if it works better for you.
devtools::install_github("bhelsel/agcounts", ref = "v0.1.0")
Load the agcounts package
library(agcounts)
Let’s also add the path name to the example data set that comes with the agcounts package. You can access it by typing the following into your R Script.
pathname <- system.file("extdata/example.gt3x", package = "agcounts")
This only adds the path name to the value pathname in your Global Environment. Now we can either pass the pathname value to the get_counts function, or we can read the data set into R first and then pass the raw data to the calculate_counts function.
get_counts
data <- get_counts(path = pathname, epoch = 60, write.file = FALSE, return.data = TRUE)
Python module "pygt3x" is not found. Switching parser to GGIR.
Loading chunk: 1
There is not enough data to perform the GGIR autocalibration method. Returning data as read by read.gt3x.
head(data, 5)
time Axis1 Axis2 Axis3 Vector.Magnitude
1 2023-06-13 08:34:00 2606 3116 3542 5389
2 2023-06-13 08:35:00 1738 3943 2840 5161
3 2023-06-13 08:36:00 2172 3363 2646 4799
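The Vector.Magnitude column in the output above appears to be the Euclidean norm of the three axis counts, rounded to the nearest whole count. The short check below re-enters the three printed rows by hand (it does not call agcounts) and recomputes the column. Treat it as a sanity check on the printed numbers rather than a description of the package internals.

```r
# Counts copied by hand from the printed get_counts output above.
counts <- data.frame(
  Axis1 = c(2606, 1738, 2172),
  Axis2 = c(3116, 3943, 3363),
  Axis3 = c(3542, 2840, 2646)
)

# Euclidean norm of the three axes, rounded to whole counts.
vm <- round(sqrt(counts$Axis1^2 + counts$Axis2^2 + counts$Axis3^2))
vm
# [1] 5389 5161 4799
```

The recomputed values match the Vector.Magnitude column for all three epochs.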
You will notice that the write.file argument is set to FALSE and the return.data argument is set to TRUE. These are the default values in agcounts v0.3.0. This way the program doesn’t make changes to your computer unless you change the write.file argument to TRUE. You will also notice that the frequency argument is missing. We added automated frequency detection, so it is one less argument that the user needs to supply to the get_counts function. This is done through the .get_frequency internal function, which can be viewed by typing View(agcounts:::.get_frequency). The one requirement for .get_frequency to autodetect the sample rate is that the data set have a time variable. This is not a problem with the get_counts function because we use read.gt3x::read.gt3x inside of get_counts, and that provides the formatting needed. If you are using calculate_counts, make sure that the variable names are set to time, X, Y, and Z.
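To give a feel for why a time variable is enough to recover the sample rate, here is a minimal sketch of one way it could be done. The estimate_frequency helper is hypothetical and written only for this illustration; it is not the code inside .get_frequency, which may work differently. It assumes time is a POSIXct vector and counts how many rows fall within each whole second.

```r
# Hypothetical helper for illustration; this is NOT agcounts' internal code.
# It estimates the sample rate (Hz) as the typical number of rows per second.
estimate_frequency <- function(time) {
  seconds <- floor(as.numeric(time))          # whole-second bucket for each sample
  per_second <- as.vector(table(seconds))     # number of samples in each bucket
  # The median is robust to a partial first or last second.
  as.integer(stats::median(per_second))
}

# Simulate 10 seconds of timestamps sampled at 100 Hz.
start <- as.POSIXct("2023-06-13 08:34:00", tz = "UTC")
time <- start + (0:999) / 100

estimate_frequency(time)
# [1] 100
```

Counting rows per second, rather than looking at the gap between neighboring timestamps, also works when the timestamps have been truncated to whole seconds.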
calculate_counts
calculate_counts is called within the get_counts function. However, if you want to use the calculate_counts function directly, you first need to import the data. We can import the sample data into R using the read.gt3x::read.gt3x function, but data can also be imported into R from a CSV, binary, or other file, as in Dr. Clevenger’s blog post.
library(read.gt3x)
raw <- read.gt3x(path = pathname, asDataFrame = TRUE, imputeZeroes = TRUE)
head(raw, 5)
Sampling Rate: 100Hz
Firmware Version: 1.7.2
Serial Number Prefix: TAS
time X Y Z
1 2023-06-13 08:34:00 -0.020 0.008 1.031
2 2023-06-13 08:34:00 0.000 0.004 1.035
3 2023-06-13 08:34:00 0.008 0.004 1.031
4 2023-06-13 08:34:00 0.008 0.004 1.035
5 2023-06-13 08:34:00 0.008 0.004 1.031
As you can see, the read.gt3x function already returns the data in the format that we need. If your variable names differ, rename them before using calculate_counts. This can be done within a piping workflow using the stats::setNames function. The columns can be in any order since calculate_counts refers to the columns by name, but you can use dplyr::relocate if you would like to change the order of the columns. Finally, we need start_time and stop_time attributes in our data frame. You can set the attributes on separate lines of code using attr(raw, "start_time") <- raw$time[1] and attr(raw, "stop_time") <- round(raw$time[nrow(raw)]). However, if you prefer the piping workflow, this can also be done using the code below.
library(dplyr)
raw %>%
  stats::setNames(c("time", "X", "Y", "Z")) %>%
  relocate(c("time", "X", "Y", "Z")) %>%
  `attr<-`(., "start_time", .$time[1]) %>%
  `attr<-`(., "stop_time", round(.$time[nrow(.)])) %>%
  calculate_counts(epoch = 60)
time Axis1 Axis2 Axis3 Vector.Magnitude
1 2023-06-13 08:34:00 2606 3116 3542 5389
2 2023-06-13 08:35:00 1738 3943 2840 5161
3 2023-06-13 08:36:00 2172 3363 2646 4799
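For data that starts out as a CSV rather than a GT3X file, the same preparation might look like the sketch below. The file layout is an assumption made for illustration: a header row with columns named timestamp, x, y, and z, which your own export may not match. The example writes a tiny stand-in file so that it runs on its own, and it stops short of calling calculate_counts because three rows are not a real recording.

```r
# Write a tiny stand-in CSV so the example is self-contained.
# The column names (timestamp, x, y, z) are hypothetical; adjust to your file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "timestamp,x,y,z",
  "2023-06-13 08:34:00.00,-0.020,0.008,1.031",
  "2023-06-13 08:34:00.01,0.000,0.004,1.035",
  "2023-06-13 08:34:00.02,0.008,0.004,1.031"
), csv_path)

raw_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# Rename the columns to the names described above: time, X, Y, Z.
raw_csv <- stats::setNames(raw_csv, c("time", "X", "Y", "Z"))

# Parse the timestamps; %OS keeps fractional seconds on input.
raw_csv$time <- as.POSIXct(raw_csv$time, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Attach the start_time and stop_time attributes described above.
attr(raw_csv, "start_time") <- raw_csv$time[1]
attr(raw_csv, "stop_time") <- round(raw_csv$time[nrow(raw_csv)])

names(raw_csv)

# With a full recording prepared this way, the call would be:
# counts <- agcounts::calculate_counts(raw_csv, epoch = 60)
```

Parsing the timestamp column into a date-time class before setting the attributes keeps start_time and stop_time in the same form that read.gt3x produces.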
Conclusion
It’s a fairly simple package, but I hope that you will find it useful within your workflow. If you have any problems, feel free to comment below or add an issue on the agcounts GitHub page.