Title: | Bayesian Modeling of Spatial Count Data |
---|---|
Description: | Provides a collection of functions for preparing data and fitting Bayesian count spatial regression models, with a specific focus on the Gamma-Count (GC) model. The GC model is well-suited for modeling dispersed count data, including under-dispersed or over-dispersed counts, or counts with equivalent dispersion, using Integrated Nested Laplace Approximations (INLA). The package includes functions for generating data from the GC model, as well as spatially correlated versions of the model. See Nadifar, Baghishani, Fallah (2023) <doi:10.1007/s13253-023-00550-5>. |
Authors: | Mahsa Nadifar [aut, cre] , Hossein Baghishani [aut] |
Maintainer: | Mahsa Nadifar <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-11-23 04:18:26 UTC |
Source: | https://github.com/mahsanst/spatgc |
Density, distribution function, quantile function and random
generation for the GC distribution with
parameters and
.
G(alpha, gamma) dGC(y, alpha, gamma) pGC(q, alpha, gamma, lower.tail = TRUE) qGC(p, alpha = 1, gamma) rGC(n = 1, alpha = 1, gamma = gamma, method = "PF")
G(alpha, gamma) dGC(y, alpha, gamma) pGC(q, alpha, gamma, lower.tail = TRUE) qGC(p, alpha = 1, gamma) rGC(n = 1, alpha = 1, gamma = gamma, method = "PF")
alpha |
the dispersion parameter of GC: default is 1; (shape parameter of waiting time variable); |
gamma |
the rate parameter of GC and waiting time variable; |
y |
a vector or matrix of observations for which the pdf needs to be computed. |
q |
a vector or matrix of quantiles for which the cdf needs to be computed. |
lower.tail |
logical; if TRUE (default), probabilities are
|
p |
a vector or matrix of probabilities for which the quantile needs to be computed. |
n |
number of random values to be generated. |
method |
Character string. The method used for generating random variables from the GC distribution in 'rGC'. Options are: - '"PF"': Based on the probability function. - '"IT"': Inverse transformation method based on the quantile function. - '"WT"': Based on the waiting times distribution. |
The GC distribution with parameters and
has the density
where
for and
which must be positive
values and
.
G
gives the G function,
pGC
gives the distribution function,
dGC
gives the density,
qGC
gives the quantile function and
rGC
generates random variables from the GC Distribution.
Winkelmann, R. (1995). Duration dependence and dispersion in count-data models. Journal of Business & Economic Statistics, 13(4):467-474. Nadifar, M., Baghishani, H., and Fallah, A. (2023). A flexible generalized poisson likelihood for spatial counts constructed by renewal theory, motivated by groundwater quality assessment. Journal of Agricultural, Biological, and Environmental Statistics, 28:726-748. Neutrosophic Sets and Systems, 22, 30-38.
# In a study, the number of disease incidence, we will calculate # the probability that the number of disease is zero with rate 1 dGC(0, alpha = 1, gamma = 1) # the probability that the disease will receive at least one pGC(q = 1, alpha = 1, gamma = 1, lower.tail = FALSE) # the probability that the disease will receive at most three pGC(q = 3, alpha = 1, gamma = 1, lower.tail = TRUE) # Calcaute the quantiles qGC(p = c(0.25, 0.5, 0.75), alpha = 1, gamma = 1) # Simulate 10 values from GC(1, 1) rGC(n = 10, alpha = 1, gamma = 1, method = "PF")
# In a study, the number of disease incidence, we will calculate # the probability that the number of disease is zero with rate 1 dGC(0, alpha = 1, gamma = 1) # the probability that the disease will receive at least one pGC(q = 1, alpha = 1, gamma = 1, lower.tail = FALSE) # the probability that the disease will receive at most three pGC(q = 3, alpha = 1, gamma = 1, lower.tail = TRUE) # Calcaute the quantiles qGC(p = c(0.25, 0.5, 0.75), alpha = 1, gamma = 1) # Simulate 10 values from GC(1, 1) rGC(n = 10, alpha = 1, gamma = 1, method = "PF")
This function fits an ICAR spatial Gamma-Count (GC) model to a given dataset using the INLA package. It constructs the formula based on the provided covariate data and ID variables, and fits the model using the specified adjacency matrix ('W') or a shapefile of the study region.
GClat(Y, ID, W = NULL, shapefile = NULL, covariate = NULL)
GClat(Y, ID, W = NULL, shapefile = NULL, covariate = NULL)
Y |
Vector of response variables (counts). |
ID |
Vector of indexes of regions (spatial units). |
W |
Optional adjacency matrix representing spatial connections between regions. If not provided, it can be generated from a shapefile using the 'shapefile' argument. |
shapefile |
Optional shapefile representing the study region. If provided, the adjacency matrix ('W') will be calculated from the shapefile. |
covariate |
Optional matrix of covariates. If not provided, the function assumes the model is intercept-only. |
An object of class "inla" representing the fitted ICAR spatial GC model. The object contains model estimates, diagnostics, and other results.
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 500, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial GC model ResultGC <- GClat(Y = Y, ID = ID, covariate = covariate, W = W) # Summary of the model fit summary(ResultGC)
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 500, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial GC model ResultGC <- GClat(Y = Y, ID = ID, covariate = covariate, W = W) # Summary of the model fit summary(ResultGC)
This function fits a negative binomial spatial model (including zero-inflated variations) to a given dataset using the INLA package. The function constructs the formula based on the provided covariate data and ID variables, and fits the model using the specified adjacency matrix ('W') or a shapefile of the study region.
NBlat( Y, ID, W = NULL, shapefile = NULL, covariate = NULL, family = c("nbinomial", "zeroinflatedbinomial0", "zeroinflatedbinomial1") )
NBlat( Y, ID, W = NULL, shapefile = NULL, covariate = NULL, family = c("nbinomial", "zeroinflatedbinomial0", "zeroinflatedbinomial1") )
Y |
Vector of response variables (counts). |
ID |
Vector of indexes of regions (spatial units). |
W |
Optional adjacency matrix representing spatial connections between regions. If not provided, it can be generated from a shapefile using the 'shapefile' argument. |
shapefile |
Optional shapefile representing the study region. If provided, the adjacency matrix ('W') will be calculated from the shapefile. |
covariate |
Optional matrix of covariates. If not provided, the function assumes the model is intercept-only. |
family |
The family of negative binomial models to use. Options are "nbinomial", "zeroinflatedbinomial0", and "zeroinflatedbinomial1". |
An object of class "inla" representing the fitted negative binomial spatial model. The object contains model estimates, diagnostics, and other results.
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, spatial = "lattice", V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial negative binomial model ResultNB <- NBlat(Y = Y, ID = ID, covariate = covariate, W = W, family = "nbinomial") # Summary of the model fit summary(ResultNB)
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, spatial = "lattice", V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial negative binomial model ResultNB <- NBlat(Y = Y, ID = ID, covariate = covariate, W = W, family = "nbinomial") # Summary of the model fit summary(ResultNB)
This function fits a Poisson spatial model (including zero-inflated and gamma Poisson variations) to a given dataset using the INLA package. It constructs the formula based on the provided covariate data and ID variables, and fits the model using the specified adjacency matrix ('W') or a shapefile of the study region.
Poislat( Y, ID, W = NULL, shapefile = NULL, covariate = NULL, family = c("gpoisson", "poisson", "zeroinflatedpoisson0", "zeroinflatedpoisson1") )
Poislat( Y, ID, W = NULL, shapefile = NULL, covariate = NULL, family = c("gpoisson", "poisson", "zeroinflatedpoisson0", "zeroinflatedpoisson1") )
Y |
Vector of response variables (counts). |
ID |
Vector of indexes of regions (spatial units). |
W |
Optional adjacency matrix representing spatial connections between regions. If not provided, it can be generated from a shapefile using the 'shapefile' argument. |
shapefile |
Optional shapefile representing the study region. If provided, the adjacency matrix ('W') will be calculated from the shapefile. |
covariate |
Optional matrix of covariates. If not provided, the function assumes the model is intercept-only. |
family |
The family of Poisson models to use. Options are "gpoisson", "poisson", "zeroinflatedpoisson0", and "zeroinflatedpoisson1". |
An object of class "inla" representing the fitted Poisson spatial model. The object contains model estimates, diagnostics, and other results.
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, spatial = "lattice", V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial Poisson model ResultPoisson <- Poislat(Y = Y, ID = ID, covariate = covariate, W = W, family = "poisson") # Summary of the model fit summary(ResultPoisson)
# Generate data from the GC spatial regression model with lattice spatial dependency W <- rAdj(500) # Generate a random adjacency matrix DDl <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), W = W, spatial = "lattice", V = 1) # Prepare the data Y <- DDl$y covariate <- DDl$covariate ID <- DDl$ID # Fit the spatial Poisson model ResultPoisson <- Poislat(Y = Y, ID = ID, covariate = covariate, W = W, family = "poisson") # Summary of the model fit summary(ResultPoisson)
This function generates a random adjacency matrix for a given number of regions.
The matrix is symmetric, and the upper triangular part (excluding the diagonal) is
randomly generated using the rbinom
function with a specified
probability. The lower triangular part is filled in by reflecting the upper triangular
part to ensure the matrix is symmetric.
rAdj(n, prob = 0.2)
rAdj(n, prob = 0.2)
n |
Integer. The number of regions, which determines the dimensions of the adjacency matrix. Must be a positive integer. |
prob |
Numeric. The probability of an edge (connection) between two regions, used as the probability for the binomial distribution. Default value is 0.2. Should be between 0 and 1 (inclusive). |
A symmetric numeric matrix of dimensions n x n
representing the
adjacency matrix. The values in the matrix are either 0 or 1, where 1 indicates
an edge (connection) between two regions.
# Generate a random adjacency matrix for 5 regions with a probability of 0.3 random_adj_matrix <- rAdj(5, prob = 0.3) print(random_adj_matrix) # Check if the matrix is symmetric all(random_adj_matrix == t(random_adj_matrix))
# Generate a random adjacency matrix for 5 regions with a probability of 0.3 random_adj_matrix <- rAdj(5, prob = 0.3) print(random_adj_matrix) # Check if the matrix is symmetric all(random_adj_matrix == t(random_adj_matrix))
This function generates spatially dependent count data based on the Gamma-Count (GC) spatial regression model. It uses a specified geospatial dependency model with parameters such as 'sigma' for variance and 'range' for spatial range. The function returns a list containing the generated data and relevant information about the simulation.
rGCgeo( n = n, alpha, beta0, beta, V = NULL, rho = 1, sigma = NULL, range = NULL )
rGCgeo( n = n, alpha, beta0, beta, V = NULL, rho = 1, sigma = NULL, range = NULL )
n |
Integer. The number of knots (or spatial units) for which the data should be generated. |
alpha |
Numeric. The dispersion parameter of the Gamma-Count model. |
beta0 |
Numeric. The intercept term for the model. |
beta |
Numeric vector. The regression coefficients (fixed effects) for the model. |
V |
Optional numeric. The variance of the spatial random effects for lattice data. |
rho |
Optional numeric. The spatial correlation coefficient for the CAR model. Default is 1. |
sigma |
Optional numeric. The variance of the spatial random effects for geospatial data with Matern covariance. |
range |
Optional numeric. The range parameter for geospatial data with Matern covariance. |
A list containing the following components:
A matrix of covariates with the specified number of knots ('n') and columns based on the length of 'beta'.
A vector of spatial random effects generated from the Matern covariance model.
A vector representing the linear predictor, calculated as the dot product of the covariates and coefficients plus the spatial effects ('phi').
A vector of mean response values calculated as the product of 'alpha' and the exponential of 'eta'.
A vector of simulated count data based on the GC model and the mean response values ('mu').
A vector of knot IDs from 1 to 'n'.
# Generate data from the GC spatial regression model with geospatial dependency data <- rGCgeo(n = 100, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), sigma = 1, range = 2) # View the generated data print(data)
# Generate data from the GC spatial regression model with geospatial dependency data <- rGCgeo(n = 100, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), sigma = 1, range = 2) # View the generated data print(data)
This function generates spatially dependent count data based on the Gamma-Count (GC) spatial regression model. It uses a specified spatial dependency model (either ICAR or CAR) and optional adjacency matrix or shapefile for spatial relationships. The function returns a list containing the generated data and relevant information about the simulation.
rGClat( n = n, alpha, beta0, beta, spatial = "ICAR", W = NULL, V = NULL, rho = 1, shapefile = NULL )
rGClat( n = n, alpha, beta0, beta, spatial = "ICAR", W = NULL, V = NULL, rho = 1, shapefile = NULL )
n |
Integer. The number of knots (or spatial units) for which the data should be generated. If a shapefile or adjacency matrix ('W') is provided, this will be determined from those inputs. |
alpha |
Numeric. The dispersion parameter of the Gamma-Count model. |
beta0 |
Numeric. The intercept term for the model. |
beta |
Numeric vector. The regression coefficients (fixed effects) for the model. |
spatial |
Character. Specifies the type of spatial dependency to use. Options are "ICAR" for Intrinsic Conditional Autoregressive, or "CAR" for Conditional Autoregressive. |
W |
Optional matrix. The adjacency matrix for lattice data. If provided, it will be used to define spatial relationships between knots. |
V |
Optional numeric. The variance of the spatial random effects for lattice data. |
rho |
Optional numeric. The spatial correlation coefficient for the CAR model. Default is 1. |
shapefile |
Optional. A shapefile defining the spatial relationships between knots. If provided, it will be used to define an adjacency matrix. |
A list containing the following components:
A matrix of covariates with the specified number of knots ('n') and columns based on the length of 'beta'.
A vector of spatial random effects, generated based on the specified spatial dependency model ('spatial').
A vector representing the linear predictor, calculated as the dot product of the covariates and coefficients plus the spatial effects ('phi').
A vector of simulated count data based on the GC model and the linear predictor ('eta').
A vector of knot IDs from 1 to 'n'.
# Generate a random adjacency matrix for a 429x429 grid W <- rAdj(429) # Generate data from the GC spatial regression model with the specified parameters data <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), spatial = "ICAR", W = W, V = 1) # View the generated data print(data)
# Generate a random adjacency matrix for a 429x429 grid W <- rAdj(429) # Generate data from the GC spatial regression model with the specified parameters data <- rGClat(n = 200, alpha = 1, beta0 = 0.3, beta = c(-0.5, 0.5), spatial = "ICAR", W = W, V = 1) # View the generated data print(data)
This function generates spatial random fields from Conditional Autoregressive (CAR) models for lattice spatial data. Given a neighborhood matrix, a variance parameter, and a spatial dependence parameter, the function produces a spatial random field.
spatCAR(W, sig, rho)
spatCAR(W, sig, rho)
W |
Numeric matrix. The neighborhood matrix representing the adjacency relationships between spatial units. It can be provided by using the function 'rAdj'. |
sig |
Numeric. The variance of the spatial random effects from the CAR model. Must be positive. |
rho |
Numeric. The spatial dependence parameter for the CAR model. Must be between -1 and 1, inclusive. |
The function starts by computing the diagonal matrix of the number of neighbors for each spatial unit. Then, it calculates the precision matrix (Q) based on the given parameters and neighborhood matrix. A small constant (0.0001) is added to the diagonal to ensure the precision matrix is non-singular. Finally, the covariance matrix is calculated as the inverse of the precision matrix multiplied by the variance parameter ('sig'). The function uses multivariate normal random generation (using 'rmvnorm' from the 'mvtnorm' package) to produce the spatial random field.
A numeric vector representing the spatial random field from the CAR model. The length of the vector is equal to the number of spatial units (rows in the neighborhood matrix).
# Generate a random adjacency matrix for 5 spatial regions with a probability of 0.2 W <- rAdj(n = 5, p = 0.2) # Generate a spatial random field from the CAR model using the adjacency matrix # with parameters variance = 0.1, and rho = 0.5 spatial_random_field <- spatCAR(W = W, sig = 0.1, rho = 0.5) print(spatial_random_field)
# Generate a random adjacency matrix for 5 spatial regions with a probability of 0.2 W <- rAdj(n = 5, p = 0.2) # Generate a spatial random field from the CAR model using the adjacency matrix # with parameters variance = 0.1, and rho = 0.5 spatial_random_field <- spatCAR(W = W, sig = 0.1, rho = 0.5) print(spatial_random_field)
This function generates spatial random fields from the Matern covariance model for geospatial data. Given a number of knots, a variance parameter, and a range parameter, the function produces a spatial random field from the Matern covariance model.
spatGEO(m, sigma, range)
spatGEO(m, sigma, range)
m |
Integer. The number of knots (or spatial units) for which the spatial random field should be generated. |
sigma |
Numeric. The variance of the spatial random effects with Matern covariance. Must be positive. |
range |
Numeric. The range of the spatial random effects with Matern covariance. Must be positive. |
The function starts by converting the variance ('sigma') and range ('range') parameters into the tau and kappa parameters required for the Matern covariance model. Then, it constructs a mesh for the given number of knots, defining the spatial layout of the data. The function uses INLA functions ('inla.spde2.matern' and others) to create an SPDE object and set the priors for the spatial model. It then projects the mesh and constructs a predictor for the model. Finally, the function calculates the precision matrix, samples from the distribution, and computes the spatial random field using the A matrix and sampled values.
A numeric vector representing the spatial random field from the Matern covariance model. The length of the vector is equal to the number of knots specified by the parameter 'm'.
# Generate a spatial random field from the Matern covariance model # using 100 knots, variance = 1, and range = 1 spatial_random_field <- spatGEO(m = 100, sigma = 1, range = 1) print(spatial_random_field)
# Generate a spatial random field from the Matern covariance model # using 100 knots, variance = 1, and range = 1 spatial_random_field <- spatGEO(m = 100, sigma = 1, range = 1) print(spatial_random_field)
This function generates spatial random fields from Intrinsic Conditional Autoregressive (ICAR) models for lattice spatial data. Given an adjacency matrix and a variance parameter, the function produces a spatial random field from the ICAR model.
spatICAR(W, sig = 1)
spatICAR(W, sig = 1)
W |
Numeric matrix. The adjacency matrix representing the adjacency relationships between spatial units. It can be provided by using the function 'generate_adjacency_matrix'. |
sig |
Numeric. The variance of the spatial random effects from the ICAR model. Must be positive. Default value is 1. |
The function starts by computing the degree of each spatial unit (number of neighbors) from the adjacency matrix. Then, it constructs the precision matrix (Q) based on the adjacency matrix. The function computes the eigenvalues and eigenvectors of Q. To generate the spatial random field, it uses a standard normal distribution to generate a vector of random numbers, scales the random numbers by the variance and the inverse of the eigenvalues (excluding the smallest eigenvalue), and then transforms the random numbers using the eigenvectors. The function returns a spatial random field vector.
A numeric vector representing the spatial random field from the ICAR model. The length of the vector is equal to the number of spatial units (columns in the adjacency matrix).
# Define an adjacency matrix for 5 spatial units W <- rAdj(n = 5, p = 0.2) # Generate spatial random field from ICAR model with variance = 1 spatial_random_field_icar <- spatICAR(W = W, sig = 1) print(spatial_random_field_icar)
# Define an adjacency matrix for 5 spatial units W <- rAdj(n = 5, p = 0.2) # Generate spatial random field from ICAR model with variance = 1 spatial_random_field_icar <- spatICAR(W = W, sig = 1) print(spatial_random_field_icar)