Some practical tips on inlabru
inlabru: the general workflow
Component features

The basic syntax for components is:

- my_component_name: a user-chosen label for the model component. This label is then used as input in predict() and generate().
- main: here we define the input data for the component.
- model: the type of model component; see ?component and INLA::inla.list.models()$latent.

Examples
- beta_0(1) or Intercept(1) defines an intercept.
- cov_effect(altitude, model = "linear") defines a linear effect of altitude on the predictor.
- time_effect(time, model = "rw2") defines a smooth effect of time on the predictor.

inlabru shortcuts

In the slides we have fitted the model as:

but in the practical you have
The formula can also be defined inside the bru_obs() function. y ~ . just means "take all the components and sum them together"; it also tells the bru() function that your predictor is linear. inlabru provides a dedicated term, Intercept(), for specifying an intercept that is shared across all data.
When this term is omitted, a common intercept is added by default, unless the formula or component explicitly includes -1.
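A minimal sketch of this workflow, on a hypothetical data frame dat with columns y, altitude and time (all component labels here are user-chosen, not fixed inlabru names):

```r
library(inlabru)

# hypothetical data: response y, covariates altitude and time
dat <- data.frame(y = rnorm(100), altitude = runif(100), time = 1:100)

# components: user-chosen labels on the left, input and model type inside
cmp <- ~ Intercept(1) +
  cov_effect(altitude, model = "linear") +
  time_effect(time, model = "rw2")

# y ~ . sums all components; the observation model is defined in bru_obs()
fit <- bru(cmp, bru_obs(y ~ ., family = "gaussian", data = dat))
summary(fit)
```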
mean sd
beta0 0.445 22.361
beta1 0.736 0.326
Intercept 0.445 22.361
mean sd
Intercept 0.890 0.174
beta1 0.736 0.326
mean sd
beta0 0.890 0.174
beta1 0.736 0.326
For categorical covariates, inlabru has a specific component type called fixed.
Let’s look at the mtcars dataset that ships with R.
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
We want to fit a model where gear is the only covariate.
gear has three categories: 3, 4 and 5.

(Intercept)       gear4       gear5
  16.106667    8.426667    5.273333
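The coefficients above come from an ordinary lm() fit along these lines (a sketch; gear must first be converted to a factor):

```r
# treat gear as a categorical covariate
mtcars$gear <- factor(mtcars$gear)

# gear3 is absorbed into the intercept; gear4 and gear5 are contrasts
coef(lm(mpg ~ gear, data = mtcars))
```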
How do we do this in inlabru?
Option 1: use the fixed model
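A sketch of this option (the component label gear0 follows the slides; gear is assumed to be a factor):

```r
library(inlabru)
mtcars$gear <- factor(mtcars$gear)

# the "fixed" component takes an lm-style formula as input
fit <- bru(mpg ~ gear0(~ gear, model = "fixed"),
           family = "gaussian", data = mtcars)

# the estimates appear in summary.random, not in summary.fixed
fit$summary.random$gear0[, c("ID", "mean", "sd")]
```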
ID mean sd
1 (Intercept) 16.103112 1.205027
2 gear4 8.414875 1.807342
3 gear5 5.253896 2.407237
NOTE
gear0(~ gear, model = "fixed") is an lm-style model definition, so an intercept is automatically included. The estimates end up in the summary.random part of the result, not in summary.fixed.

[1] 16.04855 24.42278 21.15033

What is happening here?
Alternative (and equivalent) coding options
NOTE
When using the fixed model, inlabru internally creates a model matrix using the function MatrixModels::model.Matrix(). The syntax
is equivalent to
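To see what the component sees internally, you can build the model matrix yourself (a sketch; gear is assumed to be a factor):

```r
# the "fixed" component internally expands the lm-style formula into this matrix
MatrixModels::model.Matrix(~ gear,
                           data = transform(mtcars, gear = factor(gear)))
```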
Here the easiest option is to use the fixed model type again.
Example 1
Say we want to fit a model with interactions between (categorical) gear and (binary) vs.
In lm we can do
Similarly in inlabru
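A sketch of both fits (the inlabru component label inter is hypothetical):

```r
mtcars$gear <- factor(mtcars$gear)

# lm version of the interaction model
coef(lm(mpg ~ gear * vs, data = mtcars))

# inlabru version: pass the same interaction formula to a "fixed" component
fit <- bru(mpg ~ inter(~ gear * vs, model = "fixed"),
           family = "gaussian", data = mtcars)
fit$summary.random$inter$mean  # posterior means
fit$summary.random$inter$sd    # posterior standard deviations
```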
[1] 15.0434624 5.8986288 4.0890388 5.2876289 -0.9880638 5.8809905
[1] 1.179235 3.100019 2.353196 2.618198 4.074736 5.205994
Example 2
Another example, this time interaction between (categorical) gear and (continuous) disp.
Using lm:
(Intercept) gear4 gear5 disp gear4:disp gear5:disp
24.51556635 15.05196338 7.14538044 -0.02577046 -0.09644222 -0.02500467
We can also use the idea that “fixed effects are just random effects with fixed precision” 😄
NOTE: This idea can be very useful in more complex contexts where the fixed model does not work (for example if one wants to have spatially varying effect of a covariate!)
Let’s look at the interaction between (categorical) gear and (continuous) disp.
\[ \begin{aligned} \eta_i & = \beta_0 + \beta_{0,\text{GEAR}_i} + \beta_1x_i + \beta_{1,\text{GEAR}_i} x_i\\ & = \underbrace{\beta^*_{0,\text{GEAR}_i}}_{\text{iid}} +\underbrace{\beta^*_{1,\text{GEAR}_i}}_{\text{iid}}\ \underbrace{x_i}_{\text{weights}} \end{aligned} \]
The model for the linear predictor is:
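A sketch of this coding (the component labels b0 and b1 are hypothetical; fixing the log-precision at -10 makes each iid effect essentially an unpenalised fixed effect):

```r
mtcars$gear <- factor(mtcars$gear)

# fix the precision at exp(-10): effectively a vague Gaussian prior
prec_fixed <- list(prec = list(initial = -10, fixed = TRUE))

fit <- bru(
  mpg ~ -1 +
    b0(gear, model = "iid", hyper = prec_fixed) +                 # group intercepts
    b1(gear, weights = disp, model = "iid", hyper = prec_fixed),  # group slopes
  family = "gaussian", data = mtcars
)

fit$summary.random$b0$mean  # gear-specific intercepts
fit$summary.random$b1$mean  # gear-specific slopes
```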
[1] 24.15664 38.93821 31.16917
[1] -0.02475087 -0.11752667 -0.04884794
The result object

The fitted model is stored in an inlabru object
inlabru version: 2.13.0.9033
INLA version: 26.02.06
Latent components:
Intercept: main = linear(1)
sex: main = iid(sex)
time: main = linear(yrs.since.phd)
Observation models:
Model tag: <No tag>
Family: 'gaussian'
Data class: 'data.frame'
Response class: 'numeric'
Predictor: salary ~ Intercept + sex + time
Additive/Linear/Rowwise: TRUE/TRUE/TRUE
Used components: effect[Intercept, sex, time], latent[]
Time used:
Pre = 0.492, Running = 0.19, Post = 0.014, Total = 0.696
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
Intercept 73.634 13.214 47.722 73.634 99.546 73.634 0
time 0.976 0.108 0.764 0.976 1.188 0.976 0
Random effects:
Name Model
sex IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for the Gaussian observations 0.001 0.00 0.001 0.001
0.975quant mode
Precision for the Gaussian observations 0.002 0.001
Marginal log-Likelihood: -1909.39
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Some results are very easy to get as they are contained in the result object:
mean sd 0.025quant 0.5quant 0.975quant mode
Intercept 73.6338668 13.2142639 47.7218377 73.6339188 99.545600 73.6339190
time 0.9758411 0.1079614 0.7641549 0.9757971 1.187779 0.9757972
kld
Intercept 5.468309e-11
time 3.966806e-10
x y
[1,] 17.31393 3.445719e-06
[2,] 24.47190 2.993035e-05
[3,] 32.77900 2.546511e-04
The predict() and generate() functions

Often one wants more “complex” results from the model.
For example, suppose we fit to the mtcars data the model
\[ \eta_i = \beta_{0,\text{GEAR}_i} + \beta_{1,\text{GEAR}_i}x_i,\ i =1,\dots,n \]
We might want to recover the three regression lines
\[ \beta_{0,\text{GEAR}_i} + \beta_{1,\text{GEAR}_i}x_i ,\ \text{GEAR} = 3,4,5 \]
To do this we can use the predict() and generate() functions.
predict() simulates from the posterior distribution of the fitted model, \(\pi(\mathbf{u},\boldsymbol{\theta}|\mathbf{y})\), computes what you ask for, and produces a summary of the samples (mean, sd, quantiles, etc.)

generate() simulates from the posterior distribution of the fitted model, \(\pi(\mathbf{u},\boldsymbol{\theta}|\mathbf{y})\), computes what you ask for, and returns the samples!
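A sketch of how these are called, assuming the gear/disp model is stored in fit with hypothetical component labels b0 (intercepts) and b1 (slopes):

```r
# prediction grid: every gear level at a range of displacement values
newdat <- expand.grid(gear = factor(3:5), disp = 71:472)

# predict(): posterior summaries of any expression in the components
pred <- predict(fit, newdata = newdat, formula = ~ b0 + b1)

# generate(): the raw posterior samples of the same expression
samp <- generate(fit, newdata = newdat, formula = ~ b0 + b1, n.samples = 100)
```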
gear disp mean sd q0.025 q0.5 q0.975 median
1 3 71 22.39604 2.034613 18.99907 22.54192 26.31040 22.54192
2 3 72 22.37112 2.027809 18.98687 22.51678 26.27229 22.51678
3 3 73 22.34619 2.021007 18.97468 22.49164 26.23418 22.49164
4 3 74 22.32127 2.014208 18.96248 22.46650 26.19608 22.46650
5 3 75 22.29635 2.007411 18.95029 22.44135 26.15797 22.44135
mean.mc_std_err sd.mc_std_err
1 0.2369176 0.1672814
2 0.2361205 0.1666981
3 0.2353237 0.1661147
4 0.2345271 0.1655314
5 0.2337307 0.1649481
gear disp mean sd q0.025 q0.5 q0.975 median
1 3 71 22.20469 2.082109 17.86798 22.45977 25.67829 22.45977
2 3 72 22.18086 2.074446 17.85878 22.43372 25.64251 22.43372
3 3 73 22.15703 2.066786 17.84957 22.40766 25.60673 22.40766
4 3 74 22.13320 2.059129 17.84037 22.38161 25.57094 22.38161
5 3 75 22.10936 2.051474 17.83117 22.35555 25.53516 22.35555
mean.mc_std_err sd.mc_std_err
1 0.2387793 0.1528418
2 0.2379132 0.1523430
3 0.2370475 0.1518446
4 0.2361822 0.1513464
5 0.2353171 0.1508486
Let’s fit the model for the Tokyo rainfall data \[ \begin{aligned} y_t|\eta_t&\sim\text{Bin}(n_t, p_t),\qquad t = 1,\dots,366\\ \eta_t &= \text{logit}(p_t)= \beta_0 + f(\text{time}_t) \end{aligned} \]
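A sketch of fitting this model (the Tokyo data ships with INLA; the component label time_eff and the cyclic rw2 choice are assumptions):

```r
library(inlabru)
data(Tokyo, package = "INLA")  # columns: y (rainy days), n (trials), time (day)

# cyclic rw2 so that day 366 joins smoothly back to day 1
cmp <- ~ Intercept(1) + time_eff(time, model = "rw2", cyclic = TRUE)

fit <- bru(cmp,
           bru_obs(y ~ Intercept + time_eff,
                   family = "binomial", Ntrials = n, data = Tokyo))
```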
We now want to extract results…what do we want?
We can get all of them with the predict() or generate() functions!
predict()
y n time mean sd q0.025 q0.5 q0.975 median mean.mc_std_err sd.mc_std_err
1 0 2 1 0.177 0.084 0.059 0.159 0.377 0.159 0.003 0.003
2 0 2 2 0.174 0.079 0.060 0.158 0.358 0.158 0.003 0.002
3 1 2 3 0.172 0.075 0.063 0.157 0.343 0.157 0.003 0.002
generate()

   time_eff    lin_pred      probs
1 0.145248856 -0.9073913 0.2875340
2 -0.008300456 -1.0609406 0.2571298
3 -0.126704192 -1.1793443 0.2351701
generate(): the _latent suffix

Sometimes one wants to sample directly from the latent parameters without specifying any new input data (in other words, to sample with an identity projection matrix). The keyword _latent can be appended to a latent component name to achieve this.
For example, the following code recovers the intercept and the time effect at day 13
NOTE: here newdata is just an empty object.
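A sketch, assuming a fitted Tokyo model fit with components Intercept and time_eff:

```r
# sample the latent states directly: no projection onto new data
smp <- generate(fit, newdata = NULL,
                formula = ~ c(Intercept_latent,   # the intercept
                              time_eff_latent[13]),  # time effect at day 13
                n.samples = 1000)
```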
We can be interested in functions of the posterior marginals that are computed by the bru() function.
For example, we might want to:
All this can be done with the help of the inla.*marginal() family of functions
Obtain the posterior marginal for the standard deviation of the random effect, compute its mean and median, and sample from it.
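A sketch of this exercise (the hyperparameter name depends on your model; "Precision for time_eff" is a hypothetical example, assuming a fitted object fit):

```r
library(INLA)

# posterior marginal of the random-effect precision (name is model-specific)
marg_prec <- fit$marginals.hyperpar[["Precision for time_eff"]]

# transform precision -> standard deviation
marg_sd <- inla.tmarginal(function(x) 1 / sqrt(x), marg_prec)

inla.emarginal(function(x) x, marg_sd)  # posterior mean
inla.qmarginal(0.5, marg_sd)            # posterior median
inla.rmarginal(1000, marg_sd)           # 1000 samples
```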
The inla.*marginal() family

| Function Name | Usage |
|---|---|
| inla.dmarginal(x, marginal, ...) | Density at a vector of evaluation points x |
| inla.pmarginal(q, marginal, ...) | Distribution function at a vector of quantiles q |
| inla.qmarginal(p, marginal, ...) | Quantile function at a vector of probabilities p |
| inla.rmarginal(n, marginal) | Generate n random deviates |
| inla.hpdmarginal(p, marginal, ...) | Compute the highest posterior density interval at level p |
| inla.emarginal(fun, marginal, ...) | Compute the expected value assuming the transformation given by fun |
| inla.mmarginal(marginal) | Compute the mode |
| inla.smarginal(marginal, ...) | Smoothed density (returns x-values and interpolated y-values) |
| inla.tmarginal(fun, marginal, ...) | Transform the marginal using function fun |
| inla.zmarginal(marginal) | Summary statistics for the marginal |
NAs in inlabru

The bru() function treats missing values in the dataset differently according to their role in the model.
Take the model definition:
NA’s in the response y
If y[i] = NA, this means that y[i] is not observed, hence gives no contribution to the likelihood.
NA’s in fixed effect x
If x[i] = NA this means that x[i] is not part of the linear predictor for y[i]. For fixed effects, this is equivalent to x[i]=0, hence internally we make this change: x[is.na(x)] = 0
NA’s in random effect z
If z[i] = NA, this means that the random effect does not contribute to the linear predictor for y[i].
Can inlabru deal with missing covariates? No: inlabru has no way to impute or integrate out missing covariates.
You have to adjust your model to account for missing covariates.
Sometimes, you can formulate a joint model for the data and the covariates, but this is case-specific.
NAs with model = "fixed"

When we use model = "fixed", missing values can give an error:
Error in `iinla()`:
! The total number of response values (N=5) and predictor values (N=4) do not match.
This is likely due to a mistake in the component or predictor constructions.
The problem here is related to the model.matrix() function that is run internally and that, by default, removes the missing values.
This makes the length of the response (5) different from the length of the covariate (4).
One solution is to force model.matrix() not to remove the missing values
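One way to do this is via the global na.action option, which the model-frame machinery respects (hence the warning below about it being global):

```r
# keep NA rows when the model matrix is built internally
old <- options(na.action = "na.pass")

# ... fit the model with the "fixed" component here ...

options(old)  # restore the previous setting afterwards
```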
ID mean sd 0.025quant
1 (Intercept) 0.5621983 0.7106791 -0.8834028
2 xb -0.7319059 1.1232657 -3.0150691
WARNING: this is a global option; it might interfere with other parts of your code!
You might wonder which priors you have used in your model.
NB: inla.priors.used() is a function from the INLA package, so you need to load INLA (or use the INLA:: prefix) explicitly.

The inla.doc() function provides help on the different priors, latent models and likelihoods.
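For example (assuming a fitted object fit):

```r
library(INLA)

inla.priors.used(fit)  # lists the priors used in this fitted model

inla.doc("pc.prec")    # documentation for the PC precision prior
```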
Likelihoods
Section [likelihood]
0binomial New 0-inflated Binomial
0binomialS New 0-inflated Binomial Swap
0poisson New 0-inflated Poisson
0poissonS New 0-inflated Poisson Swap
1poisson New 1-inflated Poisson
1poissonS New 1-inflated Poisson Swap
agaussian The aggregated Gaussian likelihoood
bcgaussian The Box-Cox Gaussian likelihoood
bell The Bell likelihood
beta The Beta likelihood
betabinomial The Beta-Binomial likelihood
betabinomialna The Beta-Binomial Normal approximation likelihood
bgev The blended Generalized Extreme Value likelihood
binomial The Binomial likelihood
binomialmix Binomial mixture
cbinomial The clustered Binomial likelihood
cennbinomial2 The CenNegBinomial2 likelihood (similar to cenpoisson2)
cenpoisson Then censored Poisson likelihood
cenpoisson2 Then censored Poisson likelihood (version 2)
circularnormal The circular Gaussian likelihoood
cloglike User-defined likelihood
coxph Cox-proportional hazard likelihood
dgompertzsurv destructive gompertz (survival) distribution
dgp Discrete generalized Pareto likelihood
egp Exteneded Generalized Pareto likelihood
exponential The Exponential likelihood
exponentialsurv The Exponential likelihood (survival)
exppower The exponential power likelihoood
fl The fl likelihood
fmri fmri distribution (special nc-chi)
fmrisurv fmri distribution (special nc-chi)
gamma The Gamma likelihood
gammacount A Gamma generalisation of the Poisson likelihood
gammajw A special case of the Gamma likelihood
gammajwsurv A special case of the Gamma likelihood (survival)
gammasurv The Gamma likelihood (survival)
gammasv The Gamma likelihood with constant rate
gaussian The Gaussian likelihoood
gaussianjw The GaussianJW likelihoood
gev The Generalized Extreme Value likelihood
ggaussian Generalized Gaussian
ggaussianS Generalized GaussianS
gompertz gompertz distribution
gompertzsurv gompertz distribution
gp Generalized Pareto likelihood
gpoisson The generalized Poisson likelihood
iidgamma (experimental)
iidlogitbeta (experimental)
lavm Link adjusted von Mises circular distribution
loggammafrailty (experimental)
logistic The Logistic likelihoood
loglogistic The loglogistic likelihood
loglogisticsurv The loglogistic likelihood (survival)
lognormal The log-Normal likelihood
lognormalsurv The log-Normal likelihood (survival)
logperiodogram Likelihood for the log-periodogram
mgamma The modal Gamma likelihood
mgammasurv The modal Gamma likelihood (survival)
nbinomial The negBinomial likelihood
nbinomial2 The negBinomial2 likelihood
nmix Binomial-Poisson mixture
nmixnb NegBinomial-Poisson mixture
npoisson The Normal approximation to the Poisson likelihood
nvm Normal approx of the von Mises circular distribution
nzpoisson The nzPoisson likelihood
obeta The ordered Beta likelihood
occupancy Occupancy likelihood
poisson The Poisson likelihood
poisson.special1 The Poisson.special1 likelihood
pom Likelihood for the proportional odds model
qkumar A quantile version of the Kumar likelihood
qloglogistic A quantile loglogistic likelihood
qloglogisticsurv A quantile loglogistic likelihood (survival)
rcpoisson Randomly censored Poisson
sem The SEM likelihoood
simplex The simplex likelihood
sn The Skew-Normal likelihoood
stdgaussian The stdGaussian likelihoood
stochvol The Gaussian stochvol likelihood
stochvolln The Log-Normal stochvol likelihood
stochvolnig The Normal inverse Gaussian stochvol likelihood
stochvolsn The SkewNormal stochvol likelihood
stochvolt The Student-t stochvol likelihood
t Student-t likelihood
tpoisson Thinned Poisson
tstrata A stratified version of the Student-t likelihood
tweedie Tweedie distribution
vm von Mises circular distribution
weibull The Weibull likelihood
weibullsurv The Weibull likelihood (survival)
wrappedcauchy The wrapped Cauchy likelihoood
xbinomial The Binomial likelihood (experimental version)
xpoisson The Poisson likelihood (expert version)
zeroinflatedbetabinomial0 Zero-inflated Beta-Binomial, type 0
zeroinflatedbetabinomial1 Zero-inflated Beta-Binomial, type 1
zeroinflatedbetabinomial2 Zero inflated Beta-Binomial, type 2
zeroinflatedbinomial0 Zero-inflated Binomial, type 0
zeroinflatedbinomial1 Zero-inflated Binomial, type 1
zeroinflatedbinomial2 Zero-inflated Binomial, type 2
zeroinflatedcenpoisson0 Zero-inflated censored Poisson, type 0
zeroinflatedcenpoisson1 Zero-inflated censored Poisson, type 1
zeroinflatednbinomial0 Zero inflated negBinomial, type 0
zeroinflatednbinomial1 Zero inflated negBinomial, type 1
zeroinflatednbinomial1strata2 Zero inflated negBinomial, type 1, strata 2
zeroinflatednbinomial1strata3 Zero inflated negBinomial, type 1, strata 3
zeroinflatednbinomial2 Zero inflated negBinomial, type 2
zeroinflatedpoisson0 Zero-inflated Poisson, type 0
zeroinflatedpoisson1 Zero-inflated Poisson, type 1
zeroinflatedpoisson2 Zero-inflated Poisson, type 2
zeroninflatedbinomial2 Zero and N inflated binomial, type 2
zeroninflatedbinomial3 Zero and N inflated binomial, type 3
Components
Section [latent]
2diid (This model is obsolute)
ar Auto-regressive model of order p (AR(p))
ar1 Auto-regressive model of order 1 (AR(1))
ar1c Auto-regressive model of order 1 w/covariates
besag The Besag area model (CAR-model)
besag2 The shared Besag model
besagproper A proper version of the Besag model
besagproper2 An alternative proper version of the Besag model
bym The BYM-model (Besag-York-Mollier model)
bym2 The BYM-model with the PC priors
cgeneric Generic latent model specified using C
clinear Constrained linear effect
copy Create a copy of a model component
crw2 Exact solution to the random walk of order 2
dmatern Dense Matern field
fgn Fractional Gaussian noise model
fgn2 Fractional Gaussian noise model (alt 2)
generic A generic model
generic0 A generic model (type 0)
generic1 A generic model (type 1)
generic2 A generic model (type 2)
generic3 A generic model (type 3)
iid Gaussian random effects in dim=1
iid1d Gaussian random effect in dim=1 with Wishart prior
iid2d Gaussian random effect in dim=2 with Wishart prior
iid3d Gaussian random effect in dim=3 with Wishart prior
iid4d Gaussian random effect in dim=4 with Wishart prior
iid5d Gaussian random effect in dim=5 with Wishart prior
iidkd Gaussian random effect in dim=k with Wishart prior
intslope Intecept-slope model with Wishart-prior
linear Alternative interface to an fixed effect
log1exp A nonlinear model of a covariate
logdist A nonlinear model of a covariate
matern2d Matern covariance function on a regular grid
meb Berkson measurement error model
mec Classical measurement error model
ou The Ornstein-Uhlenbeck process
prw2 Proper random walk of order 2
revsigm Reverse sigmoidal effect of a covariate
rgeneric Generic latent model specified using R
rw1 Random walk of order 1
rw2 Random walk of order 2
rw2d Thin-plate spline model
rw2diid Thin-plate spline with iid noise
scopy Create a scopy of a model component
seasonal Seasonal model for time series
sigm Sigmoidal effect of a covariate
slm Spatial lag model
spde A SPDE model
spde2 A SPDE2 model
spde3 A SPDE3 model
z The z-model in a classical mixed model formulation
Priors
Section [prior]
betacorrelation Beta prior for the correlation
dirichlet Dirichlet prior
expression: A generic prior defined using expressions
flat A constant prior
gamma Gamma prior
gaussian Gaussian prior
invalid Void prior
jeffreystdf Jeffreys prior for the doc
laplace Laplace prior
linksnintercept Skew-normal-link intercept-prior
logflat A constant prior for log(theta)
loggamma Log-Gamma prior
logiflat A constant prior for log(1/theta)
logitbeta Logit prior for a probability
logtgaussian Truncated Gaussian prior
logtnormal Truncated Normal prior
minuslogsqrtruncnormal (obsolete)
mvnorm A multivariate Normal prior
none No prior
normal Normal prior
pc Generic PC prior
pc.alphaw PC prior for alpha in Weibull
pc.ar PC prior for the AR(p) model
pc.cor0 PC prior correlation, basemodel cor=0
pc.cor1 PC prior correlation, basemodel cor=1
pc.dof PC prior for log(dof-2)
pc.egptail PC prior for the tail in the EGP likelihood
pc.fgnh PC prior for the Hurst parameter in FGN
pc.gamma PC prior for a Gamma parameter
pc.gammacount PC prior for the GammaCount likelihood
pc.gevtail PC prior for the tail in the GEV likelihood
pc.matern PC prior for the Matern SPDE
pc.mgamma PC prior for a Gamma parameter
pc.prec PC prior for log(precision)
pc.prw2.range PCprior for the range in PRW2
pc.range PC prior for the range in the Matern SPDE
pc.sn PC prior for the skew-normal
pc.spde.GA (experimental)
pom #classes-dependent prior for the POM model
ref.ar Reference prior for the AR(p) model, p<=3
rprior: A R-function defining the prior
table: A generic tabulated prior
wishart1d Wishart prior dim=1
wishart2d Wishart prior dim=2
wishart3d Wishart prior dim=3
wishart4d Wishart prior dim=4
wishart5d Wishart prior dim=5
wishartkd Wishart prior
- Linear models with interactions: inlabru has a special model (fixed) to implement this.
- The predict() and generate() functions are powerful tools (based on sampling from the posterior distribution) to obtain many interesting results!
- Missing values are usually not a problem, but in some cases they have to be treated carefully.