MCMC Sampler sampler for the Hidden Markov with Gamma-Poisson emission densities
Source:R/MCMC_poisson.R
hmm_mcmc_gamma_poisson.Rd
MCMC Sampler sampler for the Hidden Markov with Gamma-Poisson emission densities
Usage
hmm_mcmc_gamma_poisson(
data,
prior_T,
prior_betas,
prior_alpha = 1,
iter = 5000,
warmup = floor(iter/1.5),
thin = 1,
seed = sample.int(.Machine$integer.max, 1),
init_T = NULL,
init_betas = NULL,
init_alpha = NULL,
print_params = TRUE,
verbose = TRUE
)
Arguments
- data
(numeric) data
- prior_T
(matrix) prior transition matrix
- prior_betas
(numeric) prior beta parameters
- prior_alpha
(numeric) a single prior alpha parameter. By default,
prior_alpha=1
- iter
(integer) number of MCMC iterations
- warmup
(integer) number of warmup iterations
- thin
(integer) thinning parameter. By default,
1
- seed
(integer) seed parameter
- init_T
(matrix)
optional parameter
; initial transition matrix- init_betas
(numeric)
optional parameter
; initial beta parameters- init_alpha
(numeric)
optional parameter
; initial alpha parameter- print_params
(logical)
optional parameter
; print estimated parameters every iteration. By default,TRUE
- verbose
(logical)
optional parameter
; print additional messages. By default,TRUE
Value
List with following elements:
data
: data used for simulationsamples
: list with samplesestimates
: list with various estimatesidx
: indices with iterations after the warmup periodpriors
: prior parametersinits
: initial parameterslast_iter
: list with samples from the last MCMC iterationinfo
: list with various meta information about the object
Details
Please see supplementary information at doi:10.1186/s12859-024-05751-4 for more details on the algorithm.
For usage recommendations please see https://github.com/LynetteCaitlin/oHMMed/blob/main/UsageRecommendations.pdf.
References
Claus Vogl, Mariia Karapetiants, Burçin Yıldırım, Hrönn Kjartansdóttir, Carolin Kosiol, Juraj Bergman, Michal Majka, Lynette Caitlin Mikula. Inference of genomic landscapes using ordered Hidden Markov Models with emission densities (oHMMed). BMC Bioinformatics 25, 151 (2024). doi:10.1186/s12859-024-05751-4
Examples
# Simulate Poisson-Gamma data
N <- 2^10
true_T <- rbind(c(0.95, 0.05, 0),
c(0.025, 0.95, 0.025),
c(0.0, 0.05, 0.95))
true_betas <- c(2, 1, 0.1)
true_alpha <- 1
simdata_full <- hmm_simulate_gamma_poisson_data(L = N,
mat_T = true_T,
betas = true_betas,
alpha = true_alpha)
simdata <- simdata_full$data
hist(simdata, breaks = 40, probability = TRUE,
main = "Distribution of the simulated Poisson-Gamma data")
lines(density(simdata), col = "red")
# Set numbers of states to be inferred
n_states_inferred <- 3
# Set priors
prior_T <- generate_random_T(n_states_inferred)
prior_betas <- c(1, 0.5, 0.1)
prior_alpha <- 3
# Simmulation settings
iter <- 50
warmup <- floor(iter / 5) # 20 percent
thin <- 1
seed <- sample.int(10000, 1)
print_params <- FALSE # if TRUE then parameters are printed in each iteration
verbose <- FALSE # if TRUE then the state of the simulation is printed
# Run MCMC sampler
res <- hmm_mcmc_gamma_poisson(data = simdata,
prior_T = prior_T,
prior_betas = prior_betas,
prior_alpha = prior_alpha,
iter = iter,
warmup = warmup,
thin = thin,
seed = seed,
print_params = print_params,
verbose = verbose)
res
#> Model: HMM Gamma-Poisson
#> Type: MCMC
#> Iter: 50
#> Warmup: 10
#> Thin: 1
#> States: 3
summary(res)# summary output can be also assigned to a variable
#> Estimated betas:
#> beta[1] beta[2] beta[3]
#> 2.2484066 0.4223809 0.1024932
#>
#> Estimated alpha:
#> 1.942793
#>
#> Estimated means:
#> 0.8647647 4.867628 19.17357
#>
#> Estimated transition rates:
#> 1 2 3
#> 1 0.9759473 0.02405272 0.0000000
#> 2 0.1104292 0.59833147 0.2912393
#> 3 0.0000000 0.55822747 0.4417725
#>
#> Number of windows assigned to hidden states:
#> 1 2 3
#> 787 167 70
#>
#> Kullback-Leibler divergence between observed and estimated distributions:
#> 0.3744706
#>
#> Log Likelihood:
#> mean sd median
#> -1941.306431 2.308304 -1941.449830
#>
#> P-value of poisson test for difference between rates of states (stepwise):
#> 1-2 2-3
#> 3.252229e-227 0.000000e+00
#>
coef(res) # extract model estimates
#> $betas
#> beta[1] beta[2] beta[3]
#> 2.2484066 0.4223809 0.1024932
#>
#> $alpha
#> [1] 1.942793
#>
#> $means
#> means[1] means[2] means[3]
#> 0.8647647 4.8676278 19.1735652
#>
#> $mat_T
#> [,1] [,2] [,3]
#> [1,] 0.9759473 0.02405272 0.0000000
#> [2,] 0.1104292 0.59833147 0.2912393
#> [3,] 0.0000000 0.55822747 0.4417725
#>
# plot(res) # MCMC diagnostics