Sunday, November 2, 2014

Tuning Laplaces Demon IV

This is the last post of testing Laplaces Demon algorithms. In the last algorithms there are some which are completely skipped because they are not suitable for the problem. Reversible Jump is for variable selection. Sequential Metropolis-within-Gibbs, Updating Sequential Metropolis-within-Gibbs and their adaptive versions are for state space models. Stochastic Gradient Langevin Dynamics is for big data. Having these algorithms does show that Laplaces Demon is a versatile package.

Refractive

This algorithm has only one step size parameter. Since the two model parameters have such a different scale, this effects the parameters to have totally different behavior. One parameter, beta1, has loads of space, wanders around. The other parameter, beta2, has a clear location, with spikes sticking out both p and down. Since the two parameters are correlated, there is still some additional drift in beta2. In terms of beta1, I can agree that the current sampling should have a higher thinning and hence a longer run.  
 

Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Algorithm = "Refractive", Specs = list(Adaptive = 1, m = 3,
        w = 0.001, r = 1.3))

Acceptance Rate: 0.6534
Algorithm: Refractive Sampler
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
     beta[1]      beta[2]
0.4844556519 0.0005970499

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
         All Stationary
Dbar  47.838     46.571
pD   134.976     21.179
DIC  182.814     67.750
Initial Values:
[1] -10   0

Iterations: 10000
Log(Marginal Likelihood): NA
Minutes of run-time: 0.14
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 800
Recommended Burn-In of Un-thinned Samples: 8000
Recommended Thinning: 30
Specs: (NOT SHOWN HERE)
Status is displayed every 100 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 1000
Thinning: 10


Summary of All Samples
                Mean         SD        MCSE        ESS          LB      Median
beta[1]  -10.8567577  0.6958495 0.233356594   1.824355 -12.1380065 -10.5879640
beta[2]    0.2679953  0.0229309 0.001917826   5.662208   0.2274266   0.2643966
Deviance  47.8375772 16.4302426 0.561644688 902.729104  42.5294569  44.1778224
LP       -32.7236336  8.2147402 0.280765300 902.836669 -45.3216193 -30.8908909
                  UB
beta[1]   -9.9915826
beta[2]    0.3122168
Deviance  73.0419916
LP       -30.0712861


Summary of Stationary Samples
                Mean         SD        MCSE        ESS          LB      Median
beta[1]  -11.9782481 0.15233913 0.073922711   4.501243 -12.2286033 -12.0057957
beta[2]    0.2968389 0.01324154 0.001397488 148.697808   0.2708737   0.2966517
Deviance  46.5714377 6.50825601 0.610732523 200.000000  42.5613730  44.0793108
LP       -32.1031461 3.25402458 0.280765300 200.000000 -42.0542422 -30.8570180
                  UB
beta[1]  -11.6295689
beta[2]    0.3221553
Deviance  66.4714655
LP       -30.0980992

Reversible-Jump 

The manual states: 'This version of reversible-jump is intended for variable selection and Bayesian Model Averaging (BMA)'. Hence I am skipping this algorithm.

Robust Adaptive Metropolis

Not intended as final method, it did get me close to target pretty quickly.
Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Algorithm = "RAM", Specs = list(alpha.star = 0.234, Dist = "N",
        gamma = 0.66))

Acceptance Rate: 0.7803
Algorithm: Robust Adaptive Metropolis
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
   beta[1]    beta[2]
  3887.959 492865.306

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
         All Stationary
Dbar  43.502     42.582
pD   332.753      0.000
DIC  376.255     42.582
Initial Values:
[1] -10   0

Iterations: 10000
Log(Marginal Likelihood): NA
Minutes of run-time: 0.09
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 100
Recommended Burn-In of Un-thinned Samples: 1000
Recommended Thinning: 60
Specs: (NOT SHOWN HERE)
Status is displayed every 100 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 1000
Thinning: 10


Summary of All Samples
                Mean          SD         MCSE      ESS          LB      Median
beta[1]  -12.0385096  0.10860616 0.0103508706 238.0731 -12.1374967 -12.0418931
beta[2]    0.2984386  0.01030721 0.0006992242 392.4126   0.2988978   0.2988978
Deviance  43.5022371 25.79738892 0.9975871047 784.2517  42.5818410  42.5818410
LP       -30.5692642 12.89790900 0.4986799500 784.3937 -30.1323235 -30.1091011
                  UB
beta[1]  -12.0418931
beta[2]    0.3008871
Deviance  42.6259729
LP       -30.1091011


Summary of Stationary Samples
                Mean SD      MCSE          ESS          LB      Median
beta[1]  -12.0418931  0 0.0000000 2.220446e-16 -12.0418931 -12.0418931
beta[2]    0.2988978  0 0.0000000 2.220446e-16   0.2988978   0.2988978
Deviance  42.5818410  0 0.0000000 2.220446e-16  42.5818410  42.5818410
LP       -30.1091011  0 0.4986799 2.220446e-16 -30.1091011 -30.1091011
                  UB
beta[1]  -12.0418931
beta[2]    0.2988978
Deviance  42.5818410
LP       -30.1091011

Sequential Adaptive Metropolis-within-Gibbs

The manual states: 'The Sequential Adaptive Metropolis-within-Gibbs (SAMWG) algorithm is for state-space models (SSMs), including dynamic linear models (DLMs)'. Hence skipped.

Sequential Metropolis-within-Gibbs

Not surprising, also for state space models

Slice

The speed of this algorithm was very dependent on the w variable in the specs. A value of 1 did not give any samples in 8 minutes, after which I stopped the algorithm. I took more reasonable values (0.1,0.001) gave much better speed. Then based on samples from that run, I took three times the standard deviation.

Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Iterations = 20000, Status = 2000, Thinning = 180, Algorithm = "Slice",
    Specs = list(m = Inf, w = c(6, 0.15)))

Acceptance Rate: 1
Algorithm: Slice Sampler
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
    beta[1]     beta[2]
4.846007889 0.003955124

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
        All Stationary
Dbar 44.605     44.605
pD    2.750      2.750
DIC  47.355     47.355
Initial Values:
[1] -10   0

Iterations: 20000
Log(Marginal Likelihood): NA
Minutes of run-time: 0.35
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 0
Recommended Burn-In of Un-thinned Samples: 0
Recommended Thinning: 360
Specs: (NOT SHOWN HERE)
Status is displayed every 2000 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 111
Thinning: 180


Summary of All Samples
                Mean         SD        MCSE ESS          LB     Median
beta[1]  -11.7318448 2.20523008 0.261103387 111 -16.5667692 -11.744123
beta[2]    0.2906884 0.05683097 0.006623955 111   0.1904892   0.290367
Deviance  44.6050665 2.34519974 0.217077260 111  42.5298124  43.979482
LP       -31.1194371 1.18126082 0.109764192 111 -34.6093217 -30.802723
                  UB
beta[1]   -7.7817803
beta[2]    0.4118796
Deviance  51.4744628
LP       -30.0779865


Summary of Stationary Samples
                Mean         SD        MCSE ESS          LB     Median
beta[1]  -11.7318448 2.20523008 0.261103387 111 -16.5667692 -11.744123
beta[2]    0.2906884 0.05683097 0.006623955 111   0.1904892   0.290367
Deviance  44.6050665 2.34519974 0.217077260 111  42.5298124  43.979482
LP       -31.1194371 1.18126082 0.109764192 111 -34.6093217 -30.802723
                  UB
beta[1]   -7.7817803
beta[2]    0.4118796
Deviance  51.4744628
LP       -30.0779865

Stochastic Gradient Langevin Dynamics

Manual states it is 'specifically designed for big data'. It reads chunks of data from a csv. Not the algorithm I would chose for this algorithm for this particular problem.

Tempered Hamiltonian Monte Carlo

This is described as a difficult algorithm. I found it most easy to get the epsilon parameter from HMC. That acceptance ratio was low, but a little tweaking gave a reasonable acceptance ratio.
 

Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Iterations = 20000, Status = 2000, Thinning = 30, Algorithm = "THMC",
    Specs = list(epsilon = 1 * c(0.1, 0.001), L = 2, Temperature = 2))

Acceptance Rate: 0.81315
Algorithm: Tempered Hamiltonian Monte Carlo
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
    beta[1]     beta[2]
4.097208702 0.002865553

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
        All Stationary
Dbar 44.604     44.297
pD    5.194      1.239
DIC  49.798     45.537
Initial Values:
[1] -10   0

Iterations: 20000
Log(Marginal Likelihood): NA
Minutes of run-time: 0.24
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 396
Recommended Burn-In of Un-thinned Samples: 11880
Recommended Thinning: 28
Specs: (NOT SHOWN HERE)
Status is displayed every 2000 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 666
Thinning: 30


Summary of All Samples
                Mean         SD       MCSE      ESS          LB     Median
beta[1]  -11.3484317 2.02500365 0.59031391 15.46143 -14.6424427 -11.367700
beta[2]    0.2811592 0.05245157 0.01542082 17.58824   0.1774371   0.283088
Deviance  44.6038844 3.22298411 0.52480075 53.87990  42.4834496  43.819153
LP       -31.1140562 1.60615315 0.26083225 54.26093 -34.1794035 -30.724503
                  UB
beta[1]   -7.4119854
beta[2]    0.3687353
Deviance  50.7269596
LP       -30.0508992


Summary of Stationary Samples
                Mean         SD       MCSE       ESS          LB      Median
beta[1]  -12.6249184 1.44154668 0.56557266  8.986556 -15.2654125 -12.6608107
beta[2]    0.3142047 0.03745808 0.01483015 10.253999   0.2391209   0.3140532
Deviance  44.2973967 1.57433150 0.19032460 52.886367  42.4824664  43.9576438
LP       -30.9751102 0.79587121 0.26083225 49.213284 -33.2191766 -30.8002036
                  UB
beta[1]   -9.7109253
beta[2]    0.3803829
Deviance  48.8130032
LP       -30.0510080

twalk

For this algorithm I could use the defaults.


Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Iterations = 20000, Status = 2000, Thinning = 30, Algorithm = "twalk",
    Specs = list(SIV = NULL, n1 = 4, at = 6, aw = 1.5))

Acceptance Rate: 0.1725
Algorithm: t-walk
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
    beta[1]     beta[2]
3.164272524 0.002344353

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
         All Stationary
Dbar  46.478     44.191
pD   451.235      1.437
DIC  497.713     45.628
Initial Values:
[1] -10   0

Iterations: 20000
Log(Marginal Likelihood): -21.94976
Minutes of run-time: 0.07
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 66
Recommended Burn-In of Un-thinned Samples: 1980
Recommended Thinning: 270
Specs: (NOT SHOWN HERE)
Status is displayed every 2000 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 666
Thinning: 30


Summary of All Samples
                Mean          SD       MCSE       ESS          LB      Median
beta[1]  -11.3588099  1.77939835 0.17301615  78.74936 -15.4171272 -11.0809062
beta[2]    0.2815837  0.04721043 0.00447975  84.80969   0.2051141   0.2748966
Deviance  46.4781029 30.04114466 3.11566520 156.73140  42.5026296  43.4851134
LP       -32.0508166 15.01964492 1.55762515 156.89660 -33.5951401 -30.5589320
                  UB
beta[1]   -8.3578718
beta[2]    0.3836584
Deviance  49.5203423
LP       -30.0633502


Summary of Stationary Samples
               Mean         SD        MCSE      ESS          LB      Median
beta[1]  -11.541692 1.78219147 0.161466015 195.5256 -15.5123754 -11.3408332
beta[2]    0.286237 0.04610116 0.004174498 198.1418   0.2024692   0.2797698
Deviance  44.191483 1.69526231 0.126968354 285.2395  42.4956818  43.6625138
LP       -30.909607 0.85274635 1.557625154 281.1672 -33.0747734 -30.6320642
                  UB
beta[1]   -8.3047902
beta[2]    0.3863038
Deviance  48.5623575
LP       -30.0609567

Univariate Eigenvector Slice Sampler

This is an adaptive algorithm. I chose A to ensure that the latter part of the samples was not adaptive any more. This also involved tweaking thinning and number of samples. The plot shows beta[2] is more difficult to sample.

Call:
LaplacesDemon(Model = Model, Data = MyData, Initial.Values = Initial.Values,
    Iterations = 40000, Status = 2000, Thinning = 50, Algorithm = "UESS",
    Specs = list(A = 50, B = NULL, m = 100, n = 0))

Acceptance Rate: 1
Algorithm: Univariate Eigenvector Slice Sampler
Covariance Matrix: (NOT SHOWN HERE; diagonal shown instead)
 beta[1]  beta[2]
6.918581 0.000000

Covariance (Diagonal) History: (NOT SHOWN HERE)
Deviance Information Criterion (DIC):
        All Stationary
Dbar 47.762     44.035
pD   50.213      0.813
DIC  97.975     44.848
Initial Values:
[1] -10   0

Iterations: 40000
Log(Marginal Likelihood): NA
Minutes of run-time: 3.8
Model: (NOT SHOWN HERE)
Monitor: (NOT SHOWN HERE)
Parameters (Number of): 2
Posterior1: (NOT SHOWN HERE)
Posterior2: (NOT SHOWN HERE)
Recommended Burn-In of Thinned Samples: 640
Recommended Burn-In of Un-thinned Samples: 32000
Recommended Thinning: 29
Specs: (NOT SHOWN HERE)
Status is displayed every 2000 iterations
Summary1: (SHOWN BELOW)
Summary2: (SHOWN BELOW)
Thinned Samples: 800
Thinning: 50


Summary of All Samples
                Mean          SD       MCSE      ESS           LB      Median
beta[1]  -10.4528494  3.29890424 1.15497942 4.177730 -15.53397494 -10.6787420
beta[2]    0.2580348  0.08555002 0.02999169 4.298667   0.03214576   0.2624027
Deviance  47.7621473 10.02129764 3.27536855 6.286801  42.49978608  44.3599919
LP       -32.6868085  4.99380143 1.63132674 6.305356 -51.27646862 -30.9846439
                  UB
beta[1]   -1.7181514
beta[2]    0.3910341
Deviance  85.0579082
LP       -30.0606649


Summary of Stationary Samples
                Mean         SD        MCSE        ESS          LB      Median
beta[1]   -9.9877388 0.72505888 0.343663481   4.067397 -11.1603881 -10.0218190
beta[2]    0.2457418 0.01751842 0.008457087   4.670402   0.2134636   0.2455366
Deviance  44.0350516 1.27504093 0.152202189 111.800645  42.5446199  43.6141453
LP       -30.8133272 0.63519594 1.631326739 112.973295 -32.2755446 -30.6036070
                  UB
beta[1]   -8.5556519
beta[2]    0.2753276
Deviance  46.9684759
LP       -30.0758784

Updating Sequential Adaptive Metropolis-within-Gibbs 

For space-state models.

Updating Sequential Metropolis-within-Gibbs

For space-state models

No comments:

Post a Comment