Models

hbayesdm.models.alt_delta(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Aversive Learning Task - Rescorla-Wagner (Delta) Model

Hierarchical Bayesian Modeling of the Aversive Learning Task [Browning2015] using the Rescorla-Wagner (Delta) Model with the following parameters: “A” (learning rate), “beta” (inverse temperature), “gamma” (risk preference).

Browning2015

Browning, M., Behrens, T. E., Jocham, G., O’Reilly, J. X., & Bishop, S. J. (2015). Anxious individuals have difficulty learning the causal statistics of aversive environments. Nature Neuroscience, 18(4), 590.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Aversive Learning Task, there should be 5 columns of data with the labels “subjID”, “choice”, “outcome”, “bluePunish”, “orangePunish”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial (blue == 1, orange == 2).

  • “outcome”: Integer value representing the outcome of the given trial (punishment == 1, and non-punishment == 0).

  • “bluePunish”: Floating point value representing the magnitude of punishment for blue on that trial (e.g., 10, 97).

  • “orangePunish”: Floating point value representing the magnitude of punishment for orange on that trial (e.g., 23, 45).
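As an illustration, a minimal data set with the required columns can be assembled in pandas and written as the tab-delimited text file this function expects (the values below are invented for demonstration only):

```python
import pandas as pd

# Two illustrative trials for one subject; column names are the required
# ones, values are made up.
df = pd.DataFrame({
    "subjID":       ["s01", "s01"],
    "choice":       [1, 2],        # blue == 1, orange == 2
    "outcome":      [1, 0],        # punishment == 1, non-punishment == 0
    "bluePunish":   [10.0, 97.0],
    "orangePunish": [23.0, 45.0],
})
df.to_csv("alt_data.txt", sep="\t", index=False)  # tab-delimited, as required
```

Such a file (or the DataFrame itself) can then be passed as the data argument.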

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”, “bluePunish”, “orangePunish”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded rather than stored. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.
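Under the defaults above, the number of posterior draws actually retained works out as:

```python
# Posterior draws kept per chain and overall, under the default settings
# niter=4000, nwarmup=1000, nthin=1, nchain=4.
niter, nwarmup, nthin, nchain = 4000, 1000, 1, 4
per_chain = (niter - nwarmup) // nthin  # 3000 draws kept per chain
total = per_chain * nchain              # 12000 draws overall
```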

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default, which takes initial values from a variational Bayes approximation), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration; the number of leapfrog steps per iteration is bounded by 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘alt_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import alt_delta

# Run the model and store results in "output"
output = alt_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.alt_gamma(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Aversive Learning Task - Rescorla-Wagner (Gamma) Model

Hierarchical Bayesian Modeling of the Aversive Learning Task [Browning2015] using the Rescorla-Wagner (Gamma) Model with the following parameters: “A” (learning rate), “beta” (inverse temperature), “gamma” (risk preference).

Browning2015

Browning, M., Behrens, T. E., Jocham, G., O’Reilly, J. X., & Bishop, S. J. (2015). Anxious individuals have difficulty learning the causal statistics of aversive environments. Nature Neuroscience, 18(4), 590.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Aversive Learning Task, there should be 5 columns of data with the labels “subjID”, “choice”, “outcome”, “bluePunish”, “orangePunish”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial (blue == 1, orange == 2).

  • “outcome”: Integer value representing the outcome of the given trial (punishment == 1, and non-punishment == 0).

  • “bluePunish”: Floating point value representing the magnitude of punishment for blue on that trial (e.g., 10, 97).

  • “orangePunish”: Floating point value representing the magnitude of punishment for orange on that trial (e.g., 23, 45).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”, “bluePunish”, “orangePunish”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded rather than stored. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default, which takes initial values from a variational Bayes approximation), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration; the number of leapfrog steps per iteration is bounded by 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘alt_gamma’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import alt_gamma

# Run the model and store results in "output"
output = alt_gamma(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit2arm_delta(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

2-Armed Bandit Task - Rescorla-Wagner (Delta) Model

Hierarchical Bayesian Modeling of the 2-Armed Bandit Task [Erev2010], [Hertwig2004] using the Rescorla-Wagner (Delta) Model with the following parameters: “A” (learning rate), “tau” (inverse temperature).

Erev2010

Erev, I., Ert, E., Roth, A. E., Haruvy, E., Herzog, S. M., Hau, R., et al. (2010). A choice prediction competition: Choices from experience and from description. Journal of Behavioral Decision Making, 23(1), 15-47. https://doi.org/10.1002/bdm.683

Hertwig2004

Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions From Experience and the Effect of Rare Events in Risky Choice. Psychological Science, 15(8), 534-539. https://doi.org/10.1111/j.0956-7976.2004.00715.x

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 2-Armed Bandit Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of the given trial (where reward == 1, and loss == -1).
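As a sketch of the delta rule and softmax named above (parameter names follow the docstring; this is illustrative Python, not the Stan implementation):

```python
import math

def delta_update(V, choice, outcome, A):
    """Rescorla-Wagner update: move the chosen option's value toward the outcome."""
    V = list(V)
    i = choice - 1                   # choice is coded 1 or 2
    V[i] += A * (outcome - V[i])     # prediction-error update with learning rate A
    return V

def p_choose_1(V, tau):
    """Softmax probability of choosing option 1 with inverse temperature tau."""
    z = [math.exp(tau * v) for v in V]
    return z[0] / sum(z)
```

For example, after a rewarded choice of option 1 its value rises, and the softmax probability of repeating that choice exceeds 0.5.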

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded rather than stored. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default, which takes initial values from a variational Bayes approximation), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration; the number of leapfrog steps per iteration is bounded by 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit2arm_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit2arm_delta

# Run the model and store results in "output"
output = bandit2arm_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm2_kalman_filter(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task (modified) - Kalman Filter

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task (modified) using a Kalman Filter [Daw2006] with the following parameters: “lambda” (decay factor), “theta” (decay center), “beta” (inverse softmax temperature), “mu0” (anticipated initial mean of all 4 options), “s0” (anticipated initial sd (uncertainty factor) of all 4 options), “sD” (sd of diffusion noise).

Daw2006

Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task (modified), there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “outcome”: Integer value representing the outcome of the given trial (where reward == 1, and loss == -1).
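The Kalman-filter updates described above can be sketched as follows (parameter names follow the docstring; the observation-noise sd sigma_o is an extra assumption of this sketch, fixed as in Daw et al. (2006), and this is not the Stan implementation):

```python
import numpy as np

def kalman_step(mu, sig2, choice, outcome, lam, theta, sD, sigma_o=4.0):
    """One trial: diffusion of all arms, then observation update of the chosen arm.

    mu, sig2 are arrays of posterior means/variances for the 4 arms;
    choice is a 0-based arm index here.
    """
    mu, sig2 = mu.copy(), sig2.copy()
    # Between trials every arm's mean decays toward theta and its variance grows.
    mu = lam * mu + (1.0 - lam) * theta
    sig2 = lam ** 2 * sig2 + sD ** 2
    # Kalman gain for the chosen arm, then mean/variance update from the outcome.
    k = sig2[choice] / (sig2[choice] + sigma_o ** 2)
    mu[choice] += k * (outcome - mu[choice])
    sig2[choice] *= 1.0 - k
    return mu, sig2
```

After an unexpectedly large outcome, the chosen arm's mean moves toward the outcome and its uncertainty shrinks, while unchosen arms only diffuse.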

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded rather than stored. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default, which takes initial values from a variational Bayes approximation), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration; the number of leapfrog steps per iteration is bounded by 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit4arm2_kalman_filter’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit4arm2_kalman_filter

# Run the model and store results in "output"
output = bandit4arm2_kalman_filter(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm_2par_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task - 3-Parameter Model with xi (noise), but without C (choice perseveration), R (reward sensitivity), or P (punishment sensitivity)

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task using a 3-parameter model with xi (noise), but without C (choice perseveration), R (reward sensitivity), or P (punishment sensitivity) [Aylward2018], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “xi” (noise).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018). Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArXiv. https://doi.org/10.31234/osf.io/k5b8m

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).
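The xi (noise) parameter above acts as a lapse rate: with probability xi the choice is uniformly random across the four options, and otherwise it follows the value-based choice probabilities. A sketch of that mixing step (illustrative only, not the Stan code):

```python
def lapse_choice_probs(p_value_based, xi, n_options=4):
    """Mix value-based choice probabilities with a uniform lapse component."""
    return [(1.0 - xi) * p + xi / n_options for p in p_value_based]
```

The mixed probabilities still sum to 1, and no option's probability can fall below xi / n_options, which is what keeps the likelihood away from zero on occasional "noise" choices.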

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded rather than stored. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth per iteration, i.e., an upper bound on how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
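As a concrete illustration of the expected input, a minimal data-set with the five required alt_delta columns can be built as a Pandas DataFrame (the values below are made up for illustration only):

```python
import pandas as pd

# Made-up two-trial data-set for a single subject, using the five
# column labels that alt_delta requires (column order does not matter).
df = pd.DataFrame({
    'subjID': [1, 1],
    'choice': [1, 2],            # blue == 1, orange == 2
    'outcome': [1, 0],           # punishment == 1, non-punishment == 0
    'bluePunish': [10.0, 97.0],  # punishment magnitude for blue
    'orangePunish': [23.0, 45.0],
})
```

The DataFrame can then be passed directly via data=df; any extra columns (e.g. reaction times) are ignored during modeling.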

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘alt_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import alt_delta

# Run the model and store results in "output"
output = alt_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm_4par(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task - 4 Parameter Model, without C (choice perseveration)

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task using the 4 Parameter Model, without C (choice perseveration) [Seymour2012], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity).

Seymour2012

Seymour, Daw, Roiser, Dayan, & Dolan (2012). Serotonin Selectively Modulates Reward Value in Human Decision-Making. J Neuro, 32(17), 5833-5842.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Because of the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can heavily influence the generated posterior distributions, so nwarmup can be set to a higher number to curb that influence on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth per iteration, i.e., an upper bound on how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
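Taken together, niter, nwarmup, nchain, and nthin determine how many posterior draws are retained. With the default values above, the arithmetic works out as follows:

```python
# Retained posterior draws under the default sampler settings.
niter, nwarmup, nchain, nthin = 4000, 1000, 4, 1

samples_per_chain = (niter - nwarmup) // nthin  # 3000 draws per chain
total_samples = nchain * samples_per_chain      # 12000 draws overall
```

Raising nthin (e.g. when autocorrelation is high) reduces the retained draws proportionally, so niter may need to be increased to compensate.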

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit4arm_4par’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit4arm_4par

# Run the model and store results in "output"
output = bandit4arm_4par(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task - 5 Parameter Model, without C (choice perseveration) but with xi (noise)

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task using the 5 Parameter Model, without C (choice perseveration) but with xi (noise) [Seymour2012], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise).

Seymour2012

Seymour, Daw, Roiser, Dayan, & Dolan (2012). Serotonin Selectively Modulates Reward Value in Human Decision-Making. J Neuro, 32(17), 5833-5842.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
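Since the loader expects a tab-delimited text file, one straightforward way to produce a compatible file is via Pandas. The filename and values here are illustrative only:

```python
import pandas as pd

# Made-up single-subject data with the four required columns.
df = pd.DataFrame({
    'subjID': [1, 1, 1],
    'choice': [1, 3, 4],
    'gain': [50.0, 100.0, 0.0],
    'loss': [0.0, -50.0, -50.0],
})

# Write a tab-delimited text file; the resulting path can then be
# passed to the model function as data='bandit_data.txt'.
df.to_csv('bandit_data.txt', sep='\t', index=False)
```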

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Because of the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can heavily influence the generated posterior distributions, so nwarmup can be set to a higher number to curb that influence on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth per iteration, i.e., an upper bound on how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit4arm_lapse’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit4arm_lapse

# Run the model and store results in "output"
output = bandit4arm_lapse(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm_lapse_decay(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task - 5 Parameter Model, without C (choice perseveration) but with xi (noise) and an added decay rate (Niv et al., 2015, J. Neuro).

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task using the 5 Parameter Model, without C (choice perseveration) but with xi (noise) and an added decay rate (Niv et al., 2015, J. Neuro) [Aylward2018], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise), “d” (decay rate).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018) Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArxiv. 10.31234/osf.io/k5b8m
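The decay mechanism referenced above can be sketched in a few lines. This is an illustrative reading of a delta-rule update with value decay (Niv et al., 2015), not the exact Stan implementation used by this model, and all variable names are invented for the sketch:

```python
# One trial of a delta-rule update with value decay (illustrative sketch).
d, Arew = 0.1, 0.3               # decay rate, reward learning rate
values = [0.5, 0.2, 0.0, 0.0]    # current value estimate for each arm
choice, reward = 0, 1.0          # chosen arm (0-indexed) and its outcome

values = [v * (1 - d) for v in values]              # all values decay toward 0
values[choice] += Arew * (reward - values[choice])  # delta-rule update of chosen arm
```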

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Because of the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can heavily influence the generated posterior distributions, so nwarmup can be set to a higher number to curb that influence on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth per iteration, i.e., an upper bound on how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit4arm_lapse_decay’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit4arm_lapse_decay

# Run the model and store results in "output"
output = bandit4arm_lapse_decay(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bandit4arm_singleA_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

4-Armed Bandit Task - 4 Parameter Model, without C (choice perseveration) but with xi (noise). A single learning rate is shared between reward and punishment.

Hierarchical Bayesian Modeling of the 4-Armed Bandit Task using the 4 Parameter Model, without C (choice perseveration) but with xi (noise) and a single learning rate shared between reward and punishment [Aylward2018], with the following parameters: “A” (learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018) Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArxiv. 10.31234/osf.io/k5b8m

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 4-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, or 4.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, these are equivalent to burn-in samples. Because of the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can heavily influence the generated posterior distributions, so nwarmup can be set to a higher number to curb that influence on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth per iteration, i.e., an upper bound on how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
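When supplying your own initial values instead of 'fixed' or 'random', a plain sequence of floats can be passed via inits. The assumption in this sketch is that one value is expected per group-level parameter, in the order listed above (A, R, P, xi); the numbers themselves are arbitrary placeholders:

```python
# Hypothetical starting values, one per parameter of bandit4arm_singleA_lapse:
# A (learning rate), R (reward sensitivity), P (punishment sensitivity), xi (noise).
custom_inits = [0.2, 5.0, 5.0, 0.1]

# These could then be supplied when fitting, e.g.:
# output = bandit4arm_singleA_lapse(data='example', inits=custom_inits)
```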

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bandit4arm_singleA_lapse’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bandit4arm_singleA_lapse

# Run the model and store results in "output"
output = bandit4arm_singleA_lapse(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_2par_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - 3 Parameter Model, without C (choice perseveration), R (reward sensitivity), or P (punishment sensitivity), but with xi (noise)

Hierarchical Bayesian Modeling of the N-Armed Bandit Task using the 3 Parameter Model, without C (choice perseveration), R (reward sensitivity), or P (punishment sensitivity), but with xi (noise) [Aylward2018], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “xi” (noise).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018) Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArxiv. 10.31234/osf.io/k5b8m

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
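Before fitting, the required layout can be verified with pandas. This is only a sketch: the inline excerpt below is hypothetical toy data, not the packaged example set.

```python
import io

import pandas as pd

# Hypothetical tab-delimited excerpt in the required layout
raw = (
    "subjID\tchoice\tgain\tloss\n"
    "1\t2\t50\t0\n"
    "1\t1\t0\t-50\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t")

# Extra columns would simply be ignored by the model; missing ones are a problem
required = {"subjID", "choice", "gain", "loss"}
missing = required - set(df.columns)
print(sorted(missing))  # an empty list means the layout is usable
```

The same check works unchanged for a real file by replacing the `io.StringIO` wrapper with the file path.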

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples at the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration (i.e., at most 2^max_treedepth leapfrog steps per iteration). See note below.

  • **additional_args – For this model, the following model-specific argument may also be set:

    • Narm: Number of arms used in the N-Armed Bandit Task. If not given, the number of unique choices in the data will be used.
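When Narm is omitted, the fallback described above amounts to counting distinct values in the choice column. A minimal sketch (toy data, not the packaged example):

```python
import pandas as pd

# Toy trials where only arms 1, 2, and 4 were ever chosen
df = pd.DataFrame({"choice": [1, 2, 4, 2, 1]})

# Fallback when Narm is not given: number of unique choices
narm = df["choice"].nunique()
print(narm)  # 3 -- undercounts the true 4 arms, since arm 3 was never chosen
```

Passing Narm explicitly is therefore safer whenever some arms may never have been chosen by a subject.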

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_2par_lapse’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_2par_lapse

# Run the model and store results in "output"
output = banditNarm_2par_lapse(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_4par(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - 4 Parameter Model, without C (choice perseveration)

Hierarchical Bayesian Modeling of the N-Armed Bandit Task using 4 Parameter Model, without C (choice perseveration) [Seymour2012] with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity).

Seymour2012

Seymour, Daw, Roiser, Dayan, & Dolan (2012). Serotonin Selectively Modulates Reward Value in Human Decision-Making. Journal of Neuroscience, 32(17), 5833-5842.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples at the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration (i.e., at most 2^max_treedepth leapfrog steps per iteration). See note below.

  • **additional_args – For this model, the following model-specific argument may also be set:

    • Narm: Number of arms used in the N-Armed Bandit Task. If not given, the number of unique choices in the data will be used.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_4par’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.
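The relationship between par_vals and all_ind_pars can be illustrated with a mock draw array. The shapes and the parameter name "Arew" here are illustrative assumptions, not output from a real fit:

```python
from collections import OrderedDict

import numpy as np

# Mock posterior draws: one (n_samples, n_subjects) array per parameter
par_vals = OrderedDict(Arew=np.array([[0.2, 0.4],
                                      [0.3, 0.5]]))

# With ind_pars='mean', each subject's point estimate is the posterior mean
summary = {name: draws.mean(axis=0) for name, draws in par_vals.items()}
print(summary["Arew"])  # [0.25 0.45]
```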

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_4par

# Run the model and store results in "output"
output = banditNarm_4par(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_delta(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - Rescorla-Wagner (Delta) Model

Hierarchical Bayesian Modeling of the N-Armed Bandit Task [Erev2010], [Hertwig2004] using Rescorla-Wagner (Delta) Model with the following parameters: “A” (learning rate), “tau” (inverse temperature).

Erev2010

Erev, I., Ert, E., Roth, A. E., Haruvy, E., Herzog, S. M., Hau, R., et al. (2010). A choice prediction competition: Choices from experience and from description. Journal of Behavioral Decision Making, 23(1), 15-47. https://doi.org/10.1002/bdm.683

Hertwig2004

Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions From Experience and the Effect of Rare Events in Risky Choice. Psychological Science, 15(8), 534-539. https://doi.org/10.1111/j.0956-7976.2004.00715.x
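A minimal sketch of the Rescorla-Wagner (delta) update and softmax choice rule that this model assumes, with illustrative parameter values (this is not the Stan implementation itself):

```python
import numpy as np

def delta_update(V, choice, reward, A):
    """Move the chosen arm's value toward the obtained reward by rate A."""
    V = V.copy()
    V[choice] += A * (reward - V[choice])
    return V

def softmax(V, tau):
    """Choice probabilities under inverse temperature tau."""
    e = np.exp(tau * (V - V.max()))  # shift by max for numerical stability
    return e / e.sum()

V = np.zeros(4)                      # initial values for a 4-arm example
V = delta_update(V, choice=0, reward=1.0, A=0.3)
p = softmax(V, tau=2.0)
print(V[0], p[0] > p[1])  # 0.3 True
```

A higher learning rate A makes V track recent outcomes more closely; a higher tau makes choices more deterministic toward the highest-valued arm.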

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples at the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration (i.e., at most 2^max_treedepth leapfrog steps per iteration). See note below.

  • **additional_args – For this model, the following model-specific argument may also be set:

    • Narm: Number of arms used in the N-Armed Bandit Task. If not given, the number of unique choices in the data will be used.
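With the defaults above, the number of retained posterior draws follows directly from niter, nwarmup, nthin, and nchain:

```python
# Default sampler settings for this function
niter, nwarmup, nthin, nchain = 4000, 1000, 1, 4

# Draws kept per chain after discarding warm-up and thinning
per_chain = (niter - nwarmup) // nthin
total_draws = per_chain * nchain
print(per_chain, total_draws)  # 3000 12000
```

Raising nthin trades away draws for lower autocorrelation, while raising nwarmup trades them for better adaptation.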

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_delta

# Run the model and store results in "output"
output = banditNarm_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_kalman_filter(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task (modified) - Kalman Filter

Hierarchical Bayesian Modeling of the N-Armed Bandit Task (modified) using a Kalman Filter [Daw2006] with the following parameters: “lambda” (decay factor), “theta” (decay center), “beta” (inverse softmax temperature), “mu0” (anticipated initial mean of all options), “s0” (anticipated initial SD, i.e., uncertainty, of all options), “sD” (SD of the diffusion noise).

Daw2006

Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.
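A sketch of the per-trial Kalman update this model is built on: an observation update for the chosen arm, then diffusion of all arms toward the decay center. The observation-noise term sd_obs is an assumption added for illustration; the other names follow the parameter list above. This is not the Stan implementation.

```python
import numpy as np

def kalman_bandit_step(mu, var, choice, reward, lam, theta, sd_diff, sd_obs):
    """One trial: Kalman observation update for the chosen arm, then diffusion."""
    mu, var = mu.copy(), var.copy()
    # Kalman gain for the chosen arm (sd_obs: assumed observation noise)
    k = var[choice] / (var[choice] + sd_obs ** 2)
    mu[choice] += k * (reward - mu[choice])
    var[choice] *= 1.0 - k
    # Between trials, all means decay toward the center theta
    mu = lam * mu + (1.0 - lam) * theta
    var = lam ** 2 * var + sd_diff ** 2
    return mu, var

mu0, s0 = np.full(4, 50.0), np.full(4, 4.0)   # illustrative mu0 and s0
mu, var = kalman_bandit_step(mu0, s0 ** 2, choice=0, reward=80.0,
                             lam=0.95, theta=50.0, sd_diff=2.0, sd_obs=4.0)
```

After one trial the chosen arm's mean has moved toward the reward and its uncertainty has shrunk, while unchosen arms only diffuse, which is what drives uncertainty-guided exploration in this model family.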

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task (modified), there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples at the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates on each iteration (i.e., at most 2^max_treedepth leapfrog steps per iteration). See note below.

  • **additional_args – For this model, the following model-specific argument may also be set:

    • Narm: Number of arms used in the N-Armed Bandit Task. If not given, the number of unique choices in the data will be used.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_kalman_filter’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_kalman_filter

# Run the model and store results in "output"
output = banditNarm_kalman_filter(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - 5 Parameter Model, without C (choice perseveration) but with xi (noise)

Hierarchical Bayesian Modeling of the N-Armed Bandit Task using 5 Parameter Model, without C (choice perseveration) but with xi (noise) [Seymour2012] with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise).

Seymour2012

Seymour, Daw, Roiser, Dayan, & Dolan (2012). Serotonin Selectively Modulates Reward Value in Human Decision-Making. Journal of Neuroscience, 32(17), 5833-5842.
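The role of xi can be sketched as a lapse mixture: with probability xi the choice is uniformly random, otherwise it follows the value-based policy (shown here as a unit-temperature softmax). Illustrative only, not the Stan implementation:

```python
import numpy as np

def lapse_choice_probs(V, xi):
    """Mix a softmax policy with a uniform 'lapse' component of weight xi."""
    e = np.exp(V - V.max())
    p = e / e.sum()                      # value-based softmax probabilities
    return (1.0 - xi) * p + xi / len(V)  # uniform lapse with probability xi

V = np.array([1.0, 0.0, 0.0, 0.0])
p = lapse_choice_probs(V, xi=0.2)
print(lapse_choice_probs(V, xi=1.0))  # fully random: all 0.25
```

Because the lapse floor keeps every option's probability above zero, xi absorbs occasional off-policy choices that would otherwise distort the learning-rate and sensitivity estimates.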

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples at the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the No-U-Turn sampler on each iteration, which bounds the number of leapfrog steps taken. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • Narm: Number of arms used in the Multi-armed Bandit Task. If not given, the number of unique choices in the data will be used.
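    By default the number of arms is inferred from the distinct values in the “choice” column; passing Narm explicitly matters when some arms were never chosen. A minimal sketch of that default (toy data with hypothetical values; the commented call shows where Narm would be passed):

    ```python
    import pandas as pd

    # Toy trial-by-trial data in the format described above (values are illustrative)
    df = pd.DataFrame({
        'subjID': [1, 1, 1, 2, 2, 2],
        'choice': [1, 3, 2, 1, 1, 2],
        'gain':   [50.0, 0.0, 100.0, 50.0, 0.0, 100.0],
        'loss':   [0.0, -50.0, 0.0, 0.0, -50.0, 0.0],
    })

    # Default behavior: the number of arms is taken from the unique choices
    inferred_narm = df['choice'].nunique()  # 3 in this toy data-set

    # Explicit override, e.g. if a 4th arm exists but was never chosen:
    # output = banditNarm_lapse(data=df, Narm=4)
    ```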

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_lapse’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.
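  Of these components, all_ind_pars is typically the most convenient to work with: one row per subject, one column per parameter. A sketch of a group-level summary on a hypothetical stand-in frame (the column names and values here are made up; the real frame comes from the fitted model):

  ```python
  import pandas as pd

  # Hypothetical stand-in for output.all_ind_pars (illustrative values only)
  all_ind_pars = pd.DataFrame(
      {'Arew': [0.12, 0.30], 'Apun': [0.05, 0.22], 'xi': [0.02, 0.08]},
      index=[1, 2],  # subjID
  )

  # Group-level means of the individual parameter estimates
  group_means = all_ind_pars.mean()
  ```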

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_lapse

# Run the model and store results in "output"
output = banditNarm_lapse(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.banditNarm_lapse_decay(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - 5 Parameter Model, without C (choice perseveration) but with xi (noise). Added decay rate (Niv et al., 2015, J. Neuro).

Hierarchical Bayesian Modeling of the N-Armed Bandit Task using the 5-parameter model without C (choice perseveration) but with xi (noise) and an added decay rate (Niv et al., 2015, J. Neurosci.) [Aylward2018], with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise), “d” (decay rate).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018) Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArxiv. 10.31234/osf.io/k5b8m

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
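The expected on-disk format can be verified with a quick pandas round-trip before fitting. A minimal sketch (an in-memory string stands in for a real tab-delimited datafile; the contents are illustrative):

```python
import io
import pandas as pd

# A tab-delimited file, as described above (rows = trials, columns = variables)
raw = "subjID\tchoice\tgain\tloss\n1\t2\t50.0\t0.0\n1\t1\t0.0\t-50.0\n"

df = pd.read_csv(io.StringIO(raw), sep='\t')

# The required columns must be present; extra columns are simply ignored
required = {'subjID', 'choice', 'gain', 'loss'}
missing = required - set(df.columns)

# df (or the path to the real file) can then be passed as the `data` argument
```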

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the No-U-Turn sampler on each iteration, which bounds the number of leapfrog steps taken. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • Narm: Number of arms used in the Multi-armed Bandit Task. If not given, the number of unique choices in the data will be used.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_lapse_decay’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_lapse_decay

# Run the model and store results in "output"
output = banditNarm_lapse_decay(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
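The rhat(output, less=1.1) check above is based on the potential scale reduction statistic, which compares between-chain and within-chain variance. A minimal sketch of the basic (non-split) computation on toy chains, assuming numpy; well-mixed chains give a value near 1:

```python
import numpy as np

rng = np.random.default_rng(0)
# Four toy chains drawn from the same stationary distribution
chains = rng.normal(loc=0.0, scale=1.0, size=(4, 1000))

m, n = chains.shape
chain_means = chains.mean(axis=1)
W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
B = n * chain_means.var(ddof=1)         # between-chain variance
var_hat = (n - 1) / n * W + B / n       # pooled posterior variance estimate
rhat = np.sqrt(var_hat / W)
# For converged chains rhat is close to 1 (well under the 1.1 cutoff)
```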
hbayesdm.models.banditNarm_singleA_lapse(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

N-Armed Bandit Task - 4 Parameter Model, without C (choice perseveration) but with xi (noise). A single learning rate is used for both R and P.

Hierarchical Bayesian Modeling of the N-Armed Bandit Task using the 4-parameter model without C (choice perseveration) but with xi (noise), with a single learning rate for both R and P [Aylward2018], with the following parameters: “A” (learning rate), “R” (reward sensitivity), “P” (punishment sensitivity), “xi” (noise).

Aylward2018

Aylward, Valton, Ahn, Bond, Dayan, Roiser, & Robinson (2018) Altered decision-making under uncertainty in unmedicated mood and anxiety disorders. PsyArxiv. 10.31234/osf.io/k5b8m

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the N-Armed Bandit Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on the given trial: 1, 2, 3, … N.

  • “gain”: Floating point value representing the amount of currency won on the given trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on the given trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the No-U-Turn sampler on each iteration, which bounds the number of leapfrog steps taken. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • Narm: Number of arms used in the Multi-armed Bandit Task. If not given, the number of unique choices in the data will be used.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘banditNarm_singleA_lapse’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import banditNarm_singleA_lapse

# Run the model and store results in "output"
output = banditNarm_singleA_lapse(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bart_ewmv(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Balloon Analogue Risk Task - Exponential-Weight Mean-Variance Model

Hierarchical Bayesian Modeling of the Balloon Analogue Risk Task using Exponential-Weight Mean-Variance Model [Park2020] with the following parameters: “phi” (prior belief of burst), “eta” (updating exponent), “rho” (risk preference), “tau” (inverse temperature), “lambda” (loss aversion).

Park2020

Park, H., Yang, J., Vassileva, J., & Ahn, W. (2020). The Exponential-Weight Mean-Variance Model: A novel computational model for the Balloon Analogue Risk Task. https://doi.org/10.31234/osf.io/sdzj4

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Balloon Analogue Risk Task, there should be 3 columns of data with the labels “subjID”, “pumps”, “explosion”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “pumps”: Integer value representing the number of pumps on the given trial.

  • “explosion”: Integer value representing whether the balloon burst on the given trial (0: intact, 1: burst).
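  A minimal BART data-set in this format can be assembled directly as a DataFrame; a sketch with illustrative values (each row is one balloon trial), including sanity checks that mirror the column definitions:

  ```python
  import pandas as pd

  # Toy BART data in the format described above (values are illustrative)
  df = pd.DataFrame({
      'subjID':    [1, 1, 1, 2, 2],
      'pumps':     [4, 7, 2, 5, 9],
      'explosion': [0, 1, 0, 0, 1],
  })

  # Sanity checks: explosion is binary, pump counts are non-negative integers
  ok_explosion = set(df['explosion']).issubset({0, 1})
  ok_pumps = bool((df['pumps'] >= 0).all())

  # df can then be passed directly, e.g. bart_ewmv(data=df)
  ```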

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “pumps”, “explosion”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the No-U-Turn sampler on each iteration, which bounds the number of leapfrog steps taken. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bart_ewmv’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bart_ewmv

# Run the model and store results in "output"
output = bart_ewmv(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.bart_par4(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Balloon Analogue Risk Task - Re-parameterized version of BART model with 4 parameters

Hierarchical Bayesian Modeling of the Balloon Analogue Risk Task using Re-parameterized version of BART model with 4 parameters [van_Ravenzwaaij2011] with the following parameters: “phi” (prior belief of balloon not bursting), “eta” (updating rate), “gam” (risk-taking parameter), “tau” (inverse temperature).

van_Ravenzwaaij2011

van Ravenzwaaij, D., Dutilh, G., & Wagenmakers, E. J. (2011). Cognitive model decomposition of the BART: Assessment and application. Journal of Mathematical Psychology, 55(1), 94-105.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Balloon Analogue Risk Task, there should be 3 columns of data with the labels “subjID”, “pumps”, “explosion”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “pumps”: Integer value representing the number of pumps on the given trial.

  • “explosion”: Integer value representing whether the balloon burst on the given trial (0: intact, 1: burst).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “pumps”, “explosion”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the No-U-Turn sampler on each iteration, which bounds the number of leapfrog steps taken. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘bart_par4’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import bart_par4

# Run the model and store results in "output"
output = bart_par4(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.cgt_cm(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Cambridge Gambling Task - Cumulative Model

Hierarchical Bayesian Modeling of the Cambridge Gambling Task [Rogers1999] using Cumulative Model with the following parameters: “alpha” (probability distortion), “c” (color bias), “rho” (relative loss sensitivity), “beta” (discounting rate), “gamma” (choice sensitivity).

Rogers1999

Rogers, R. D., Everitt, B. J., Baldacchino, A., Blackshaw, A. J., Swainson, R., Wynne, K., Baker, N. B., Hunter, J., Carthy, T., London, M., Deakin, J. F. W., Sahakian, B. J., Robbins, T. W. (1999). Dissociable deficits in the decision-making cognition of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology, 20, 322–339.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Cambridge Gambling Task, there should be 7 columns of data with the labels “subjID”, “gamble_type”, “percentage_staked”, “trial_initial_points”, “assessment_stage”, “red_chosen”, “n_red_boxes”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “gamble_type”: Integer value representing whether the bets on the current trial were presented in descending (0) or ascending (1) order.

  • “percentage_staked”: Integer value representing the bet percentage (not proportion) selected on the current trial: 5, 25, 50, 75, or 95.

  • “trial_initial_points”: Floating point value representing the number of points that the subject has at the start of the current trial (e.g., 100, 150, etc.).

  • “assessment_stage”: Integer value representing whether the current trial is a practice trial (0) or a test trial (1). Only test trials are used for model fitting.

  • “red_chosen”: Integer value representing whether the red color was chosen (1) versus the blue color (0).

  • “n_red_boxes”: Integer value representing the number of red boxes shown on the current trial: 1, 2, 3,…, or 9.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
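As a sketch of the expected input shape (all values below are invented for illustration), a minimal pandas DataFrame carrying the seven required columns might be assembled and passed as data like this:

```python
import pandas as pd

# Hypothetical two-trial data-set for a single subject; a real analysis
# needs the full trial sequence for every subject of interest.
df = pd.DataFrame({
    "subjID": ["s01", "s01"],
    "gamble_type": [0, 1],              # descending (0) or ascending (1) bet order
    "percentage_staked": [25, 75],      # bet percentage: 5, 25, 50, 75, or 95
    "trial_initial_points": [100.0, 125.0],
    "assessment_stage": [1, 1],         # only test trials (1) are used for fitting
    "red_chosen": [1, 0],               # red (1) versus blue (0)
    "n_red_boxes": [6, 3],              # 1 through 9
})

required = {"subjID", "gamble_type", "percentage_staked",
            "trial_initial_points", "assessment_stage",
            "red_chosen", "n_red_boxes"}
assert required.issubset(df.columns)

# The DataFrame can then be passed directly, e.g.:
# output = cgt_cm(data=df)
```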

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “gamble_type”, “percentage_staked”, “trial_initial_points”, “assessment_stage”, “red_chosen”, “n_red_boxes”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “y_hat_col”, “y_hat_bet”, “bet_utils”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – (Currently not available.) Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
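For orientation, the sampler settings above combine into the number of retained posterior draws by simple arithmetic. This is standard MCMC bookkeeping under the defaults shown, not an hbayesdm-specific API:

```python
# Draws kept per chain = (iterations - warm-up) / thinning interval,
# summed over chains. With the defaults of this function:
niter, nwarmup, nchain, nthin = 4000, 1000, 4, 1
kept = (niter - nwarmup) // nthin * nchain
print(kept)  # prints 12000
```

Raising nthin reduces the number of stored draws proportionally, which is why it is usually increased only when auto-correlation is high.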

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘cgt_cm’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import cgt_cm

# Run the model and store results in "output"
output = cgt_cm(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.choiceRT_ddm(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Choice Reaction Time Task - Drift Diffusion Model

Hierarchical Bayesian Modeling of the Choice Reaction Time Task using Drift Diffusion Model [Ratcliff1978] with the following parameters: “alpha” (boundary separation), “beta” (bias), “delta” (drift rate), “tau” (non-decision time).

Note

Note that this implementation is NOT the full Drift Diffusion Model as described in Ratcliff (1978). This implementation estimates the drift rate, boundary separation, starting point, and non-decision time; but not the between- and within-trial variances in these parameters.

Note

Code for this model is based on code and comments by Guido Biele, Joseph Burling, Andrew Ellis, and potentially others on the Stan mailing list.

Ratcliff1978

Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59-108. https://doi.org/10.1037/0033-295X.85.2.59

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Choice Reaction Time Task, there should be 3 columns of data with the labels “subjID”, “choice”, “RT”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Choice made for the current trial, coded as 1/2 to indicate lower/upper boundary or left/right choices (e.g., 1 1 1 2 1 2).

  • “RT”: Choice reaction time for the current trial, in seconds (e.g., 0.435 0.383 0.314 0.309, etc.).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
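Since hbayesdm expects a tab-delimited text file when loading from disk, a minimal data file (with invented values) can be written and inspected with pandas as a sketch; the hypothetical filename crt_example.txt is an assumption here:

```python
import pandas as pd

# Write a tiny tab-delimited file in the expected layout.
rows = ("subjID\tchoice\tRT\n"
        "s01\t1\t0.435\n"
        "s01\t2\t0.383\n")
with open("crt_example.txt", "w") as f:
    f.write(rows)

# Tab-delimited, so sep='\t' when reading it back for inspection.
df = pd.read_csv("crt_example.txt", sep="\t")
assert list(df.columns) == ["subjID", "choice", "RT"]

# The file path itself can then be passed, e.g.:
# output = choiceRT_ddm(data="crt_example.txt")
```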

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “RT”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – (Currently not available.) Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • RTbound: Floating point value representing the lower bound (i.e., minimum allowed) reaction time. Defaults to 0.1 (100 milliseconds).
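Because reaction times at or below RTbound cannot be accommodated by the model, it can help to screen the data before fitting. The check below is a hypothetical pre-fit sanity check with invented RT values, not part of hbayesdm itself:

```python
import pandas as pd

RTbound = 0.1  # default lower bound, in seconds

# Made-up trials; the second RT falls below the bound.
df = pd.DataFrame({
    "subjID": ["s01"] * 4,
    "choice": [1, 2, 1, 2],
    "RT": [0.435, 0.072, 0.314, 0.309],
})

too_fast = df["RT"] <= RTbound
print(int(too_fast.sum()))  # prints 1

# One option is to inspect or drop such trials before fitting.
clean = df[~too_fast]
```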

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘choiceRT_ddm’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import choiceRT_ddm

# Run the model and store results in "output"
output = choiceRT_ddm(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.choiceRT_ddm_single(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Choice Reaction Time Task - Drift Diffusion Model

Individual Bayesian Modeling of the Choice Reaction Time Task using Drift Diffusion Model [Ratcliff1978] with the following parameters: “alpha” (boundary separation), “beta” (bias), “delta” (drift rate), “tau” (non-decision time).

Note

Note that this implementation is NOT the full Drift Diffusion Model as described in Ratcliff (1978). This implementation estimates the drift rate, boundary separation, starting point, and non-decision time; but not the between- and within-trial variances in these parameters.

Note

Code for this model is based on code and comments by Guido Biele, Joseph Burling, Andrew Ellis, and potentially others on the Stan mailing list.

Ratcliff1978

Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85(2), 59-108. https://doi.org/10.1037/0033-295X.85.2.59

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Choice Reaction Time Task, there should be 3 columns of data with the labels “subjID”, “choice”, “RT”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Choice made for the current trial, coded as 1/2 to indicate lower/upper boundary or left/right choices (e.g., 1 1 1 2 1 2).

  • “RT”: Choice reaction time for the current trial, in seconds (e.g., 0.435 0.383 0.314 0.309, etc.).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “RT”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – (Currently not available.) Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • RTbound: Floating point value representing the lower bound (i.e., minimum allowed) reaction time. Defaults to 0.1 (100 milliseconds).

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘choiceRT_ddm_single’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import choiceRT_ddm_single

# Run the model and store results in "output"
output = choiceRT_ddm_single(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.cra_exp(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Choice Under Risk and Ambiguity Task - Exponential Subjective Value Model

Hierarchical Bayesian Modeling of the Choice Under Risk and Ambiguity Task using Exponential Subjective Value Model [Hsu2005] with the following parameters: “alpha” (risk attitude), “beta” (ambiguity attitude), “gamma” (inverse temperature).

Hsu2005

Hsu, M., Bhatt, M., Adolphs, R., Tranel, D., & Camerer, C. F. (2005). Neural systems responding to degrees of uncertainty in human decision-making. Science, 310(5754), 1680-1683. https://doi.org/10.1126/science.1115327

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Choice Under Risk and Ambiguity Task, there should be 6 columns of data with the labels “subjID”, “prob”, “ambig”, “reward_var”, “reward_fix”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “prob”: Objective probability of the variable lottery.

  • “ambig”: Ambiguity level of the variable lottery (0 for risky lottery; greater than 0 for ambiguous lottery).

  • “reward_var”: Amount of reward in variable lottery. Assumed to be greater than zero.

  • “reward_fix”: Amount of reward in fixed lottery. Assumed to be greater than zero.

  • “choice”: If the variable lottery was selected, choice == 1; otherwise choice == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
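As a sketch of how the exponential subjective-value form combines these columns (the helper name sv_exp and the exact parameterization are assumptions here; consult the model's Stan code for the definitive form), ambiguity is commonly modeled as an exponential distortion of the stated probability:

```python
def sv_exp(prob, ambig, reward, alpha, beta):
    """Sketch of an exponential subjective value, in the spirit of
    Hsu et al. (2005): ambiguity (beta) distorts the probability
    exponentially, and alpha curves the reward magnitude."""
    return (prob ** (1.0 + beta * ambig)) * (reward ** alpha)

# With no ambiguity and neutral parameters this reduces to expected value:
assert sv_exp(0.5, 0.0, 10.0, 1.0, 1.0) == 5.0

# An ambiguity-averse agent (beta > 0) values an ambiguous lottery less:
assert sv_exp(0.5, 0.5, 10.0, 1.0, 1.0) < 5.0
```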

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “prob”, “ambig”, “reward_var”, “reward_fix”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “sv”, “sv_fix”, “sv_var”, “p_var”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘cra_exp’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import cra_exp

# Run the model and store results in "output"
output = cra_exp(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.cra_linear(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Choice Under Risk and Ambiguity Task - Linear Subjective Value Model

Hierarchical Bayesian Modeling of the Choice Under Risk and Ambiguity Task using Linear Subjective Value Model [Levy2010] with the following parameters: “alpha” (risk attitude), “beta” (ambiguity attitude), “gamma” (inverse temperature).

Levy2010

Levy, I., Snell, J., Nelson, A. J., Rustichini, A., & Glimcher, P. W. (2010). Neural representation of subjective value under risk and ambiguity. Journal of Neurophysiology, 103(2), 1036-1047.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Choice Under Risk and Ambiguity Task, there should be 6 columns of data with the labels “subjID”, “prob”, “ambig”, “reward_var”, “reward_fix”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “prob”: Objective probability of the variable lottery.

  • “ambig”: Ambiguity level of the variable lottery (0 for risky lottery; greater than 0 for ambiguous lottery).

  • “reward_var”: Amount of reward in variable lottery. Assumed to be greater than zero.

  • “reward_fix”: Amount of reward in fixed lottery. Assumed to be greater than zero.

  • “choice”: If the variable lottery was selected, choice == 1; otherwise choice == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
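As a sketch of the linear subjective-value counterpart (the helper name sv_linear and the exact parameterization are assumptions here; consult the model's Stan code for the definitive form), ambiguity is treated as a linear penalty on the stated probability rather than an exponential distortion:

```python
def sv_linear(prob, ambig, reward, alpha, beta):
    """Sketch of a linear subjective value, in the spirit of
    Levy et al. (2010): ambiguity (beta) subtracts linearly from the
    probability, and alpha curves the reward magnitude."""
    return (prob - beta * ambig / 2.0) * (reward ** alpha)

# With no ambiguity and neutral parameters this reduces to expected value:
assert sv_linear(0.5, 0.0, 10.0, 1.0, 1.0) == 5.0

# An ambiguity-averse agent (beta > 0) values an ambiguous lottery less:
assert sv_linear(0.5, 0.5, 10.0, 1.0, 1.0) < 5.0
```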

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “prob”, “ambig”, “reward_var”, “reward_fix”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored upon the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “sv”, “sv_fix”, “sv_var”, “p_var”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth evaluated on each iteration, which caps the number of leapfrog steps the MCMC sampler can take (up to roughly 2^max_treedepth per iteration). See note below.

  • **additional_args – Not used for this model.
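The thinning behavior described for nthin above can be sketched in a few lines. The draws below are hypothetical stand-ins for posterior samples; Stan performs the equivalent subsetting internally.

```python
# Illustration of how thinning (nthin) subsamples an MCMC chain.
draws = list(range(20))   # 20 post-warmup samples from one hypothetical chain
nthin = 4
kept = draws[::nthin]     # keep only every nthin-th sample
print(kept)               # [0, 4, 8, 12, 16]
```

With nthin=1 (the default), every post-warmup sample is kept; higher values reduce auto-correlation between stored samples at the cost of fewer samples overall.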

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘cra_linear’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import cra_linear

# Run the model and store results in "output"
output = cra_linear(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dbdm_prob_weight(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Description Based Decision Making Task - Probability Weight Function

Hierarchical Bayesian Modeling of the Description Based Decision Making Task using Probability Weight Function [Erev2010], [Hertwig2004], [Jessup2008] with the following parameters: “tau” (probability weight function), “rho” (subject utility function), “lambda” (loss aversion parameter), “beta” (inverse softmax temperature).

Erev2010

Erev, I., Ert, E., Roth, A. E., Haruvy, E., Herzog, S. M., Hau, R., … & Lebiere, C. (2010). A choice prediction competition: Choices from experience and from description. Journal of Behavioral Decision Making, 23(1), 15-47.

Hertwig2004

Hertwig, R., Barron, G., Weber, E. U., & Erev, I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychological science, 15(8), 534-539.

Jessup2008

Jessup, R. K., Bishara, A. J., & Busemeyer, J. R. (2008). Feedback produces divergence from prospect theory in descriptive choice. Psychological Science, 19(10), 1015-1022.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Description Based Decision Making Task, there should be 8 columns of data with the labels “subjID”, “opt1hprob”, “opt2hprob”, “opt1hval”, “opt1lval”, “opt2hval”, “opt2lval”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “opt1hprob”: Probability of receiving the higher outcome of option 1 (opt1hval) when option 1 is chosen.

  • “opt2hprob”: Probability of receiving the higher outcome of option 2 (opt2hval) when option 2 is chosen.

  • “opt1hval”: The higher possible outcome of option 1 (occurs with probability opt1hprob).

  • “opt1lval”: The lower possible outcome of option 1 (occurs with probability 1 - opt1hprob).

  • “opt2hval”: The higher possible outcome of option 2 (occurs with probability opt2hprob).

  • “opt2lval”: The lower possible outcome of option 2 (occurs with probability 1 - opt2hprob).

  • “choice”: If option 1 was selected, choice == 1; else if option 2 was selected, choice == 2.
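A minimal sketch of assembling a tab-delimited datafile with the required columns, using only the standard library. The subject ID and trial values below are made up for illustration; only the column labels are prescribed by the task.

```python
import csv
import io

# Required column labels for dbdm_prob_weight
columns = ["subjID", "opt1hprob", "opt2hprob", "opt1hval",
           "opt1lval", "opt2hval", "opt2lval", "choice"]

# Two made-up trials for one hypothetical subject
rows = [
    ["s01", 0.8, 0.2, 10, 0, 30, -5, 1],
    ["s01", 0.5, 0.5, 20, 5, 15, 10, 2],
]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t")
writer.writerow(columns)
writer.writerows(rows)
tsv_text = buf.getvalue()

# In practice, write tsv_text to a .txt file and pass its path as `data=`
print(tsv_text.splitlines()[0])
```

The same layout applies when building the dataset as a Pandas DataFrame instead of a file: one row per trial, one column per label above.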

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “opt1hprob”, “opt2hprob”, “opt1hval”, “opt1lval”, “opt2hval”, “opt2lval”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth evaluated on each iteration, which caps the number of leapfrog steps the MCMC sampler can take (up to roughly 2^max_treedepth per iteration). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dbdm_prob_weight’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dbdm_prob_weight

# Run the model and store results in "output"
output = dbdm_prob_weight(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dd_cs(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Delay Discounting Task - Constant-Sensitivity (CS) Model

Hierarchical Bayesian Modeling of the Delay Discounting Task using Constant-Sensitivity (CS) Model [Ebert2007] with the following parameters: “r” (exponential discounting rate), “s” (impatience), “beta” (inverse temperature).

Ebert2007

Ebert, J. E. J., & Prelec, D. (2007). The Fragility of Time: Time-Insensitivity and Valuation of the Near and Far Future. Management Science. https://doi.org/10.1287/mnsc.1060.0671

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Delay Discounting Task, there should be 6 columns of data with the labels “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “delay_later”: An integer representing the delay, in days, of the later option (e.g. 1, 6, 28).

  • “amount_later”: A floating point number representing the amount for the later option (e.g. 10.5, 13.4, 30.9).

  • “delay_sooner”: An integer representing the delay, in days, of the sooner option (e.g. 0).

  • “amount_sooner”: A floating point number representing the amount for the sooner option (e.g. 10).

  • “choice”: If amount_later was selected, choice == 1; else if amount_sooner was selected, choice == 0.
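The constant-sensitivity model can be sketched directly from these columns. Assuming the discount form of [Ebert2007], a delayed amount A at delay D is valued as A * exp(-(r*D)^s), and a softmax with inverse temperature beta maps the two discounted values to a choice probability. The parameter and trial values below are made up for illustration; this is a sketch of the model's likelihood, not the hbayesdm/Stan implementation.

```python
import math

def cs_value(amount, delay, r, s):
    # Constant-sensitivity discounting: V = A * exp(-(r * D)^s)
    return amount * math.exp(-((r * delay) ** s))

def p_choose_later(amount_later, delay_later, amount_sooner, delay_sooner,
                   r, s, beta):
    v_later = cs_value(amount_later, delay_later, r, s)
    v_sooner = cs_value(amount_sooner, delay_sooner, r, s)
    # Logistic (softmax) choice rule with inverse temperature beta
    return 1.0 / (1.0 + math.exp(-beta * (v_later - v_sooner)))

# Made-up trial: $30.9 in 28 days vs. $10 now, with made-up parameters
p = p_choose_later(30.9, 28, 10.0, 0, r=0.05, s=0.8, beta=1.0)
print(round(p, 3))
```

Lower discounting rates r make the later option more attractive, pushing this probability toward 1.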

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth evaluated on each iteration, which caps the number of leapfrog steps the MCMC sampler can take (up to roughly 2^max_treedepth per iteration). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dd_cs’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dd_cs

# Run the model and store results in "output"
output = dd_cs(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dd_cs_single(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Delay Discounting Task - Constant-Sensitivity (CS) Model

Individual Bayesian Modeling of the Delay Discounting Task using Constant-Sensitivity (CS) Model [Ebert2007] with the following parameters: “r” (exponential discounting rate), “s” (impatience), “beta” (inverse temperature).

Ebert2007

Ebert, J. E. J., & Prelec, D. (2007). The Fragility of Time: Time-Insensitivity and Valuation of the Near and Far Future. Management Science. https://doi.org/10.1287/mnsc.1060.0671

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Delay Discounting Task, there should be 6 columns of data with the labels “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “delay_later”: An integer representing the delay, in days, of the later option (e.g. 1, 6, 28).

  • “amount_later”: A floating point number representing the amount for the later option (e.g. 10.5, 13.4, 30.9).

  • “delay_sooner”: An integer representing the delay, in days, of the sooner option (e.g. 0).

  • “amount_sooner”: A floating point number representing the amount for the sooner option (e.g. 10).

  • “choice”: If amount_later was selected, choice == 1; else if amount_sooner was selected, choice == 0.
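Because this is an individual (non-hierarchical) model, it can be useful to check which subjects a datafile contains, or to pull out one subject's trials, before fitting. A standard-library sketch; the inline rows and subject IDs are made up for illustration.

```python
import csv
import io

# Stand-in for the contents of a tab-delimited datafile
tsv = (
    "subjID\tdelay_later\tamount_later\tdelay_sooner\tamount_sooner\tchoice\n"
    "s01\t6\t13.4\t0\t10\t1\n"
    "s02\t28\t30.9\t0\t10\t0\n"
)

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))

# Select one subject's trials (hypothetical subject ID)
target = "s01"
subject_rows = [r for r in rows if r["subjID"] == target]
print(len(subject_rows))  # 1
```

In practice the same filtering is one line with a Pandas DataFrame, e.g. selecting the rows whose subjID column matches the target.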

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth evaluated on each iteration, which caps the number of leapfrog steps the MCMC sampler can take (up to roughly 2^max_treedepth per iteration). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dd_cs_single’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dd_cs_single

# Run the model and store results in "output"
output = dd_cs_single(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dd_exp(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Delay Discounting Task - Exponential Model

Hierarchical Bayesian Modeling of the Delay Discounting Task using Exponential Model [Samuelson1937] with the following parameters: “r” (exponential discounting rate), “beta” (inverse temperature).

Samuelson1937

Samuelson, P. A. (1937). A Note on Measurement of Utility. The Review of Economic Studies, 4(2), 155. https://doi.org/10.2307/2967612

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Delay Discounting Task, there should be 6 columns of data with the labels “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “delay_later”: An integer representing the delay, in days, of the later option (e.g. 1, 6, 28).

  • “amount_later”: A floating point number representing the amount for the later option (e.g. 10.5, 13.4, 30.9).

  • “delay_sooner”: An integer representing the delay, in days, of the sooner option (e.g. 0).

  • “amount_sooner”: A floating point number representing the amount for the sooner option (e.g. 10).

  • “choice”: If amount_later was selected, choice == 1; else if amount_sooner was selected, choice == 0.
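A defining property of the exponential model [Samuelson1937] is a constant per-period discount factor: discounting a delayed amount as V = A * exp(-r * D) means that adding one day of delay always multiplies value by exp(-r), regardless of how long the existing delay already is (unlike hyperbolic discounting). A small numerical check, with an arbitrarily chosen r:

```python
import math

def exp_value(amount, delay, r):
    # Exponential discounting: V = A * exp(-r * D)
    return amount * math.exp(-r * delay)

r = 0.1  # made-up discounting rate

# One extra day of delay, starting from several different delays
ratios = [exp_value(100, d + 1, r) / exp_value(100, d, r) for d in (0, 6, 28)]

# Every ratio equals exp(-r): the per-day discount factor is constant
print(all(abs(x - math.exp(-r)) < 1e-12 for x in ratios))
```

Under a hyperbolic model (e.g. dd_hyperbolic below), the analogous ratio would instead shrink toward 1 as the existing delay grows.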

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth evaluated on each iteration, which caps the number of leapfrog steps the MCMC sampler can take (up to roughly 2^max_treedepth per iteration). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dd_exp’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dd_exp

# Run the model and store results in "output"
output = dd_exp(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dd_hyperbolic(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Delay Discounting Task - Hyperbolic Model

Hierarchical Bayesian Modeling of the Delay Discounting Task using Hyperbolic Model [Mazur1987] with the following parameters: “k” (discounting rate), “beta” (inverse temperature).

Mazur1987

Mazur, J. E. (1987). An adjustment procedure for studying delayed reinforcement.
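For orientation, the hyperbolic form referred to here is the standard Mazur (1987) rule: a delayed amount A at delay D is valued at V = A / (1 + k*D), and a softmax with inverse temperature beta maps the value difference to choice probabilities. A minimal Python sketch of that generative story (an illustration, not hbayesdm's Stan code):

```python
import math

def discounted_value(amount, delay, k):
    """Mazur (1987) hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def p_choose_later(amount_later, delay_later, amount_sooner, delay_sooner,
                   k, beta):
    """Softmax probability of choosing the later option."""
    v_later = discounted_value(amount_later, delay_later, k)
    v_sooner = discounted_value(amount_sooner, delay_sooner, k)
    return 1.0 / (1.0 + math.exp(-beta * (v_later - v_sooner)))

# With k = 0.1, waiting 28 days shrinks 30.9 to 30.9 / 3.8 (about 8.13),
# so an immediate 10 is preferred despite its smaller face value.
```

A larger k therefore means steeper discounting, and a larger beta means choices track the discounted values more deterministically.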

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Delay Discounting Task, there should be 6 columns of data with the labels “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “delay_later”: An integer representing the delay of the later option, in days (e.g. 1, 6, 28).

  • “amount_later”: A floating point number representing the amount for the later option (e.g. 10.5, 13.4, 30.9).

  • “delay_sooner”: An integer representing the delay of the sooner option, in days (e.g. 0).

  • “amount_sooner”: A floating point number representing the amount for the sooner option (e.g. 10).

  • “choice”: If amount_later was selected, choice == 1; else if amount_sooner was selected, choice == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
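Because only the labeled columns matter, a quick pre-flight check of the datafile can catch labeling mistakes before a long sampling run. A stdlib-only sketch (the column names come from the list above; check_dd_columns is a hypothetical helper, not part of hbayesdm):

```python
import csv
import io

REQUIRED = ["subjID", "delay_later", "amount_later",
            "delay_sooner", "amount_sooner", "choice"]

def check_dd_columns(fieldnames):
    """Raise if any required Delay Discounting column is missing."""
    missing = [c for c in REQUIRED if c not in fieldnames]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

# A two-trial, one-subject data file, tab-delimited as required
# (an extra ReactionTime column is present and simply ignored):
datafile = io.StringIO(
    "subjID\tdelay_later\tamount_later\tdelay_sooner\tamount_sooner\tchoice\tReactionTime\n"
    "s01\t6\t13.4\t0\t10\t1\t0.52\n"
    "s01\t28\t30.9\t0\t10\t0\t0.61\n"
)
reader = csv.DictReader(datafile, delimiter="\t")
check_dd_columns(reader.fieldnames)
rows = list(reader)
```

In practice the same content would live in a tab-delimited .txt file whose path is passed as the data argument.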

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth that the NUTS sampler evaluates on each iteration, which caps the number of leapfrog steps at 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.
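Taken together, the sampling arguments above determine (approximately) how many posterior draws are retained: each chain keeps about (niter − nwarmup) / nthin samples, pooled across nchain chains. A quick sanity check (an illustrative helper, not an hbayesdm function):

```python
def retained_draws(niter=4000, nwarmup=1000, nchain=4, nthin=1):
    """Approximate posterior samples kept across all chains
    after discarding warm-up and applying thinning."""
    per_chain = (niter - nwarmup) // nthin
    return nchain * per_chain

retained_draws()                          # defaults: 12000 draws
retained_draws(niter=2000, nwarmup=1000)  # 4 chains x 1000 = 4000 draws
```

Raising nthin trades away draws for lower autocorrelation, so niter may need to grow with it to keep the retained sample size comparable.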

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dd_hyperbolic’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dd_hyperbolic

# Run the model and store results in "output"
output = dd_hyperbolic(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.dd_hyperbolic_single(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Delay Discounting Task - Hyperbolic Model

Individual Bayesian Modeling of the Delay Discounting Task using Hyperbolic Model [Mazur1987] with the following parameters: “k” (discounting rate), “beta” (inverse temperature).

Mazur1987

Mazur, J. E. (1987). An adjustment procedure for studying delayed reinforcement.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Delay Discounting Task, there should be 6 columns of data with the labels “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “delay_later”: An integer representing the delay of the later option, in days (e.g. 1, 6, 28).

  • “amount_later”: A floating point number representing the amount for the later option (e.g. 10.5, 13.4, 30.9).

  • “delay_sooner”: An integer representing the delay of the sooner option, in days (e.g. 0).

  • “amount_sooner”: A floating point number representing the amount for the sooner option (e.g. 10).

  • “choice”: If amount_later was selected, choice == 1; else if amount_sooner was selected, choice == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “delay_later”, “amount_later”, “delay_sooner”, “amount_sooner”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth that the NUTS sampler evaluates on each iteration, which caps the number of leapfrog steps at 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘dd_hyperbolic_single’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import dd_hyperbolic_single

# Run the model and store results in "output"
output = dd_hyperbolic_single(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.gng_m1(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Orthogonalized Go/Nogo Task - RW + noise

Hierarchical Bayesian Modeling of the Orthogonalized Go/Nogo Task using RW + noise [Guitart-Masip2012] with the following parameters: “xi” (noise), “ep” (learning rate), “rho” (effective size).

Guitart-Masip2012

Guitart-Masip, M., Huys, Q. J. M., Fuentemilla, L., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Go and no-go learning in reward and punishment: Interactions between affect and effect. Neuroimage, 62(1), 154-166. https://doi.org/10.1016/j.neuroimage.2012.04.024
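To make the parameter roles concrete: in the RW + noise account of Guitart-Masip et al. (2012), ep moves the Q-value toward each outcome scaled by rho, and xi mixes the resulting softmax policy with uniform noise. A simplified sketch of that update (an illustration, not the package's Stan implementation):

```python
import math

def rw_update(q, outcome, ep, rho):
    """Rescorla-Wagner delta rule: move Q toward the scaled outcome rho * r."""
    return q + ep * (rho * outcome - q)

def p_go(q_go, q_nogo, xi):
    """Softmax over the action weights, mixed with uniform noise xi
    (gng_m1 has no bias term, so the weights are just the Q-values)."""
    p = 1.0 / (1.0 + math.exp(-(q_go - q_nogo)))
    return (1.0 - xi) * p + xi / 2.0

# Rewarded go trials drive the go Q-value upward:
q = 0.0
for r in [1, 1, 0, 1]:
    q = rw_update(q, r, ep=0.2, rho=3.0)
```

Note how xi floors the policy away from determinism: at xi = 0.2, even a strongly preferred action is chosen with probability at most 0.9.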

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Orthogonalized Go/Nogo Task, there should be 4 columns of data with the labels “subjID”, “cue”, “keyPressed”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cue”: Nominal integer representing the cue shown for that trial: 1, 2, 3, or 4.

  • “keyPressed”: Binary value representing the subject’s response for that trial (where Press == 1; No press == 0).

  • “outcome”: Ternary value representing the outcome of that trial (where Positive feedback == 1; Neutral feedback == 0; Negative feedback == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
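Putting the column spec above together, a minimal well-formed datafile for this task looks like the following (stdlib-only sketch; in practice you would write data_txt to a tab-delimited .txt file and pass its path as data):

```python
import csv
import io

# One subject, four trials: cue in {1..4}, keyPressed in {0, 1},
# outcome in {-1, 0, 1}, tab-delimited as the loader expects.
rows = [
    {"subjID": "s01", "cue": 1, "keyPressed": 1, "outcome": 1},
    {"subjID": "s01", "cue": 2, "keyPressed": 0, "outcome": 0},
    {"subjID": "s01", "cue": 3, "keyPressed": 1, "outcome": -1},
    {"subjID": "s01", "cue": 4, "keyPressed": 0, "outcome": 1},
]
buf = io.StringIO()
writer = csv.DictWriter(buf,
                        fieldnames=["subjID", "cue", "keyPressed", "outcome"],
                        delimiter="\t")
writer.writeheader()
writer.writerows(rows)
data_txt = buf.getvalue()  # header line plus four trial rows
```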

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cue”, “keyPressed”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Qgo”, “Qnogo”, “Wgo”, “Wnogo”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth that the NUTS sampler evaluates on each iteration, which caps the number of leapfrog steps at 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘gng_m1’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import gng_m1

# Run the model and store results in "output"
output = gng_m1(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.gng_m2(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Orthogonalized Go/Nogo Task - RW + noise + bias

Hierarchical Bayesian Modeling of the Orthogonalized Go/Nogo Task using RW + noise + bias [Guitart-Masip2012] with the following parameters: “xi” (noise), “ep” (learning rate), “b” (action bias), “rho” (effective size).

Guitart-Masip2012

Guitart-Masip, M., Huys, Q. J. M., Fuentemilla, L., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Go and no-go learning in reward and punishment: Interactions between affect and effect. Neuroimage, 62(1), 154-166. https://doi.org/10.1016/j.neuroimage.2012.04.024

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Orthogonalized Go/Nogo Task, there should be 4 columns of data with the labels “subjID”, “cue”, “keyPressed”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cue”: Nominal integer representing the cue shown for that trial: 1, 2, 3, or 4.

  • “keyPressed”: Binary value representing the subject’s response for that trial (where Press == 1; No press == 0).

  • “outcome”: Ternary value representing the outcome of that trial (where Positive feedback == 1; Neutral feedback == 0; Negative feedback == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cue”, “keyPressed”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Qgo”, “Qnogo”, “Wgo”, “Wnogo”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth that the NUTS sampler evaluates on each iteration, which caps the number of leapfrog steps at 2^max_treedepth. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘gng_m2’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import gng_m2

# Run the model and store results in "output"
output = gng_m2(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.gng_m3(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Orthogonalized Go/Nogo Task - RW + noise + bias + pi

Hierarchical Bayesian Modeling of the Orthogonalized Go/Nogo Task using RW + noise + bias + pi [Guitart-Masip2012] with the following parameters: “xi” (noise), “ep” (learning rate), “b” (action bias), “pi” (Pavlovian bias), “rho” (effective size).

Guitart-Masip2012

Guitart-Masip, M., Huys, Q. J. M., Fuentemilla, L., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Go and no-go learning in reward and punishment: Interactions between affect and effect. Neuroimage, 62(1), 154-166. https://doi.org/10.1016/j.neuroimage.2012.04.024
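The Pavlovian bias is what distinguishes gng_m3 from the simpler variants: on the Guitart-Masip et al. (2012) account, the go action weight adds a constant action bias b and a Pavlovian term pi scaled by the learned stimulus value, roughly Wgo = Qgo + b + pi*V(s), while Wnogo = Qnogo. A sketch of that composition (an illustration; action_weights is a hypothetical helper, not hbayesdm code):

```python
def action_weights(q_go, q_nogo, b, pi, sv):
    """Go weight gains a static bias b plus a Pavlovian push pi * V(s);
    the nogo weight is just its Q-value."""
    w_go = q_go + b + pi * sv
    w_nogo = q_nogo
    return w_go, w_nogo

# A positive stimulus value (sv > 0) with pi > 0 invigorates responding:
action_weights(0.5, 0.5, b=1.0, pi=2.0, sv=1.0)  # → (3.5, 0.5)
```

Equal Q-values would otherwise give a 50/50 policy; the bias and Pavlovian terms tilt it toward go for appetitive stimuli and away from go for aversive ones.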

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Orthogonalized Go/Nogo Task, there should be 4 columns of data with the labels “subjID”, “cue”, “keyPressed”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cue”: Nominal integer representing the cue shown for that trial: 1, 2, 3, or 4.

  • “keyPressed”: Binary value representing the subject’s response for that trial (where Press == 1; No press == 0).

  • “outcome”: Ternary value representing the outcome of that trial (where Positive feedback == 1; Neutral feedback == 0; Negative feedback == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cue”, “keyPressed”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Qgo”, “Qnogo”, “Wgo”, “Wnogo”, “SV”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
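The interplay of niter, nwarmup, nchain, and nthin above determines how many posterior draws are retained. A minimal sketch of that arithmetic (a back-of-the-envelope check, not part of hbayesdm itself):

```python
def n_posterior_samples(niter, nwarmup, nchain, nthin):
    """Total draws retained across all chains after warm-up and thinning."""
    per_chain = (niter - nwarmup) // nthin
    return per_chain * nchain

# With the defaults (niter=4000, nwarmup=1000, nchain=4, nthin=1):
print(n_posterior_samples(4000, 1000, 4, 1))   # 12000 draws
# Thinning to every 3rd sample cuts this to a third:
print(n_posterior_samples(4000, 1000, 4, 3))   # 4000 draws
```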

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘gng_m3’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.
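The components above are accessed as plain attributes on the returned object. The sketch below uses toy values standing in for real posterior draws, so it runs without a fitted model; the summarization step mirrors what ind_pars='mean' reports:

```python
from collections import OrderedDict

# par_vals maps parameter names to posterior draws; toy values here.
par_vals = OrderedDict(ep=[0.18, 0.21, 0.19, 0.22])

# Summarizing a parameter by its posterior mean, as ind_pars='mean' would:
posterior_mean = sum(par_vals['ep']) / len(par_vals['ep'])
print(round(posterior_mean, 3))   # 0.2
```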

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import gng_m3

# Run the model and store results in "output"
output = gng_m3(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
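A common follow-up to the example above is exporting the per-subject parameter summaries. Since all_ind_pars is a Pandas DataFrame, standard pandas I/O applies; it is mocked below with toy values so the snippet runs without a fit:

```python
import pandas as pd

# Toy stand-in for output.all_ind_pars (real values come from the fit).
all_ind_pars = pd.DataFrame(
    {'xi': [0.03, 0.05], 'ep': [0.21, 0.18], 'rho': [4.2, 5.1]},
    index=['subj01', 'subj02'],
)
all_ind_pars.to_csv('gng_m3_ind_pars.tsv', sep='\t')  # tab-delimited output
print(all_ind_pars.shape)   # (2, 3)
```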
hbayesdm.models.gng_m4(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Orthogonalized Go/Nogo Task - RW (rew/pun) + noise + bias + pi

Hierarchical Bayesian Modeling of the Orthogonalized Go/Nogo Task using RW (rew/pun) + noise + bias + pi [Cavanagh2013] with the following parameters: “xi” (noise), “ep” (learning rate), “b” (action bias), “pi” (Pavlovian bias), “rhoRew” (reward sensitivity), “rhoPun” (punishment sensitivity).

Cavanagh2013

Cavanagh, J. F., Eisenberg, I., Guitart-Masip, M., Huys, Q., & Frank, M. J. (2013). Frontal Theta Overrides Pavlovian Learning Biases. Journal of Neuroscience, 33(19), 8541-8548. https://doi.org/10.1523/JNEUROSCI.5754-12.2013

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Orthogonalized Go/Nogo Task, there should be 4 columns of data with the labels “subjID”, “cue”, “keyPressed”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cue”: Nominal integer representing the cue shown for that trial: 1, 2, 3, or 4.

  • “keyPressed”: Binary value representing the subject’s response for that trial (where Press == 1; No press == 0).

  • “outcome”: Ternary value representing the outcome of that trial (where Positive feedback == 1; Neutral feedback == 0; Negative feedback == -1).
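To pass a DataFrame directly rather than a file, build it with exactly these column labels. A minimal two-trial sketch (values made up for illustration; real data would have many trials per subject):

```python
import pandas as pd

df = pd.DataFrame({
    'subjID':     ['s01', 's01'],
    'cue':        [1, 3],     # nominal cue: 1, 2, 3, or 4
    'keyPressed': [1, 0],     # press == 1, no press == 0
    'outcome':    [1, -1],    # +1 / 0 / -1 feedback
})

required = {'subjID', 'cue', 'keyPressed', 'outcome'}
assert required.issubset(df.columns)  # extra columns would simply be ignored
# output = gng_m4(data=df)            # not run here
```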

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cue”, “keyPressed”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Qgo”, “Qnogo”, “Wgo”, “Wnogo”, “SV”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
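If you prefer to seed the sampler yourself, inits can be given as a list of initial values instead of a string. A sketch, assuming the list follows gng_m4's parameter order as listed above (xi, ep, b, pi, rhoRew, rhoPun):

```python
# One starting value per group-level parameter (order is an assumption here).
custom_inits = [0.10, 0.20, 0.00, 0.00, 2.00, 2.00]

# output = gng_m4(data='example', inits=custom_inits)  # not run here
print(len(custom_inits))   # 6, matching the six gng_m4 parameters
```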

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘gng_m4’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import gng_m4

# Run the model and store results in "output"
output = gng_m4(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.igt_orl(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Iowa Gambling Task - Outcome-Representation Learning Model

Hierarchical Bayesian Modeling of the Iowa Gambling Task [Ahn2008] using Outcome-Representation Learning Model [Haines2018] with the following parameters: “Arew” (reward learning rate), “Apun” (punishment learning rate), “K” (perseverance decay), “betaF” (outcome frequency weight), “betaP” (perseverance weight).

Ahn2008

Ahn, W. Y., Busemeyer, J. R., & Wagenmakers, E. J. (2008). Comparison of decision learning models using the generalization criterion method. Cognitive Science, 32(8), 1376-1402. https://doi.org/10.1080/03640210802352992

Haines2018

Haines, N., Vassileva, J., & Ahn, W.-Y. (2018). The Outcome-Representation Learning Model: A Novel Reinforcement Learning Model of the Iowa Gambling Task. Cognitive Science. https://doi.org/10.1111/cogs.12688

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Iowa Gambling Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer indicating which deck was chosen on that trial (where A==1, B==2, C==3, and D==4).

  • “gain”: Floating point value representing the amount of currency won on that trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on that trial (e.g. 0, -50).
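On disk, the tab-delimited layout described above looks like the raw string below; this round-trip sketch uses an in-memory buffer instead of an actual file:

```python
import io
import pandas as pd

raw = (
    "subjID\tchoice\tgain\tloss\n"
    "s01\t2\t100.0\t0.0\n"
    "s01\t1\t50.0\t-50.0\n"
)
df = pd.read_csv(io.StringIO(raw), sep='\t')
print(list(df.columns))   # ['subjID', 'choice', 'gain', 'loss']
# output = igt_orl(data=df)  # or pass a filepath to the .tsv directly
```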

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, the following model-specific argument can be set to a preferred value:

    • payscale: Raw payoffs within data are divided by this number. Used for scaling data. Defaults to 100.
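Conceptually, payscale rescales the raw payoffs before fitting; the division below is a sketch of that scaling step, not hbayesdm internals:

```python
gains = [100.0, 50.0, 0.0]
payscale = 100  # the default
scaled = [g / payscale for g in gains]
print(scaled)   # [1.0, 0.5, 0.0]

# payscale is passed through **additional_args:
# output = igt_orl(data='example', payscale=100)  # not run here
```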

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘igt_orl’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import igt_orl

# Run the model and store results in "output"
output = igt_orl(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.igt_pvl_decay(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Iowa Gambling Task - Prospect Valence Learning (PVL) Decay-RI

Hierarchical Bayesian Modeling of the Iowa Gambling Task [Ahn2008] using Prospect Valence Learning (PVL) Decay-RI [Ahn2014] with the following parameters: “A” (decay rate), “alpha” (outcome sensitivity), “cons” (response consistency), “lambda” (loss aversion).

Ahn2008

Ahn, W. Y., Busemeyer, J. R., & Wagenmakers, E. J. (2008). Comparison of decision learning models using the generalization criterion method. Cognitive Science, 32(8), 1376-1402. https://doi.org/10.1080/03640210802352992

Ahn2014

Ahn, W.-Y., Vasilev, G., Lee, S.-H., Busemeyer, J. R., Kruschke, J. K., Bechara, A., & Vassileva, J. (2014). Decision-making in stimulant and opiate addicts in protracted abstinence: evidence from computational modeling with pure users. Frontiers in Psychology, 5, 1376. https://doi.org/10.3389/fpsyg.2014.00849

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Iowa Gambling Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer indicating which deck was chosen on that trial (where A==1, B==2, C==3, and D==4).

  • “gain”: Floating point value representing the amount of currency won on that trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on that trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, the following model-specific argument can be set to a preferred value:

    • payscale: Raw payoffs within data are divided by this number. Used for scaling data. Defaults to 100.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘igt_pvl_decay’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import igt_pvl_decay

# Run the model and store results in "output"
output = igt_pvl_decay(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.igt_pvl_delta(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Iowa Gambling Task - Prospect Valence Learning (PVL) Delta

Hierarchical Bayesian Modeling of the Iowa Gambling Task [Ahn2008] using Prospect Valence Learning (PVL) Delta [Ahn2008] with the following parameters: “A” (learning rate), “alpha” (outcome sensitivity), “cons” (response consistency), “lambda” (loss aversion).

Ahn2008

Ahn, W. Y., Busemeyer, J. R., & Wagenmakers, E. J. (2008). Comparison of decision learning models using the generalization criterion method. Cognitive Science, 32(8), 1376-1402. https://doi.org/10.1080/03640210802352992

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Iowa Gambling Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer indicating which deck was chosen on that trial (where A==1, B==2, C==3, and D==4).

  • “gain”: Floating point value representing the amount of currency won on that trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on that trial (e.g. 0, -50).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, the following model-specific argument can be set to a preferred value:

    • payscale: Raw payoffs within data are divided by this number. Used for scaling data. Defaults to 100.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘igt_pvl_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import igt_pvl_delta

# Run the model and store results in "output"
output = igt_pvl_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.igt_vpp(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Iowa Gambling Task - Value-Plus-Perseverance

Hierarchical Bayesian Modeling of the Iowa Gambling Task [Ahn2008] using Value-Plus-Perseverance [Worthy2013] with the following parameters: “A” (learning rate), “alpha” (outcome sensitivity), “cons” (response consistency), “lambda” (loss aversion), “epP” (gain impact), “epN” (loss impact), “K” (decay rate), “w” (RL weight).

Ahn2008

Ahn, W. Y., Busemeyer, J. R., & Wagenmakers, E. J. (2008). Comparison of decision learning models using the generalization criterion method. Cognitive Science, 32(8), 1376-1402. https://doi.org/10.1080/03640210802352992

Worthy2013

Worthy, D. A., & Todd Maddox, W. (2013). A comparison model of reinforcement-learning and win-stay-lose-shift decision-making processes: A tribute to W.K. Estes. Journal of Mathematical Psychology, 59, 41-49. https://doi.org/10.1016/j.jmp.2013.10.001

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Iowa Gambling Task, there should be 4 columns of data with the labels “subjID”, “choice”, “gain”, “loss”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer indicating which deck was chosen on that trial (where A==1, B==2, C==3, and D==4).

  • “gain”: Floating point value representing the amount of currency won on that trial (e.g. 50, 100).

  • “loss”: Floating point value representing the amount of currency lost on that trial (e.g. 0, -50).
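As a minimal sketch of preparing such a tab-delimited file with only the standard library (the subject IDs, choices, and payoffs below are illustrative):

```python
import csv

# Write a minimal tab-delimited data file with the four columns igt_vpp expects.
rows = [
    {"subjID": "s01", "choice": 1, "gain": 100.0, "loss": 0.0},
    {"subjID": "s01", "choice": 3, "gain": 50.0,  "loss": -50.0},
    {"subjID": "s02", "choice": 2, "gain": 100.0, "loss": -250.0},
]

with open("igt_data.txt", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["subjID", "choice", "gain", "loss"],
                            delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)
```

The resulting file can then be passed to the model as data='igt_data.txt'.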

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “gain”, “loss”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.
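    As a back-of-the-envelope check, the number of posterior samples retained after warm-up and thinning can be computed directly (using the default settings shown in the signature):

```python
# Posterior samples retained per chain and in total, given the sampler settings.
niter, nwarmup, nthin, nchain = 4000, 1000, 1, 4

per_chain = (niter - nwarmup) // nthin
total = per_chain * nchain
print(per_chain, total)  # 3000 12000
```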

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth of the NUTS sampler, which bounds how many leapfrog steps (at most about 2^max_treedepth) the sampler can take on each new iteration. See note below.

  • **additional_args – For this model, the following model-specific argument can be set to a preferred value:

    • payscale: Raw payoffs in the data are divided by this number to rescale them. Defaults to 100.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘igt_vpp’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import igt_vpp

# Run the model and store results in "output"
output = igt_vpp(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.peer_ocu(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Peer Influence Task - Other-Conferred Utility (OCU) Model

Hierarchical Bayesian Modeling of the Peer Influence Task [Chung2015] using Other-Conferred Utility (OCU) Model with the following parameters: “rho” (risk preference), “tau” (inverse temperature), “ocu” (other-conferred utility).

Chung2015

Chung, D., Christopoulos, G. I., King-Casas, B., Ball, S. B., & Chiu, P. H. (2015). Social signals of safety and risk confer utility and have asymmetric effects on observers’ choices. Nature Neuroscience, 18(6), 912-916.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Peer Influence Task, there should be 8 columns of data with the labels “subjID”, “condition”, “p_gamble”, “safe_Hpayoff”, “safe_Lpayoff”, “risky_Hpayoff”, “risky_Lpayoff”, “choice”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “condition”: 0: solo, 1: info (safe/safe), 2: info (mix), 3: info (risky/risky).

  • “p_gamble”: Probability of receiving a high payoff (same for both options).

  • “safe_Hpayoff”: High payoff of the safe option.

  • “safe_Lpayoff”: Low payoff of the safe option.

  • “risky_Hpayoff”: High payoff of the risky option.

  • “risky_Lpayoff”: Low payoff of the risky option.

  • “choice”: Which option was chosen? 0: safe, 1: risky.
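Because eight columns are required, a quick header check before fitting can catch a mislabeled file early. A minimal sketch (the helper name is illustrative, not part of hbayesdm):

```python
# Check that a data file's header contains every column peer_ocu requires.
REQUIRED = {"subjID", "condition", "p_gamble", "safe_Hpayoff", "safe_Lpayoff",
            "risky_Hpayoff", "risky_Lpayoff", "choice"}

def missing_columns(header_line):
    """Return the required columns absent from a tab-delimited header line."""
    present = set(header_line.rstrip("\n").split("\t"))
    return sorted(REQUIRED - present)

print(missing_columns("subjID\tcondition\tp_gamble\tchoice"))
# ['risky_Hpayoff', 'risky_Lpayoff', 'safe_Hpayoff', 'safe_Lpayoff']
```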

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “condition”, “p_gamble”, “safe_Hpayoff”, “safe_Lpayoff”, “risky_Hpayoff”, “risky_Lpayoff”, “choice”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth of the NUTS sampler, which bounds how many leapfrog steps (at most about 2^max_treedepth) the sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘peer_ocu’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import peer_ocu

# Run the model and store results in "output"
output = peer_ocu(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_ewa(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Experience-Weighted Attraction Model

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Experience-Weighted Attraction Model [Ouden2013] with the following parameters: “phi” (1 - learning rate), “rho” (experience decay factor), “beta” (inverse temperature).

Ouden2013

den Ouden, H. E. M., Daw, N. D., Fernandez, G., Elshout, J. A., Rijpkema, M., Hoogman, M., et al. (2013). Dissociable Effects of Dopamine and Serotonin on Reversal Learning. Neuron, 80(4), 1090-1100. https://doi.org/10.1016/j.neuron.2013.08.030

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).
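Raw task logs sometimes code losses as 0 rather than -1; a one-line recode brings them into the expected 1/-1 coding (the assumption that 0 means loss applies to your logfile, not to hbayesdm itself):

```python
# Recode outcomes logged as 0/1 into the -1/1 coding the PRL models expect.
raw_outcomes = [1, 0, 0, 1, 1]
recoded = [1 if o == 1 else -1 for o in raw_outcomes]
print(recoded)  # [1, -1, -1, 1, 1]
```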

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “ew_c”, “ew_nc”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth of the NUTS sampler, which bounds how many leapfrog steps (at most about 2^max_treedepth) the sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_ewa’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_ewa

# Run the model and store results in "output"
output = prl_ewa(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_fictitious(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Fictitious Update Model

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Fictitious Update Model [Glascher2009] with the following parameters: “eta” (learning rate), “alpha” (indecision point), “beta” (inverse temperature).

Glascher2009

Glascher, J., Hampton, A. N., & O’Doherty, J. P. (2009). Determining a Role for Ventromedial Prefrontal Cortex in Encoding Action-Based Value Signals During Reward-Related Decision Making. Cerebral Cortex, 19(2), 483-495. https://doi.org/10.1093/cercor/bhn098

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; uses variational Bayes estimates), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe_c”, “pe_nc”, “dv”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Numerical value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum tree depth of the NUTS sampler, which bounds how many leapfrog steps (at most about 2^max_treedepth) the sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_fictitious’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_fictitious

# Run the model and store results in "output"
output = prl_fictitious(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_fictitious_multipleB(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Fictitious Update Model

Multiple-Block Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Fictitious Update Model [Glascher2009] with the following parameters: “eta” (learning rate), “alpha” (indecision point), “beta” (inverse temperature).

Glascher2009

Glascher, J., Hampton, A. N., & O’Doherty, J. P. (2009). Determining a Role for Ventromedial Prefrontal Cortex in Encoding Action-Based Value Signals During Reward-Related Decision Making. Cerebral Cortex, 19(2), 483-495. https://doi.org/10.1093/cercor/bhn098

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 4 columns of data with the labels “subjID”, “block”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “block”: A unique identifier for each of the multiple blocks within each subject.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).
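A quick way to sanity-check the block column is to count trials per (subjID, block) pair before fitting; the trial records below are illustrative:

```python
from collections import Counter

# Count trials per (subjID, block) pair to sanity-check the block column.
trials = [
    ("s01", 1), ("s01", 1), ("s01", 2),
    ("s02", 1), ("s02", 2), ("s02", 2),
]
counts = Counter(trials)
print(counts[("s01", 1)])  # 2
```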

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “block”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples from the beginning of each chain should be discarded. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default), 'fixed', or 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe_c”, “pe_nc”, “dv”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the sampler on each iteration; at most 2^max_treedepth leapfrog steps are taken per iteration. See note below.

  • **additional_args – Not used for this model.
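The sampling arguments above jointly determine how many posterior draws are kept. The following back-of-the-envelope check (a pure-Python sketch; hbayesdm reports the exact counts itself) makes the arithmetic explicit:

```python
def retained_draws(niter=4000, nwarmup=1000, nthin=1, nchain=4):
    """Approximate number of stored posterior draws.

    Each chain keeps (niter - nwarmup) post-warmup draws,
    and thinning keeps every nthin-th one of those.
    """
    per_chain = (niter - nwarmup) // nthin
    return per_chain * nchain

# Defaults: 3000 post-warmup draws per chain x 4 chains = 12000 draws.
print(retained_draws())          # 12000
# Heavier thinning trades stored draws for lower autocorrelation:
print(retained_draws(nthin=10))  # 1200
```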

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_fictitious_multipleB’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_fictitious_multipleB

# Run the model and store results in "output"
output = prl_fictitious_multipleB(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
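The Rhat check above compares between-chain and within-chain variance; values near 1 indicate that the chains agree on the posterior. A minimal (non-split) Gelman-Rubin sketch in pure Python, illustrative only (use the library's rhat() in practice):

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Basic (non-split) Rhat for equal-length chains of draws."""
    n = len(chains[0])
    means = [statistics.mean(c) for c in chains]
    variances = [statistics.variance(c) for c in chains]
    w = statistics.mean(variances)        # within-chain variance
    b = n * statistics.variance(means)    # between-chain variance
    var_hat = (n - 1) / n * w + b / n     # pooled variance estimate
    return math.sqrt(var_hat / w)

random.seed(0)
# Four well-mixed chains drawn from the same distribution:
chains = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
print(gelman_rubin(chains))  # close to 1.0
```

Chains that disagree on their target (e.g., stuck in different modes) inflate the between-chain term and push Rhat well above 1.1.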
hbayesdm.models.prl_fictitious_rp(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Fictitious Update Model, with separate learning rates for positive and negative prediction error (PE)

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Fictitious Update Model, with separate learning rates for positive and negative prediction error (PE) [Glascher2009], [Ouden2013] with the following parameters: “eta_pos” (learning rate, +PE), “eta_neg” (learning rate, -PE), “alpha” (indecision point), “beta” (inverse temperature).

Glascher2009

Glascher, J., Hampton, A. N., & O’Doherty, J. P. (2009). Determining a Role for Ventromedial Prefrontal Cortex in Encoding Action-Based Value Signals During Reward-Related Decision Making. Cerebral Cortex, 19(2), 483-495. https://doi.org/10.1093/cercor/bhn098

Ouden2013

Ouden, den, H. E. M., Daw, N. D., Fernandez, G., Elshout, J. A., Rijpkema, M., Hoogman, M., et al. (2013). Dissociable Effects of Dopamine and Serotonin on Reversal Learning. Neuron, 80(4), 1090-1100. https://doi.org/10.1016/j.neuron.2013.08.030
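The model description above can be made concrete with a sketch of one trial of the fictitious update rule (pure Python, illustrative rather than hbayesdm's actual Stan implementation; the variable names are our own):

```python
import math

def fictitious_update(ev, choice, outcome, eta_pos, eta_neg):
    """One trial of the fictitious update rule, with separate
    learning rates for positive and negative prediction errors.

    ev      : [ev_option1, ev_option2], expected values
    choice  : 0 or 1 (index of the chosen option)
    outcome : 1 (reward) or -1 (loss)
    """
    other = 1 - choice
    pe_c = outcome - ev[choice]    # PE for the chosen option
    pe_nc = -outcome - ev[other]   # fictitious PE for the unchosen option
    eta_c = eta_pos if pe_c >= 0 else eta_neg
    eta_nc = eta_pos if pe_nc >= 0 else eta_neg
    ev[choice] += eta_c * pe_c
    ev[other] += eta_nc * pe_nc
    return ev

def choice_prob(ev, beta, alpha=0.0):
    """Softmax probability of option 1, with indecision point alpha."""
    return 1 / (1 + math.exp(-beta * (ev[0] - ev[1] - alpha)))

ev = fictitious_update([0.0, 0.0], choice=0, outcome=1,
                       eta_pos=0.3, eta_neg=0.1)
print(ev)  # chosen value rises toward +1, unchosen falls toward -1
```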

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a strong influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default), 'fixed', or 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe_c”, “pe_nc”, “dv”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the sampler on each iteration; at most 2^max_treedepth leapfrog steps are taken per iteration. See note below.

  • **additional_args – Not used for this model.
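The ind_pars argument above reduces each subject's posterior draws to a single point estimate, and for skewed posteriors the choice of summary matters. A toy illustration (pure Python; the draws are made up and unrelated to hbayesdm internals):

```python
import statistics

# Hypothetical right-skewed posterior draws for one subject's learning rate:
draws = [0.10, 0.12, 0.13, 0.15, 0.18, 0.22, 0.30, 0.55]

summaries = {
    "mean": statistics.mean(draws),
    "median": statistics.median(draws),
}
# The mean is pulled upward by the long right tail,
# so mean and median can disagree noticeably.
print(summaries)
```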

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_fictitious_rp’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_fictitious_rp

# Run the model and store results in "output"
output = prl_fictitious_rp(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_fictitious_rp_woa(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Fictitious Update Model, with separate learning rates for positive and negative prediction error (PE), without alpha (indecision point)

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Fictitious Update Model, with separate learning rates for positive and negative prediction error (PE), without alpha (indecision point) [Glascher2009], [Ouden2013] with the following parameters: “eta_pos” (learning rate, +PE), “eta_neg” (learning rate, -PE), “beta” (inverse temperature).

Glascher2009

Glascher, J., Hampton, A. N., & O’Doherty, J. P. (2009). Determining a Role for Ventromedial Prefrontal Cortex in Encoding Action-Based Value Signals During Reward-Related Decision Making. Cerebral Cortex, 19(2), 483-495. https://doi.org/10.1093/cercor/bhn098

Ouden2013

Ouden, den, H. E. M., Daw, N. D., Fernandez, G., Elshout, J. A., Rijpkema, M., Hoogman, M., et al. (2013). Dissociable Effects of Dopamine and Serotonin on Reversal Learning. Neuron, 80(4), 1090-1100. https://doi.org/10.1016/j.neuron.2013.08.030

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a strong influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default), 'fixed', or 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe_c”, “pe_nc”, “dv”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the sampler on each iteration; at most 2^max_treedepth leapfrog steps are taken per iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_fictitious_rp_woa’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_fictitious_rp_woa

# Run the model and store results in "output"
output = prl_fictitious_rp_woa(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_fictitious_woa(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Fictitious Update Model, without alpha (indecision point)

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Fictitious Update Model, without alpha (indecision point) [Glascher2009] with the following parameters: “eta” (learning rate), “beta” (inverse temperature).

Glascher2009

Glascher, J., Hampton, A. N., & O’Doherty, J. P. (2009). Determining a Role for Ventromedial Prefrontal Cortex in Encoding Action-Based Value Signals During Reward-Related Decision Making. Cerebral Cortex, 19(2), 483-495. https://doi.org/10.1093/cercor/bhn098

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a strong influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default), 'fixed', or 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe_c”, “pe_nc”, “dv”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the sampler on each iteration; at most 2^max_treedepth leapfrog steps are taken per iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_fictitious_woa’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_fictitious_woa

# Run the model and store results in "output"
output = prl_fictitious_woa(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_rp(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Reward-Punishment Model

Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Reward-Punishment Model [Ouden2013] with the following parameters: “Apun” (punishment learning rate), “Arew” (reward learning rate), “beta” (inverse temperature).

Ouden2013

Ouden, den, H. E. M., Daw, N. D., Fernandez, G., Elshout, J. A., Rijpkema, M., Hoogman, M., et al. (2013). Dissociable Effects of Dopamine and Serotonin on Reversal Learning. Neuron, 80(4), 1090-1100. https://doi.org/10.1016/j.neuron.2013.08.030
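The Reward-Punishment model above applies a standard delta update but switches learning rate on the sign of the outcome. A minimal sketch of one trial (pure Python, illustrative; not hbayesdm's Stan code, and the variable names are our own):

```python
def rp_update(ev, choice, outcome, arew, apun):
    """One Rescorla-Wagner step with separate reward/punishment rates.

    ev      : [ev_option1, ev_option2], expected values
    choice  : 0 or 1 (index of the chosen option)
    outcome : 1 (reward) or -1 (punishment)
    """
    pe = outcome - ev[choice]            # prediction error
    lr = arew if outcome == 1 else apun  # Arew for reward, Apun for loss
    ev[choice] += lr * pe
    return ev, pe

ev, pe = rp_update([0.0, 0.0], choice=0, outcome=1, arew=0.4, apun=0.1)
print(ev, pe)  # reward is learned at the faster rate Arew
```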

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a strong influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (the default), 'fixed', or 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the binary tree built by the sampler on each iteration; at most 2^max_treedepth leapfrog steps are taken per iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_rp’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_rp

# Run the model and store results in "output"
output = prl_rp(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.prl_rp_multipleB(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Reversal Learning Task - Reward-Punishment Model

Multiple-Block Hierarchical Bayesian Modeling of the Probabilistic Reversal Learning Task using Reward-Punishment Model [Ouden2013] with the following parameters: “Apun” (punishment learning rate), “Arew” (reward learning rate), “beta” (inverse temperature).

Ouden2013

den Ouden, H. E. M., Daw, N. D., Fernandez, G., Elshout, J. A., Rijpkema, M., Hoogman, M., et al. (2013). Dissociable Effects of Dopamine and Serotonin on Reversal Learning. Neuron, 80(4), 1090-1100. https://doi.org/10.1016/j.neuron.2013.08.030

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Reversal Learning Task, there should be 4 columns of data with the labels “subjID”, “block”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “block”: A unique identifier for each of the multiple blocks within each subject.

  • “choice”: Integer value representing the option chosen on that trial: 1 or 2.

  • “outcome”: Integer value representing the outcome of that trial (where reward == 1, and loss == -1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
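As a hedged sketch of the expected input layout (the filename and trial values here are invented for illustration), a datafile in the tab-delimited format described above can be produced with Python's standard csv module:

```python
import csv

# Required column labels for prl_rp_multipleB (column order does not matter).
header = ["subjID", "block", "choice", "outcome"]

# Invented trials: two subjects, one block each; reward == 1, loss == -1.
trials = [
    ["s01", 1, 1, 1],
    ["s01", 1, 2, -1],
    ["s02", 1, 1, -1],
    ["s02", 1, 2, 1],
]

with open("prl_multipleB_data.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(header)
    writer.writerows(trials)
```

The resulting file path could then be passed as the data argument in place of "example".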

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “block”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “ev_c”, “ev_nc”, “pe”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Positive value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value limiting the tree depth of the NUTS sampler, which caps the number of leapfrog steps the sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.
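Together, niter, nwarmup, nthin, and nchain determine how many posterior draws are retained. A quick back-of-the-envelope check, using the default values above (plain arithmetic, not an hbayesdm call):

```python
niter, nwarmup, nthin, nchain = 4000, 1000, 1, 4  # the defaults above

# Each chain discards the warm-up draws, then keeps every nthin-th sample.
kept_per_chain = (niter - nwarmup) // nthin
total_kept = kept_per_chain * nchain

print(kept_per_chain, total_kept)  # 3000 draws per chain, 12000 in total
```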

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘prl_rp_multipleB’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import prl_rp_multipleB

# Run the model and store results in "output"
output = prl_rp_multipleB(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.pstRT_ddm(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Selection Task (with RT data) - Drift Diffusion Model

Hierarchical Bayesian Modeling of the Probabilistic Selection Task (with RT data) [Frank2007], [Frank2004] using Drift Diffusion Model [Pedersen2017] with the following parameters: “a” (boundary separation), “tau” (non-decision time), “d1” (drift rate scaling), “d2” (drift rate scaling), “d3” (drift rate scaling).

Frank2007

Frank, M. J., Santamaria, A., O’Reilly, R. C., & Willcutt, E. (2007). Testing computational models of dopamine and noradrenaline dysfunction in attention deficit/hyperactivity disorder. Neuropsychopharmacology, 32(7), 1583-1599.

Frank2004

Frank, M. J., Seeberger, L. C., & O’Reilly, R. C. (2004). By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science, 306(5703), 1940-1943.

Pedersen2017

Pedersen, M. L., Frank, M. J., & Biele, G. (2017). The drift diffusion model as the choice rule in reinforcement learning. Psychonomic bulletin & review, 24(4), 1234-1251.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Selection Task (with RT data), there should be 4 columns of data with the labels “subjID”, “cond”, “choice”, “RT”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cond”: Integer value representing the task condition of the given trial (AB == 1, CD == 2, EF == 3).

  • “choice”: Integer value representing the option chosen on the given trial (1 or 2).

  • “RT”: Float value representing the time taken for the response on the given trial.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
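A small pre-flight check along these lines can catch mislabeled columns before fitting. This helper is illustrative only (not part of hbayesdm), and the sample data is invented:

```python
import csv
import io

# Required column labels for pstRT_ddm, from the list above.
REQUIRED = {"subjID", "cond", "choice", "RT"}

def missing_columns(tsv_text: str) -> set:
    """Return the required column labels absent from a tab-delimited header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    return REQUIRED - set(header)

# Extra columns (e.g. trial_number) are harmless; missing ones are not.
sample = "subjID\tcond\tchoice\tRT\ttrial_number\n1\t1\t1\t0.42\t1\n"
print(missing_columns(sample))  # set()
```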

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cond”, “choice”, “RT”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Positive value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value limiting the tree depth of the NUTS sampler, which caps the number of leapfrog steps the sampler can take on each new iteration. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • RTbound: Floating point value representing the lower bound (i.e., minimum allowed) reaction time. Defaults to 0.1 (100 milliseconds).
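Trials faster than RTbound cannot be accommodated by the drift diffusion likelihood, so it can be worth screening for them before fitting. A minimal sketch with invented reaction times:

```python
RT_BOUND = 0.1  # default lower bound, in seconds (100 ms)

rts = [0.05, 0.31, 0.28, 0.09, 0.55]  # invented reaction times
too_fast = [rt for rt in rts if rt < RT_BOUND]
kept = [rt for rt in rts if rt >= RT_BOUND]

print(len(too_fast), len(kept))  # 2 trials below the bound, 3 retained
```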

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘pstRT_ddm’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import pstRT_ddm

# Run the model and store results in "output"
output = pstRT_ddm(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.pstRT_rlddm1(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Selection Task (with RT data) - Reinforcement Learning Drift Diffusion Model 1

Hierarchical Bayesian Modeling of the Probabilistic Selection Task (with RT data) [Frank2007], [Frank2004] using Reinforcement Learning Drift Diffusion Model 1 [Pedersen2017] with the following parameters: “a” (boundary separation), “tau” (non-decision time), “v” (drift rate scaling), “alpha” (learning rate).

Frank2007

Frank, M. J., Santamaria, A., O’Reilly, R. C., & Willcutt, E. (2007). Testing computational models of dopamine and noradrenaline dysfunction in attention deficit/hyperactivity disorder. Neuropsychopharmacology, 32(7), 1583-1599.

Frank2004

Frank, M. J., Seeberger, L. C., & O’Reilly, R. C. (2004). By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science, 306(5703), 1940-1943.

Pedersen2017

Pedersen, M. L., Frank, M. J., & Biele, G. (2017). The drift diffusion model as the choice rule in reinforcement learning. Psychonomic bulletin & review, 24(4), 1234-1251.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Selection Task (with RT data), there should be 6 columns of data with the labels “subjID”, “cond”, “prob”, “choice”, “RT”, “feedback”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “cond”: Integer value representing the task condition of the given trial (AB == 1, CD == 2, EF == 3).

  • “prob”: Float value representing the probability that a correct response (1) is rewarded in the current task condition.

  • “choice”: Integer value representing the option chosen on the given trial (1 or 2).

  • “RT”: Float value representing the time taken for the response on the given trial.

  • “feedback”: Integer value representing the outcome of the given trial (where ‘correct’ == 1, and ‘incorrect’ == 0).
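The “alpha” (learning rate) parameter drives a standard delta-rule value update inside the model. A stripped-down sketch of just that component (the full RLDDM also maps the value difference onto the drift rate, which is omitted here; the learning rate and the initial value of 0.5 are illustrative assumptions, cf. the initQ argument below):

```python
alpha = 0.3           # learning rate (illustrative value)
Q = {1: 0.5, 2: 0.5}  # assumed initial values for the two options

# One trial: option 1 chosen, feedback 'correct' == 1.
choice, feedback = 1, 1
pe = feedback - Q[choice]  # prediction error
Q[choice] += alpha * pe

print(Q[choice])  # 0.5 + 0.3 * 0.5 = 0.65
```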

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “cond”, “prob”, “choice”, “RT”, “feedback”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Q1”, “Q2”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Positive value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value limiting the tree depth of the NUTS sampler, which caps the number of leapfrog steps the sampler can take on each new iteration. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific arguments to values that you may prefer.

    • RTbound: Floating point value representing the lower bound (i.e., minimum allowed) reaction time. Defaults to 0.1 (100 milliseconds).

    • initQ: Floating point value representing the model’s initial Q-value for each choice option.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘pstRT_rlddm1’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import pstRT_rlddm1

# Run the model and store results in "output"
output = pstRT_rlddm1(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.pstRT_rlddm6(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Selection Task (with RT data) - Reinforcement Learning Drift Diffusion Model 6

Hierarchical Bayesian Modeling of the Probabilistic Selection Task (with RT data) [Frank2007], [Frank2004] using Reinforcement Learning Drift Diffusion Model 6 [Pedersen2017] with the following parameters: “a” (boundary separation), “bp” (boundary separation power), “tau” (non-decision time), “v” (drift rate scaling), “alpha_pos” (learning rate for positive prediction error), “alpha_neg” (learning rate for negative prediction error).

Frank2007

Frank, M. J., Santamaria, A., O’Reilly, R. C., & Willcutt, E. (2007). Testing computational models of dopamine and noradrenaline dysfunction in attention deficit/hyperactivity disorder. Neuropsychopharmacology, 32(7), 1583-1599.

Frank2004

Frank, M. J., Seeberger, L. C., & O’Reilly, R. C. (2004). By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science, 306(5703), 1940-1943.

Pedersen2017

Pedersen, M. L., Frank, M. J., & Biele, G. (2017). The drift diffusion model as the choice rule in reinforcement learning. Psychonomic bulletin & review, 24(4), 1234-1251.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Selection Task (with RT data), there should be 7 columns of data with the labels “subjID”, “iter”, “cond”, “prob”, “choice”, “RT”, “feedback”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “iter”: Integer value representing the trial number for each task condition.

  • “cond”: Integer value representing the task condition of the given trial (AB == 1, CD == 2, EF == 3).

  • “prob”: Float value representing the probability that a correct response (1) is rewarded in the current task condition.

  • “choice”: Integer value representing the option chosen on the given trial (1 or 2).

  • “RT”: Float value representing the time taken for the response on the given trial.

  • “feedback”: Integer value representing the outcome of the given trial (where ‘correct’ == 1, and ‘incorrect’ == 0).
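The two learning rates let positive and negative prediction errors update values at different speeds. A hedged sketch of that asymmetry alone (all numeric values are illustrative; the boundary-separation power and drift-rate machinery of the full model are omitted):

```python
alpha_pos, alpha_neg = 0.4, 0.1  # illustrative learning rates
Q = {1: 0.5, 2: 0.5}             # assumed initial values

def update(choice, feedback):
    """Delta-rule update with a PE-sign-dependent learning rate."""
    pe = feedback - Q[choice]
    lr = alpha_pos if pe > 0 else alpha_neg
    Q[choice] += lr * pe

update(1, 1)  # positive PE: fast update -> Q[1] = 0.70
update(2, 0)  # negative PE: slow update -> Q[2] = 0.45
print(Q)
```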

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “iter”, “cond”, “prob”, “choice”, “RT”, “feedback”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should not be stored at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. For this model they are: “Q1”, “Q2”.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Positive value specifying the size of each leapfrog step that the MCMC sampler takes on each new iteration. See note below.

  • max_treedepth – Integer value limiting the tree depth of the NUTS sampler, which caps the number of leapfrog steps the sampler can take on each new iteration. See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific arguments to values that you may prefer.

    • RTbound: Floating point value representing the lower bound (i.e., minimum allowed) reaction time. Defaults to 0.1 (100 milliseconds).

    • initQ: Floating point value representing the model’s initial Q-value for each choice option.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘pstRT_rlddm6’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

  • model_regressor: Dict holding the extracted model-based regressors.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import pstRT_rlddm6

# Run the model and store results in "output"
output = pstRT_rlddm6(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.pst_Q(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Selection Task - Q Learning Model

Hierarchical Bayesian Modeling of the Probabilistic Selection Task using Q Learning Model [Frank2007] with the following parameters: “alpha” (learning rate), “beta” (inverse temperature).

Frank2007

Frank, M. J., Moustafa, A. A., Haughey, H. M., Curran, T., & Hutchison, K. E. (2007). Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proceedings of the National Academy of Sciences, 104(41), 16311-16316.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Selection Task, there should be 4 columns of data with the labels “subjID”, “type”, “choice”, “reward”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “type”: Two-digit number indicating which pair of stimuli was presented on that trial, e.g. 12, 34, or 56. The tens digit codes the stimulus presented as option1, and the ones digit codes the stimulus presented as option2. Each stimulus type (1~6) is coded by its reward probability: 80% (type 1), 20% (type 2), 70% (type 3), 30% (type 4), 60% (type 5), 40% (type 6). The modeling will still work even if different probabilities are used for the stimuli; however, the total number of stimuli should be less than or equal to 6.

  • “choice”: Whether the subject chose the left option (option1) out of the given two options (i.e. if option1 was chosen, 1; if option2 was chosen, 0).

  • “reward”: Amount of reward earned as a result of the trial.
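The two-digit “type” codes above can be unpacked with simple integer arithmetic. A minimal sketch (the helper name decode_type is illustrative, not part of the hbayesdm API):

```python
def decode_type(pair_type):
    """Split a two-digit stimulus-pair code into (option1, option2)."""
    option1 = pair_type // 10   # tens digit: stimulus shown as option1
    option2 = pair_type % 10    # ones digit: stimulus shown as option2
    return option1, option2

# e.g. type 12 pairs the 80% stimulus (1) against the 20% stimulus (2)
print(decode_type(12))  # -> (1, 2)
print(decode_type(56))  # -> (5, 6)
```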

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “type”, “choice”, “reward”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; initial values are taken from a variational inference run), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the initial size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates during each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.
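As a worked example of how the sampling arguments above combine: each chain retains niter - nwarmup draws, thinned by nthin, so the defaults yield (4000 - 1000) × 4 = 12000 posterior samples. A small sketch (the helper name is illustrative, not part of hbayesdm):

```python
def n_posterior_samples(niter, nwarmup, nchain, nthin=1):
    """Total retained posterior draws across all chains."""
    return ((niter - nwarmup) // nthin) * nchain

print(n_posterior_samples(4000, 1000, 4))           # defaults -> 12000
print(n_posterior_samples(4000, 1000, 4, nthin=3))  # thinning keeps every 3rd draw -> 4000
```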

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘pst_Q’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import pst_Q

# Run the model and store results in "output"
output = pst_Q(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.pst_gainloss_Q(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Probabilistic Selection Task - Gain-Loss Q Learning Model

Hierarchical Bayesian Modeling of the Probabilistic Selection Task using Gain-Loss Q Learning Model [Frank2007] with the following parameters: “alpha_pos” (learning rate for positive feedback), “alpha_neg” (learning rate for negative feedback), “beta” (inverse temperature).

Frank2007

Frank, M. J., Moustafa, A. A., Haughey, H. M., Curran, T., & Hutchison, K. E. (2007). Genetic triple dissociation reveals multiple roles for dopamine in reinforcement learning. Proceedings of the National Academy of Sciences, 104(41), 16311-16316.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Probabilistic Selection Task, there should be 4 columns of data with the labels “subjID”, “type”, “choice”, “reward”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “type”: Two-digit number indicating which pair of stimuli was presented on that trial, e.g. 12, 34, or 56. The tens digit codes the stimulus presented as option1, and the ones digit codes the stimulus presented as option2. Each stimulus type (1~6) is coded by its reward probability: 80% (type 1), 20% (type 2), 70% (type 3), 30% (type 4), 60% (type 5), 40% (type 6). The modeling will still work even if different probabilities are used for the stimuli; however, the total number of stimuli should be less than or equal to 6.

  • “choice”: Whether the subject chose the left option (option1) out of the given two options (i.e. if option1 was chosen, 1; if option2 was chosen, 0).

  • “reward”: Amount of reward earned as a result of the trial.
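The model’s two learning rates enter a standard delta-rule update, with “alpha_pos” applied to positive prediction errors and “alpha_neg” to negative ones. A rough sketch of that update (illustrative only, not hbayesdm’s internal Stan code):

```python
def update_q(q, reward, alpha_pos, alpha_neg):
    """Gain-loss Q-learning update for the value of the chosen stimulus."""
    pe = reward - q                          # prediction error
    alpha = alpha_pos if pe >= 0 else alpha_neg
    return q + alpha * pe

q = 0.0
for r in [1, 1, 0, 1]:                       # rewarded, rewarded, unrewarded, rewarded
    q = update_q(q, r, alpha_pos=0.3, alpha_neg=0.1)
print(round(q, 3))  # -> 0.621
```

Because alpha_neg is smaller here, the unrewarded trial pulls the value down less than a rewarded trial pulls it up.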

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “type”, “choice”, “reward”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; initial values are taken from a variational inference run), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the initial size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates during each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘pst_gainloss_Q’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import pst_gainloss_Q

# Run the model and store results in "output"
output = pst_gainloss_Q(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ra_noLA(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Risk Aversion Task - Prospect Theory, without loss aversion (LA) parameter

Hierarchical Bayesian Modeling of the Risk Aversion Task using Prospect Theory, without loss aversion (LA) parameter [Sokol-Hessner2009] with the following parameters: “rho” (risk aversion), “tau” (inverse temperature).

Sokol-Hessner2009

Sokol-Hessner, P., Hsu, M., Curley, N. G., Delgado, M. R., Camerer, C. F., Phelps, E. A., & Smith, E. E. (2009). Thinking like a Trader Selectively Reduces Individuals’ Loss Aversion. Proceedings of the National Academy of Sciences of the United States of America, 106(13), 5035-5040. https://www.pnas.org/content/106/13/5035

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Risk Aversion Task, there should be 5 columns of data with the labels “subjID”, “gain”, “loss”, “cert”, “gamble”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “gain”: Possible (50%) gain outcome of a risky option (e.g. 9).

  • “loss”: Possible (50%) loss outcome of a risky option (e.g. 5, or -5).

  • “cert”: Guaranteed amount of a safe option. “cert” is assumed to be greater than or equal to zero.

  • “gamble”: If gamble was taken, gamble == 1; else gamble == 0.
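A minimal, synthetic data file in the expected tab-delimited layout can be written as below. The values and the filename ra_example.txt are made up purely for illustration:

```python
# Write a tiny tab-delimited dataset with the five required columns.
rows = [
    ("subjID", "gain", "loss", "cert", "gamble"),
    ("s01", "9", "-5", "2", "1"),
    ("s01", "12", "-8", "3", "0"),
    ("s02", "9", "-5", "2", "1"),
]
with open("ra_example.txt", "w") as f:
    for row in rows:
        f.write("\t".join(row) + "\n")
```

A file built this way can then be passed via the data argument, e.g. ra_noLA(data="ra_example.txt").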

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “gain”, “loss”, “cert”, “gamble”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; initial values are taken from a variational inference run), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the initial size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates during each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ra_noLA’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ra_noLA

# Run the model and store results in "output"
output = ra_noLA(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ra_noRA(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Risk Aversion Task - Prospect Theory, without risk aversion (RA) parameter

Hierarchical Bayesian Modeling of the Risk Aversion Task using Prospect Theory, without risk aversion (RA) parameter [Sokol-Hessner2009] with the following parameters: “lambda” (loss aversion), “tau” (inverse temperature).

Sokol-Hessner2009

Sokol-Hessner, P., Hsu, M., Curley, N. G., Delgado, M. R., Camerer, C. F., Phelps, E. A., & Smith, E. E. (2009). Thinking like a Trader Selectively Reduces Individuals’ Loss Aversion. Proceedings of the National Academy of Sciences of the United States of America, 106(13), 5035-5040. https://www.pnas.org/content/106/13/5035

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Risk Aversion Task, there should be 5 columns of data with the labels “subjID”, “gain”, “loss”, “cert”, “gamble”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “gain”: Possible (50%) gain outcome of a risky option (e.g. 9).

  • “loss”: Possible (50%) loss outcome of a risky option (e.g. 5, or -5).

  • “cert”: Guaranteed amount of a safe option. “cert” is assumed to be greater than or equal to zero.

  • “gamble”: If gamble was taken, gamble == 1; else gamble == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “gain”, “loss”, “cert”, “gamble”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default; initial values are taken from a variational inference run), 'fixed', 'random', or a list of your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the initial size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees that the NUTS sampler evaluates during each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ra_noRA’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ra_noRA

# Run the model and store results in "output"
output = ra_noRA(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ra_prospect(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Risk Aversion Task - Prospect Theory

Hierarchical Bayesian Modeling of the Risk Aversion Task using Prospect Theory [Sokol-Hessner2009] with the following parameters: “rho” (risk aversion), “lambda” (loss aversion), “tau” (inverse temperature).

Sokol-Hessner2009

Sokol-Hessner, P., Hsu, M., Curley, N. G., Delgado, M. R., Camerer, C. F., Phelps, E. A., & Smith, E. E. (2009). Thinking like a Trader Selectively Reduces Individuals’ Loss Aversion. Proceedings of the National Academy of Sciences of the United States of America, 106(13), 5035-5040. https://www.pnas.org/content/106/13/5035

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Risk Aversion Task, there should be 5 columns of data with the labels “subjID”, “gain”, “loss”, “cert”, “gamble”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “gain”: Possible (50%) gain outcome of a risky option (e.g. 9).

  • “loss”: Possible (50%) loss outcome of a risky option (e.g. 5, or -5).

  • “cert”: Guaranteed amount of a safe option. “cert” is assumed to be greater than or equal to zero.

  • “gamble”: If gamble was taken, gamble == 1; else gamble == 0.
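Under this parameterization, each 50/50 gamble is evaluated against the sure amount through a prospect-theoretic utility and a softmax choice rule. A rough sketch of that computation (illustrative only, not hbayesdm’s internal Stan code; the function name is made up):

```python
import math

def p_gamble(gain, loss, cert, rho, lam, tau):
    """Probability of taking the gamble under prospect theory + softmax."""
    # rho: risk aversion; lam: loss aversion; tau: inverse temperature
    u_gamble = 0.5 * gain ** rho - 0.5 * lam * abs(loss) ** rho
    u_cert = cert ** rho            # cert is assumed non-negative
    return 1.0 / (1.0 + math.exp(-tau * (u_gamble - u_cert)))

# With lam = 1 (no loss aversion), a symmetric gamble vs. a sure 0 is a toss-up:
print(round(p_gamble(gain=5, loss=-5, cert=0, rho=1.0, lam=1.0, tau=1.0), 3))  # -> 0.5
```

Raising lam above 1 shrinks u_gamble, so a loss-averse subject takes the same gamble less often.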

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “gain”, “loss”, “cert”, “gamble”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees evaluated by the NUTS sampler on each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.
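The sampling arguments above interact in a simple way: each chain discards nwarmup draws and keeps every nthin-th of the remainder, so the number of retained posterior draws can be sanity-checked before committing to a long run. A minimal sketch in plain Python (illustrative arithmetic only, independent of hbayesdm):

```python
def retained_samples(niter=4000, nwarmup=1000, nchain=4, nthin=1):
    """Approximate number of posterior draws kept across all chains."""
    per_chain = (niter - nwarmup) // nthin
    return per_chain * nchain

# With the defaults above, 3000 draws are kept per chain, 12000 in total.
print(retained_samples())         # 12000
print(retained_samples(nthin=3))  # 4000
```

Raising nthin trades retained draws for lower autocorrelation between them; raising niter restores the sample count at the cost of runtime.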

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘alt_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import alt_delta

# Run the model and store results in "output"
output = alt_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
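When not using the bundled example data, data can also be supplied as a filepath to a tab-delimited text file containing the required columns. A minimal sketch that writes such a file using only the standard library (column names follow the alt_delta specification above; the trial values are made up for illustration):

```python
import csv
import tempfile

# Hypothetical trials: (subjID, choice, outcome, bluePunish, orangePunish)
trials = [
    ("subj01", 1, 1, 10.0, 23.0),
    ("subj01", 2, 0, 97.0, 45.0),
]

path = tempfile.NamedTemporaryFile(suffix=".txt", delete=False).name
with open(path, "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["subjID", "choice", "outcome", "bluePunish", "orangePunish"])
    writer.writerows(trials)

# The resulting filepath could then be passed as the data argument,
# e.g. alt_delta(data=path, ...).
```

Extra columns in the file are ignored during modeling, so exporting a full trial log in this format is also fine.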
hbayesdm.models.rdt_happiness(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Risky Decision Task - Happiness Computational Model

Hierarchical Bayesian Modeling of the Risky Decision Task using Happiness Computational Model [Rutledge2014] with the following parameters: “w0” (baseline), “w1” (weight of certain rewards), “w2” (weight of expected values), “w3” (weight of reward prediction errors), “gam” (forgetting factor), “sig” (standard deviation of error).

Rutledge2014

Rutledge, R. B., Skandali, N., Dayan, P., & Dolan, R. J. (2014). A computational and neural model of momentary subjective well-being. Proceedings of the National Academy of Sciences, 111(33), 12252-12257.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Risky Decision Task, there should be 9 columns of data with the labels “subjID”, “gain”, “loss”, “cert”, “type”, “gamble”, “outcome”, “happy”, “RT_happy”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “gain”: Possible (50%) gain outcome of a risky option (e.g. 9).

  • “loss”: Possible (50%) loss outcome of a risky option (e.g. 5, or -5).

  • “cert”: Guaranteed amount of a safe option.

  • “type”: Trial type (loss == -1, mixed == 0, gain == 1).

  • “gamble”: If gamble was taken, gamble == 1; else gamble == 0.

  • “outcome”: Result of the trial.

  • “happy”: Happiness score.

  • “RT_happy”: Reaction time for answering the happiness score.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
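The model's parameters map onto the momentary happiness equation of Rutledge et al. (2014), in which predicted happiness after trial t is a weighted sum of exponentially discounted certain rewards (CR), expected values (EV), and reward prediction errors (RPE). A rough, non-hierarchical sketch of that equation (simplified notation for illustration, not hbayesdm's internal Stan code):

```python
def predicted_happiness(w0, w1, w2, w3, gam, cr, ev, rpe):
    """Happiness after trial t as a discounted weighted sum of trial history.

    cr, ev, rpe are equal-length lists of per-trial certain rewards,
    expected values, and reward prediction errors (trials 1..t).
    """
    t = len(cr)
    # gamma^(t-j): the most recent trial is undiscounted.
    disc = [gam ** (t - j - 1) for j in range(t)]
    term = lambda w, xs: w * sum(d * x for d, x in zip(disc, xs))
    return w0 + term(w1, cr) + term(w2, ev) + term(w3, rpe)

# Single trial: no discounting applies, so happiness = w0 + w1 * CR.
print(predicted_happiness(0.1, 0.5, 0.0, 0.0, 0.9, cr=[2.0], ev=[0.0], rpe=[0.0]))
```

The forgetting factor gam controls how quickly the influence of past trials decays; sig (not shown) is the standard deviation of the residual error around this prediction.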

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “gain”, “loss”, “cert”, “type”, “gamble”, “outcome”, “happy”, “RT_happy”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees evaluated by the NUTS sampler on each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘rdt_happiness’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import rdt_happiness

# Run the model and store results in "output"
output = rdt_happiness(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.task2AFC_sdt(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

2-alternative forced choice task - Signal detection theory model

Hierarchical Bayesian Modeling of the 2-alternative forced choice task using Signal detection theory model with the following parameters: “d” (discriminability), “c” (decision bias (criteria)).

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the 2-alternative forced choice task, there should be 3 columns of data with the labels “subjID”, “stimulus”, “response”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “stimulus”: Stimulus type (1 for signal, 0 for noise).

  • “response”: Response type, coded the same way as the stimulus (1 or 0).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
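For intuition, and as a quick sanity check against the hierarchical estimates, the classical per-subject signal detection quantities can be computed directly from hit and false-alarm rates: d' = z(H) - z(F) and c = -(z(H) + z(F))/2, where z is the inverse standard normal CDF. A minimal sketch using only the standard library (no correction for rates of exactly 0 or 1, which would need e.g. a log-linear adjustment):

```python
from statistics import NormalDist

def sdt_point_estimates(hits, signal_trials, false_alarms, noise_trials):
    """Classical (non-hierarchical) discriminability d' and criterion c."""
    z = NormalDist().inv_cdf
    zh = z(hits / signal_trials)
    zf = z(false_alarms / noise_trials)
    return zh - zf, -(zh + zf) / 2.0

d_prime, criterion = sdt_point_estimates(80, 100, 20, 100)
# Symmetric hit and false-alarm rates give a criterion near 0.
print(round(d_prime, 3), round(criterion, 3))
```

Unlike these point estimates, the hierarchical model pools information across subjects, which stabilizes d and c for subjects with few trials or extreme rates.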

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “stimulus”, “response”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees evaluated by the NUTS sampler on each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘task2AFC_sdt’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import task2AFC_sdt

# Run the model and store results in "output"
output = task2AFC_sdt(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ts_par4(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Two-Step Task - Hybrid Model, with 4 parameters

Hierarchical Bayesian Modeling of the Two-Step Task [Daw2011] using Hybrid Model, with 4 parameters [Daw2011], [Wunderlich2012] with the following parameters: “a” (learning rate for both stages 1 & 2), “beta” (inverse temperature for both stages 1 & 2), “pi” (perseverance), “w” (model-based weight).

Daw2011

Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-Based Influences on Humans’ Choices and Striatal Prediction Errors. Neuron, 69(6), 1204-1215. https://doi.org/10.1016/j.neuron.2011.02.027

Wunderlich2012

Wunderlich, K., Smittenaar, P., & Dolan, R. J. (2012). Dopamine enhances model-based over model-free choice behavior. Neuron, 75(3), 418-424.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Two-Step Task, there should be 4 columns of data with the labels “subjID”, “level1_choice”, “level2_choice”, “reward”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “level1_choice”: Choice made for Level (Stage) 1 (1: stimulus 1, 2: stimulus 2).

  • “level2_choice”: Choice made for Level (Stage) 2 (1: stimulus 3, 2: stimulus 4, 3: stimulus 5, 4: stimulus 6).

    Note that, in our notation, choosing stimulus 1 in Level 1 leads to stimulus 3 & 4 in Level 2 with a common (0.7 by default) transition. Similarly, choosing stimulus 2 in Level 1 leads to stimulus 5 & 6 in Level 2 with a common (0.7 by default) transition. To change this default transition probability, set the function argument trans_prob to your preferred value.

  • “reward”: Reward after Level 2 (0 or 1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
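Because rare transitions are legitimate data, mis-coded Two-Step files cannot be caught by the transition structure alone, but the integer codes themselves can be validated before fitting. An illustrative helper (hypothetical, not part of hbayesdm) that checks one trial and reports whether it followed the common transition under the default mapping above:

```python
def check_two_step_row(level1_choice, level2_choice, reward):
    """Validate the integer codes for one Two-Step Task trial."""
    assert level1_choice in (1, 2), "level1_choice must be 1 or 2"
    assert level2_choice in (1, 2, 3, 4), "level2_choice must be 1-4"
    assert reward in (0, 1), "reward must be 0 or 1"
    # Under the default mapping, level1_choice == 1 commonly leads to
    # level2_choice 1 or 2 (stimuli 3 & 4), and choice 2 to 3 or 4
    # (stimuli 5 & 6). Rare transitions are still valid data, so the
    # return value is informational only.
    return level2_choice in ((1, 2) if level1_choice == 1 else (3, 4))

print(check_two_step_row(1, 2, 1))  # True: a common transition
print(check_two_step_row(1, 3, 0))  # False: a rare (but valid) trial
```

Running such a check over every row of a datafile before calling ts_par4 can surface coding errors (e.g. stimulus numbers 3-6 recorded instead of codes 1-4) early.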

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “level1_choice”, “level2_choice”, “reward”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees evaluated by the NUTS sampler on each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • trans_prob: Common state transition probability from Stage (Level) 1 to Stage (Level) 2. Defaults to 0.7.
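trans_prob enters the hybrid model through its model-based component: the model-based value of a stage-1 action is the transition-probability-weighted best stage-2 value (Daw et al., 2011). A simplified sketch of that computation (illustrative notation only, not hbayesdm's Stan implementation):

```python
def model_based_q1(q2_pair_a, q2_pair_b, trans_prob=0.7):
    """Model-based stage-1 values for actions 1 and 2.

    q2_pair_a: stage-2 values of stimuli 3 & 4 (commonly reached from action 1)
    q2_pair_b: stage-2 values of stimuli 5 & 6 (commonly reached from action 2)
    """
    best_a, best_b = max(q2_pair_a), max(q2_pair_b)
    # Each action reaches its common pair with probability trans_prob
    # and the other pair with probability 1 - trans_prob.
    q1_action1 = trans_prob * best_a + (1 - trans_prob) * best_b
    q1_action2 = trans_prob * best_b + (1 - trans_prob) * best_a
    return q1_action1, q1_action2

q1a, q1b = model_based_q1([0.8, 0.2], [0.4, 0.3])
print(round(q1a, 2), round(q1b, 2))  # 0.68 0.52
```

In the fitted model these model-based values are mixed with model-free values via the weight w, so trans_prob should match the transition probability actually used in the experiment.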

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ts_par4’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ts_par4

# Run the model and store results in "output"
output = ts_par4(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ts_par6(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Two-Step Task - Hybrid Model, with 6 parameters

Hierarchical Bayesian Modeling of the Two-Step Task [Daw2011] using Hybrid Model, with 6 parameters [Daw2011] with the following parameters: “a1” (learning rate in stage 1), “beta1” (inverse temperature in stage 1), “a2” (learning rate in stage 2), “beta2” (inverse temperature in stage 2), “pi” (perseverance), “w” (model-based weight).

Daw2011

Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-Based Influences on Humans’ Choices and Striatal Prediction Errors. Neuron, 69(6), 1204-1215. https://doi.org/10.1016/j.neuron.2011.02.027

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Two-Step Task, there should be 4 columns of data with the labels “subjID”, “level1_choice”, “level2_choice”, “reward”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “level1_choice”: Choice made for Level (Stage) 1 (1: stimulus 1, 2: stimulus 2).

  • “level2_choice”: Choice made for Level (Stage) 2 (1: stimulus 3, 2: stimulus 4, 3: stimulus 5, 4: stimulus 6).

    Note that, in our notation, choosing stimulus 1 in Level 1 leads to stimulus 3 & 4 in Level 2 with a common (0.7 by default) transition. Similarly, choosing stimulus 2 in Level 1 leads to stimulus 5 & 6 in Level 2 with a common (0.7 by default) transition. To change this default transition probability, set the function argument trans_prob to your preferred value.

  • “reward”: Reward after Level 2 (0 or 1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “level1_choice”, “level2_choice”, “reward”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'vb' (default), 'fixed', or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Floating point value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying the maximum depth of the trees evaluated by the NUTS sampler on each iteration (the number of leapfrog steps per iteration is at most 2^max_treedepth). See note below.

  • **additional_args – For this model, it’s possible to set the following model-specific argument to a value that you may prefer.

    • trans_prob: Common state transition probability from Stage (Level) 1 to Stage (Level) 2. Defaults to 0.7.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ts_par6’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ts_par6

# Run the model and store results in "output"
output = ts_par6(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ts_par7(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Two-Step Task - Hybrid Model, with 7 parameters (original model)

Hierarchical Bayesian Modeling of the Two-Step Task [Daw2011] using Hybrid Model, with 7 parameters (original model) [Daw2011] with the following parameters: “a1” (learning rate in stage 1), “beta1” (inverse temperature in stage 1), “a2” (learning rate in stage 2), “beta2” (inverse temperature in stage 2), “pi” (perseverance), “w” (model-based weight), “lambda” (eligibility trace).

Daw2011

Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-Based Influences on Humans’ Choices and Striatal Prediction Errors. Neuron, 69(6), 1204-1215. https://doi.org/10.1016/j.neuron.2011.02.027

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Two-Step Task, there should be 4 columns of data with the labels “subjID”, “level1_choice”, “level2_choice”, “reward”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “level1_choice”: Choice made for Level (Stage) 1 (1: stimulus 1, 2: stimulus 2).

  • “level2_choice”: Choice made for Level (Stage) 2 (1: stimulus 3, 2: stimulus 4, 3: stimulus 5, 4: stimulus 6).

    Note that, in our notation, choosing stimulus 1 in Level 1 leads to stimulus 3 & 4 in Level 2 with a common (0.7 by default) transition. Similarly, choosing stimulus 2 in Level 1 leads to stimulus 5 & 6 in Level 2 with a common (0.7 by default) transition. To change this default transition probability, set the function argument trans_prob to your preferred value.

  • “reward”: Reward after Level 2 (0 or 1).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.
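Because only the labeled columns are consulted, a quick header check can confirm that a tab-delimited file carries the required Two-Step Task columns before fitting. This is a hypothetical stdlib helper, not part of hbayesdm:

```python
import csv
import io

REQUIRED = {"subjID", "level1_choice", "level2_choice", "reward"}

def has_required_columns(tsv_text):
    """Return True if the tab-delimited header contains every required label."""
    header = next(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    return REQUIRED.issubset(header)

# Extra columns (e.g. ReactionTime) are fine, and column order does not matter.
example = (
    "subjID\treward\tlevel1_choice\tlevel2_choice\tReactionTime\n"
    "s01\t1\t1\t3\t0.42\n"
)
```

Here `has_required_columns(example)` returns True; a file missing, say, “level2_choice” would fail the check.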

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “level1_choice”, “level2_choice”, “reward”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.
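    The effect of nthin can be pictured with plain list slicing. This is an illustration only; the thinning itself happens inside the sampler:

```python
# Toy post-warmup draws from one chain; with nthin=3, only every 3rd is kept.
samples = list(range(12))
nthin = 3
kept = samples[::nthin]   # [0, 3, 6, 9]
```

    Thinning trades fewer stored samples for lower auto-correlation between the samples that remain.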

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – For this model, the following model-specific argument can be set to a preferred value:

    • trans_prob: Common state transition probability from Stage (Level) 1 to Stage (Level) 2. Defaults to 0.7.
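    The **additional_args mechanism is ordinary Python keyword forwarding: a model-specific setting such as trans_prob falls back to its default when not supplied. A toy sketch of the pattern (the function name is hypothetical, not hbayesdm internals):

```python
def fit_model(data, **additional_args):
    # Model-specific settings fall back to defaults when not supplied.
    trans_prob = additional_args.get("trans_prob", 0.7)
    return trans_prob

default_prob = fit_model("example")                  # uses the 0.7 default
custom_prob = fit_model("example", trans_prob=0.8)   # user override
```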

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ts_par7’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.
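The ind_pars summary applied to par_vals-style draws amounts to collapsing each parameter's posterior samples to a single value. A sketch with the statistics module (hbayesdm computes this internally; the draws below are toy values):

```python
from collections import OrderedDict
from statistics import mean, median

# Toy posterior draws for one subject's parameters.
par_vals = OrderedDict(a1=[0.25, 0.5, 0.25, 0.5], beta1=[1.0, 1.5, 2.0, 1.5])

def summarize(draws, how="mean"):
    """Collapse each parameter's draws to one value, as ind_pars specifies."""
    agg = {"mean": mean, "median": median}[how]
    return {name: agg(samples) for name, samples in draws.items()}

summarize(par_vals, how="mean")   # {'a1': 0.375, 'beta1': 1.5}
```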

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ts_par7

# Run the model and store results in "output"
output = ts_par7(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
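The Rhat check above compares within-chain and between-chain variance (the Gelman-Rubin diagnostic). A compact toy version of the classic formula, for intuition only; hbayesdm/PyStan use a refined split-Rhat:

```python
from math import sqrt
from statistics import mean, variance

def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    W = mean(variance(c) for c in chains)            # mean within-chain variance
    B_over_n = variance([mean(c) for c in chains])   # variance of chain means
    var_hat = (n - 1) / n * W + B_over_n
    return sqrt(var_hat / W)

# Chains exploring the same region -> Rhat near (or below) 1.
mixed = gelman_rubin([[1.0, 2.0, 3.0, 2.0], [2.0, 1.0, 2.0, 3.0]])
# Chains stuck in different regions -> Rhat well above 1.1.
stuck = gelman_rubin([[1.0, 1.1, 0.9, 1.0], [5.0, 5.1, 4.9, 5.0]])
```

When any parameter's Rhat exceeds roughly 1.1, the chains have not converged to a common distribution and more iterations (or reparameterization) are needed.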
hbayesdm.models.ug_bayes(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Norm-Training Ultimatum Game - Ideal Observer Model

Hierarchical Bayesian Modeling of the Norm-Training Ultimatum Game using Ideal Observer Model [Xiang2013] with the following parameters: “alpha” (envy), “beta” (guilt), “tau” (inverse temperature).

Xiang2013

Xiang, T., Lohrenz, T., & Montague, P. R. (2013). Computational Substrates of Norms and Their Violations during Social Exchange. Journal of Neuroscience, 33(3), 1099-1108. https://doi.org/10.1523/JNEUROSCI.1642-12.2013

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Norm-Training Ultimatum Game, there should be 3 columns of data with the labels “subjID”, “offer”, “accept”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “offer”: Floating point value representing the offer made in that trial (e.g. 4, 10, 11).

  • “accept”: 1 or 0, indicating whether the offer was accepted in that trial (where accepted == 1, rejected == 0).
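As a quick sanity check on data in this format, per-subject acceptance rates can be tallied with the stdlib. The helper below is hypothetical and separate from the model fit:

```python
from collections import defaultdict

def acceptance_rates(rows):
    """rows: (subjID, offer, accept) tuples; returns subjID -> mean acceptance."""
    totals = defaultdict(lambda: [0, 0])   # subjID -> [n_accepted, n_trials]
    for subj, _offer, accept in rows:
        totals[subj][0] += accept
        totals[subj][1] += 1
    return {subj: acc / n for subj, (acc, n) in totals.items()}

trials = [("s01", 4.0, 0), ("s01", 10.0, 1), ("s02", 11.0, 1), ("s02", 9.0, 1)]
acceptance_rates(trials)   # {'s01': 0.5, 's02': 1.0}
```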

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “offer”, “accept”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ug_bayes’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ug_bayes

# Run the model and store results in "output"
output = ug_bayes(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.ug_delta(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Norm-Training Ultimatum Game - Rescorla-Wagner (Delta) Model

Hierarchical Bayesian Modeling of the Norm-Training Ultimatum Game using Rescorla-Wagner (Delta) Model [Gu2015] with the following parameters: “alpha” (envy), “tau” (inverse temperature), “ep” (norm adaptation rate).

Gu2015

Gu, X., Wang, X., Hula, A., Wang, S., Xu, S., Lohrenz, T. M., et al. (2015). Necessary, Yet Dissociable Contributions of the Insular and Ventromedial Prefrontal Cortices to Norm Adaptation: Computational and Lesion Evidence in Humans. Journal of Neuroscience, 35(2), 467-473. https://doi.org/10.1523/JNEUROSCI.2906-14.2015
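The Rescorla-Wagner (delta) rule at the heart of this model moves an internal norm toward each observed offer at the norm adaptation rate ep. A hedged sketch of just that update, simplified from the full model in [Gu2015]:

```python
def update_norm(norm, offer, ep):
    """Delta rule: shift the internal norm toward the observed offer."""
    prediction_error = offer - norm
    return norm + ep * prediction_error

# Repeated low offers pull the norm downward toward 4.
norm = 10.0
for offer in [4.0, 4.0, 4.0, 4.0]:
    norm = update_norm(norm, offer, ep=0.5)
# norm is now 4.375
```

A larger ep adapts the norm faster; ep near 0 leaves it almost fixed.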

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Norm-Training Ultimatum Game, there should be 3 columns of data with the labels “subjID”, “offer”, “accept”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “offer”: Floating point value representing the offer made in that trial (e.g. 4, 10, 11).

  • “accept”: 1 or 0, indicating whether the offer was accepted in that trial (where accepted == 1, rejected == 0).

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “offer”, “accept”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘ug_delta’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import ug_delta

# Run the model and store results in "output"
output = ug_delta(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)
hbayesdm.models.wcs_sql(data=None, niter=4000, nwarmup=1000, nchain=4, ncore=1, nthin=1, inits='vb', ind_pars='mean', model_regressor=False, vb=False, inc_postpred=False, adapt_delta=0.95, stepsize=1, max_treedepth=10, **additional_args)

Wisconsin Card Sorting Task - Sequential Learning Model

Hierarchical Bayesian Modeling of the Wisconsin Card Sorting Task using Sequential Learning Model [Bishara2010] with the following parameters: “r” (reward sensitivity), “p” (punishment sensitivity), “d” (decision consistency or inverse temperature).

Bishara2010

Bishara, A. J., Kruschke, J. K., Stout, J. C., Bechara, A., McCabe, D. P., & Busemeyer, J. R. (2010). Sequential learning models for the Wisconsin card sort task: Assessing processes in substance dependent individuals. Journal of Mathematical Psychology, 54(1), 5-13.

User data should contain the behavioral data-set of all subjects of interest for the current analysis. When loading from a file, the datafile should be a tab-delimited text file, whose rows represent trial-by-trial observations and columns represent variables.

For the Wisconsin Card Sorting Task, there should be 3 columns of data with the labels “subjID”, “choice”, “outcome”. It is not necessary for the columns to be in this particular order; however, it is necessary that they be labeled correctly and contain the information below:

  • “subjID”: A unique identifier for each subject in the data-set.

  • “choice”: Integer value indicating which deck was chosen on that trial: 1, 2, 3, or 4.

  • “outcome”: 1 or 0, indicating the outcome of that trial: correct == 1, wrong == 0.

Note

User data may contain other columns of data (e.g. ReactionTime, trial_number, etc.), but only the data within the column names listed above will be used during the modeling. As long as the necessary columns mentioned above are present and labeled correctly, there is no need to remove other miscellaneous data columns.

Note

adapt_delta, stepsize, and max_treedepth are advanced options that give the user more control over Stan’s MCMC sampler. It is recommended that only advanced users change the default values, as alterations can profoundly change the sampler’s behavior. See [Hoffman2014] for more information on the sampler control parameters. One can also refer to ‘Section 34.2. HMC Algorithm Parameters’ of the Stan User’s Guide and Reference Manual.

Hoffman2014

Hoffman, M. D., & Gelman, A. (2014). The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), 1593-1623.

Parameters
  • data – Data to be modeled. It should be given as a Pandas DataFrame object, a filepath for a data file, or "example" for example data. Data columns should be labeled as: “subjID”, “choice”, “outcome”.

  • niter – Number of iterations, including warm-up. Defaults to 4000.

  • nwarmup – Number of iterations used for warm-up only. Defaults to 1000.

    nwarmup is a numerical value that specifies how many MCMC samples should be discarded at the beginning of each chain. For those familiar with Bayesian methods, this is equivalent to burn-in samples. Due to the nature of the MCMC algorithm, initial values (i.e., where the sampling chains begin) can have a heavy influence on the generated posterior distributions. The nwarmup argument can be set to a higher number in order to curb the effects that initial values have on the resulting posteriors.

  • nchain – Number of Markov chains to run. Defaults to 4.

    nchain is a numerical value that specifies how many chains (i.e., independent sampling sequences) should be used to draw samples from the posterior distribution. Since the posteriors are generated from a sampling process, it is good practice to run multiple chains to ensure that a reasonably representative posterior is attained. When the sampling is complete, it is possible to check the multiple chains for convergence by running the following line of code:

    output.plot(type='trace')
    
  • ncore – Number of CPUs to be used for running. Defaults to 1.

  • nthin – Every nthin-th sample will be used to generate the posterior distribution. Defaults to 1. A higher number can be used when auto-correlation within the MCMC sampling is high.

    nthin is a numerical value that specifies the “skipping” behavior of the MCMC sampler. That is, only every nthin-th sample is used to generate posterior distributions. By default, nthin is equal to 1, meaning that every sample is used to generate the posterior.

  • inits – String or list specifying how the initial values should be generated. Options are 'fixed' or 'random', or your own initial values.

  • ind_pars – String specifying how to summarize the individual parameters. Current options are: 'mean', 'median', or 'mode'.

  • model_regressor – Whether to export model-based regressors. Currently not available for this model.

  • vb – Whether to use variational inference to approximately draw from a posterior distribution. Defaults to False.

  • inc_postpred – Include trial-level posterior predictive simulations in model output (may greatly increase file size). Defaults to False.

  • adapt_delta – Floating point value representing the target acceptance probability of a new sample in the MCMC chain. Must be between 0 and 1. See note below.

  • stepsize – Integer value specifying the size of each leapfrog step that the MCMC sampler can take on each new iteration. See note below.

  • max_treedepth – Integer value specifying how many leapfrog steps the MCMC sampler can take on each new iteration. See note below.

  • **additional_args – Not used for this model.

Returns

model_data – An hbayesdm.TaskModel instance with the following components:

  • model: String value that is the name of the model (‘wcs_sql’).

  • all_ind_pars: Pandas DataFrame containing the summarized parameter values (as specified by ind_pars) for each subject.

  • par_vals: OrderedDict holding the posterior samples over different parameters.

  • fit: A PyStan StanFit object that contains the fitted Stan model.

  • raw_data: Pandas DataFrame containing the raw data used to fit the model, as specified by the user.

Examples

from hbayesdm import rhat, print_fit
from hbayesdm.models import wcs_sql

# Run the model and store results in "output"
output = wcs_sql(data='example', niter=2000, nwarmup=1000, nchain=4, ncore=4)

# Visually check convergence of the sampling chains (should look like "hairy caterpillars")
output.plot(type='trace')

# Plot posterior distributions of the hyper-parameters (distributions should be unimodal)
output.plot()

# Check Rhat values (all Rhat values should be less than or equal to 1.1)
rhat(output, less=1.1)

# Show the LOOIC and WAIC model fit estimates
print_fit(output)