glhmm.prediction
Prediction from Gaussian Linear Hidden Markov Model
@author: Christine Ahrends 2023
- glhmm.prediction.classify_phenotype(hmm, Y, behav, indices, predictor='Fisherkernel', estimator='SVM', options=None)
Classify phenotype from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to make a classification, in a nested cross-validated way. By default, X is standardised/centered.
Estimators so far include: SVM and Logistic Regression.
Cross-validation strategies so far include: KFold and GroupKFold.
Hyperparameter optimization strategies so far include: only grid search.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data
- behav : array-like of shape (n_sessions,)
phenotype, behaviour, or other external labels to be predicted
- indices : array-like of shape (n_sessions, 2)
The start and end indices of each trial/session in the input data. Note that this function does not work if indices=None
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘SVM’)
Model to be used for classification. This should be the name of a sklearn base estimator (for now either ‘SVM’ or ‘LogisticRegression’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘CVscheme’: char, which CV scheme to use (default: ‘GroupKFold’ if group structure is specified, otherwise ‘KFold’)
    - ‘nfolds’: int, number of folds k for (outer and inner) k-fold CV loops
    - ‘group_structure’: ndarray of shape (n_sessions, n_sessions), matrix specifying group structure: positive values if sessions(/subjects) are related, zeros otherwise
    - ‘return_scores’: bool, whether to also return the model scores of each fold
    - ‘return_models’: bool, whether to also return the trained models of each fold
    - ‘return_hyperparams’: bool, whether to also return the optimised hyperparameters of each fold
    - ‘return_prob’: bool, whether to also return the estimated probabilities
    - possible hyperparameters for the model, e.g. ‘C’ for SVM
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- results : dict
containing
    - ‘behav_pred’: predicted labels on test sets
    - ‘acc’: overall accuracy
and, if requested:
    - ‘behav_prob’: predicted probabilities of each class on test set
    - ‘scores’: the model scores of each fold
    - ‘models’: the trained models from each fold
    - ‘hyperparams’: the optimised hyperparameters of each fold
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
Notes:
If behav contains NaNs, these subjects/sessions will be removed in Y and confounds
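As a rough illustration of the inputs described above, the sketch below builds an `indices` array from per-session lengths and collects a few of the documented options. The helper names `make_indices` and `run_classification` are hypothetical, the option values are only examples, and treating the end index as exclusive is an assumption. The call to `classify_phenotype` is wrapped in a function because it needs glhmm installed and a trained HMM.

```python
def make_indices(session_lengths):
    """Return [start, end] row pairs for sessions concatenated in order."""
    indices, start = [], 0
    for n in session_lengths:
        indices.append([start, start + n])  # end treated as exclusive (assumption)
        start += n
    return indices


# Option names are taken from the documentation above; values are illustrative.
options = {
    "nfolds": 5,
    "incl_Mu": False,
    "incl_Sigma": True,
    "return_prob": True,
}


def run_classification(hmm, Y, behav, session_lengths):
    """Hypothetical wrapper; requires glhmm and an HMM trained on Y."""
    import numpy as np
    from glhmm import prediction

    indices = np.array(make_indices(session_lengths))
    return prediction.classify_phenotype(
        hmm, Y, behav, indices,
        predictor="Fisherkernel", estimator="SVM", options=options,
    )


indices = make_indices([100, 120, 80])
print(indices)
```

The returned dictionary would then be read with the documented keys, for example `results['behav_pred']` and `results['acc']`.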
- glhmm.prediction.compute_gradient(hmm, Y, incl_Pi=True, incl_P=True, incl_Mu=False, incl_Sigma=True)
Computes the gradient of the log-likelihood for timeseries Y with respect to specified HMM parameters
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(subject- or session-level) timeseries data
- incl_Pi : bool, default=True
whether to compute gradient w.r.t. initial state probabilities
- incl_P : bool, default=True
whether to compute gradient w.r.t. transition probabilities
- incl_Mu : bool, default=False
whether to compute gradient w.r.t. state means (only possible if state means were estimated during training)
- incl_Sigma : bool, default=True
whether to compute gradient w.r.t. state covariances (for now only for full covariance matrix)
Returns:
- hmmgrad : array of shape (sum(len(requested_parameters)))
Raises:
- Exception
If the model has not been trained or if requested parameters do not exist (e.g. if Mu is requested but state means were not estimated)
Notes:
Does not include gradient computation for X and beta
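The output shape `sum(len(requested_parameters))` can be made concrete with a small calculation. The sketch below assumes each requested parameter block is flattened and concatenated, with K entries for Pi, K×K for P, K×D for Mu and K×D×D for full covariances (K states, D variables). That layout is an assumption for illustration and `fisher_feature_length` is a hypothetical helper, not part of glhmm.

```python
def fisher_feature_length(K, D, incl_Pi=True, incl_P=True,
                          incl_Mu=False, incl_Sigma=True):
    """Length of the concatenated gradient vector under the assumed layout."""
    n = 0
    if incl_Pi:
        n += K          # initial state probabilities
    if incl_P:
        n += K * K      # transition probabilities
    if incl_Mu:
        n += K * D      # state means
    if incl_Sigma:
        n += K * D * D  # full state covariances
    return n


# 4 states, 3 variables, default flags as in the signature above
n_default = fisher_feature_length(K=4, D=3)
n_with_mu = fisher_feature_length(K=4, D=3, incl_Mu=True)
print(n_default, n_with_mu)
```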
- glhmm.prediction.get_groups(group_structure)
Util function to get groups from group structure matrix such as family structure. Output can be used to make sure groups/families are not split across folds during cross validation, e.g. using sklearn’s GroupKFold. Groups are defined as components in the adjacency matrix.
Parameters:
- group_structure : array-like of shape (n_sessions, n_sessions)
a matrix specifying the structure of the dataset, with positive values indicating relations between sessions(/subjects) and zeros indicating no relations. Note: The diagonal will be set to 1
Returns:
- cs : array-like of shape (n_sessions,)
1D array containing the group each session belongs to
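To show what "components in the adjacency matrix" means in practice, here is a minimal pure-Python sketch that labels connected components with a breadth-first search. It is not the library's implementation, `groups_from_structure` is a hypothetical name, and the numbering of the groups may differ from what `get_groups` returns.

```python
from collections import deque


def groups_from_structure(group_structure):
    """Label each session with the connected component it belongs to."""
    n = len(group_structure)
    labels = [-1] * n
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue  # already assigned to an earlier component
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in range(n):
                related = group_structure[i][j] > 0 or group_structure[j][i] > 0
                if related and labels[j] == -1:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Sessions 0 and 2 are not directly related, but both are related to session 1,
# so all three end up in the same group.
structure = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
groups = groups_from_structure(structure)
print(groups)
```

Labels of this kind can be passed as the `groups` argument of scikit-learn's `GroupKFold.split`, so that related sessions stay in the same fold.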
- glhmm.prediction.get_summ_features(hmm, Y, indices, metrics)
Util function to get summary features from HMM. Output can be used as input features for ML
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data
- indices : array-like of shape (n_sessions, 2)
The start and end indices of each trial/session in the input data. Note that the features cannot be computed if indices=None
- metrics : list
names of metrics to be extracted. For now, this should be one or more of ‘FO’, ‘switching_rate’, ‘lifetimes’
Returns:
- features : array-like of shape (n_sessions, n_features)
The HMM summary metrics collected into a feature matrix
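The following sketch illustrates how per-session summary metrics can be stacked into a feature matrix of shape (n_sessions, n_features). It uses a hard state sequence and simplified definitions (fractional occupancy per state plus one overall switching rate), which are assumptions made for illustration. The library's own definitions may differ, and `summary_features` is a hypothetical helper.

```python
def summary_features(states, indices, K):
    """One feature row per session: K fractional occupancies + switching rate."""
    features = []
    for start, end in indices:
        path = states[start:end]
        n = len(path)
        # fraction of time points spent in each state
        fo = [path.count(k) / n for k in range(K)]
        # fraction of consecutive time points where the state changes
        switches = sum(a != b for a, b in zip(path, path[1:]))
        features.append(fo + [switches / (n - 1)])
    return features


states = [0, 0, 1, 1, 1, 0,   2, 2, 2, 2]  # two concatenated sessions
indices = [[0, 6], [6, 10]]
features = summary_features(states, indices, K=3)
print(features)
```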
- glhmm.prediction.hmm_kernel(hmm, Y, indices, type='Fisher', shape='linear', incl_Pi=True, incl_P=True, incl_Mu=False, incl_Sigma=True, tau=None, return_feat=False, return_dist=False)
Constructs a kernel from an HMM, as well as the respective feature matrix and/or distance matrix
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data
- indices : array-like of shape (n_sessions, 2)
The start and end indices of each trial/session in the input data. Note that kernel cannot be computed if indices=None
- type : str, optional
The type of kernel to be constructed (default: ‘Fisher’)
- shape : str, optional
The shape of kernel to be constructed, either ‘linear’ or ‘Gaussian’ (default: ‘linear’)
- incl_Pi : bool, default=True
whether to include initial state probabilities in kernel construction
- incl_P : bool, default=True
whether to include transition probabilities in kernel construction
- incl_Mu : bool, default=False
whether to include state means in kernel construction (only possible if state means were estimated during training)
- incl_Sigma : bool, default=True
whether to include state covariances in kernel construction (for now only for full covariance matrix)
- return_feat : bool, default=False
whether to also return the feature matrix
- return_dist : bool, default=False
whether to also return the distance matrix
Returns:
- kernel : array of shape (n_sessions, n_sessions)
HMM Kernel for subjects/sessions contained in Y
- feat : array of shape (n_sessions, sum(len(requested_parameters)))
Feature matrix for subjects/sessions contained in Y for requested parameters
- dist : array of shape (n_sessions, n_sessions)
Distance matrix for subjects/sessions contained in Y
Raises:
- Exception
If the hmm has not been trained or if requested parameters do not exist (e.g. if Mu is requested but state means were not estimated).
If kernel other than Fisher kernel is requested.
Notes:
Does not include X and beta in kernel construction.
Only Fisher kernel implemented at this point.
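The relation between the three outputs (feature matrix, kernel, distance matrix) can be sketched with NumPy. The example assumes the linear kernel is the inner product of the per-session feature vectors and that the distance is Euclidean. Any normalisation the library applies is not reproduced here, and the Gaussian form with a width `tau` is a standard radial basis function shown only as an assumption about how such a parameter is typically used.

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((5, 56))   # stand-in for (n_sessions, n_features)

# Linear kernel: inner products between session feature vectors
kernel = feat @ feat.T

# Euclidean distances recovered from the kernel:
# ||a - b||^2 = <a, a> + <b, b> - 2 <a, b>
sq_norms = np.diag(kernel)
dist_sq = sq_norms[:, None] + sq_norms[None, :] - 2.0 * kernel
dist = np.sqrt(np.clip(dist_sq, 0.0, None))  # clip guards against rounding

# Gaussian (RBF) kernel built from the distances; tau is a hypothetical width
tau = 10.0
gaussian_kernel = np.exp(-dist**2 / (2.0 * tau**2))

print(kernel.shape, dist.shape, gaussian_kernel.shape)
```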
- glhmm.prediction.predict_phenotype(hmm, Y, behav, indices, predictor='Fisherkernel', estimator='KernelRidge', options=None)
Predict phenotype from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to predict a phenotype, in a nested cross-validated way. By default, X and Y are standardised/centered unless deconfounding is used.
Estimators so far include: Kernel Ridge Regression and Ridge Regression.
Cross-validation strategies so far include: KFold and GroupKFold.
Hyperparameter optimization strategies so far include: only grid search.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data
- behav : array-like of shape (n_sessions,)
phenotype, behaviour, or other external variable to be predicted
- indices : array-like of shape (n_sessions, 2)
The start and end indices of each trial/session in the input data. Note that this function does not work if indices=None
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘KernelRidge’)
Model to be used for prediction. This should be the name of a sklearn base estimator (for now either ‘KernelRidge’ or ‘Ridge’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘CVscheme’: char, which CV scheme to use (default: ‘GroupKFold’ if group structure is specified, otherwise ‘KFold’)
    - ‘nfolds’: int, number of folds k for (outer and inner) k-fold CV loops
    - ‘group_structure’: ndarray of shape (n_sessions, n_sessions), matrix specifying group structure: positive values if sessions(/subjects) are related, zeros otherwise
    - ‘confounds’: array-like of shape (n_sessions,) or (n_sessions, n_confounds) containing confounding variables
    - ‘return_scores’: bool, whether to also return the model scores of each fold
    - ‘return_models’: bool, whether to also return the trained models of each fold
    - ‘return_hyperparams’: bool, whether to also return the optimised hyperparameters of each fold
    - possible hyperparameters for the model, e.g. ‘alpha’ for (kernel) ridge regression
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- results : dict
containing
    - ‘behav_pred’: predicted phenotype on test sets
    - ‘corr’: correlation coefficient between predicted and actual values
and, if requested:
    - ‘scores’: the model scores of each fold
    - ‘models’: the trained models from each fold
    - ‘hyperparams’: the optimised hyperparameters of each fold
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
Notes:
If behav contains NaNs, these subjects/sessions will be removed in Y and confounds
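The ‘group_structure’ and ‘confounds’ options can be prepared as in the sketch below, which derives a relatedness matrix from per-session family labels. The family labels, the confound values and the helper `family_structure` are made up for illustration. Only the option names come from the documentation above.

```python
import numpy as np


def family_structure(family_ids):
    """(n_sessions, n_sessions) matrix: 1 where two sessions share a family."""
    ids = np.asarray(family_ids)
    structure = (ids[:, None] == ids[None, :]).astype(int)
    np.fill_diagonal(structure, 0)  # self-relations are not needed here
    return structure


family_ids = ["A", "A", "B", "C", "B"]          # illustrative labels
age = np.array([24.0, 31.0, 28.0, 35.0, 22.0])  # illustrative confound

options = {
    "group_structure": family_structure(family_ids),
    "confounds": age,
    "nfolds": 5,
    "return_scores": True,
}
print(options["group_structure"])
```

With a group structure supplied, the documented default CV scheme is ‘GroupKFold’, so related sessions are kept within the same fold.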
- glhmm.prediction.test_classif(hmm, Y, indices, model_tuned, scaler_x, behav=None, train_indices=None, predictor='Fisherkernel', estimator='SVM', options=None)
Test classification model from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to make a classification, in a nested cross-validated way. The specified predictor and estimator must be the same as the ones used to train the classifier. By default, X is standardised/centered.
Note: When using a kernel method (e.g. Fisher kernel), Y must be the timeseries of both training and test set to construct the correct kernel, and indices of the training sessions (train_indices) must be provided. When using summary metrics, Y must be the timeseries of only the test set, and train_indices should be None.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data of test set
- indices : array-like of shape (n_test_sessions, 2) or (n_sessions, 2)
The start and end indices of each trial/session in the test data (when using features) or in the train and test data (when using kernel). Note that this function does not work if indices=None
- model_tuned : estimator
the trained and (if applicable) hyperparameter-optimised scikit-learn estimator
- scaler_x : estimator
the trained standard scaler/kernel centerer of the features/kernel x
- behav : array-like of shape (n_test_sessions,) (optional)
phenotype, behaviour, or other external label of test set, to be compared with the predicted labels
- train_indices : array-like of shape (n_train_sessions,) (optional, only use when using kernel)
the indices of the sessions/subjects used for training. The function assumes that test indices are all other sessions.
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘SVM’)
Model to be used for classification. This should be the name of a sklearn base estimator (for now either ‘SVM’ or ‘LogisticRegression’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘return_prob’: bool, whether to also return the estimated probabilities
    - ‘return_models’: whether to also return the model
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- results : dict
containing
    - ‘behav_pred’: predicted labels on test sets
    - ‘acc’: overall accuracy
and, if requested:
    - ‘behav_prob’: predicted probabilities of each class on test set
    - ‘scores’: the model scores of each fold
    - ‘models’: the trained model
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
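The note about kernel methods needing both training and test sessions can be illustrated with plain NumPy. A kernel is computed over all sessions and then sliced: the train-by-train block is what a model is fitted on, and the test-by-train block is what it predicts from. The random features below stand in for real Fisher kernel features. How the library performs this slicing internally is not shown in the documentation, so this is only a sketch of the general technique.

```python
import numpy as np

n_sessions = 6
train_indices = np.array([0, 1, 3, 4])
# As documented, all sessions not used for training are treated as test sessions
test_indices = np.setdiff1d(np.arange(n_sessions), train_indices)

rng = np.random.default_rng(1)
feat = rng.standard_normal((n_sessions, 8))  # stand-in features
kernel = feat @ feat.T                       # kernel over ALL sessions

# Block used for fitting: similarities among training sessions
K_train = kernel[np.ix_(train_indices, train_indices)]
# Block used for prediction: similarities of test sessions to training sessions
K_test = kernel[np.ix_(test_indices, train_indices)]

print(test_indices, K_train.shape, K_test.shape)
```

The (n_test, n_train) layout of `K_test` matches what scikit-learn estimators with a precomputed kernel expect at prediction time.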
- glhmm.prediction.test_pred(hmm, Y, indices, model_tuned, scaler_x, scaler_y=None, behav=None, train_indices=None, CinterceptY=None, CbetaY=None, predictor='Fisherkernel', estimator='KernelRidge', options=None)
Test prediction model from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to predict a phenotype, in a nested cross-validated way. The specified predictor and estimator must be the same as the ones used to train the model. By default, X and Y are standardised/centered unless deconfounding is used.
Note: When using a kernel method (e.g. Fisher kernel), Y must be the timeseries of both training and test set to construct the correct kernel, and indices of the training sessions (train_indices) must be provided. When using summary metrics, Y must be the timeseries of only the test set, and train_indices should be None. When using deconfounding, CinterceptY and CbetaY need to be specified.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data of test set
- indices : array-like of shape (n_test_sessions, 2) or (n_sessions, 2)
The start and end indices of each trial/session in the test data (when using features) or in the train and test data (when using kernel). Note that this function does not work if indices=None
- model_tuned : estimator
the trained and (if applicable) hyperparameter-optimised scikit-learn estimator
- scaler_x : estimator
the trained standard scaler/kernel centerer of the features/kernel x
- scaler_y : estimator (optional, only specify when not using deconfounding)
the trained standard scaler of the variable to be predicted y.
- behav : array-like of shape (n_test_sessions,) (optional)
phenotype, behaviour, or other external variable of test set, to be compared with the predicted values
- train_indices : array-like of shape (n_train_sessions,) (optional, only use when using kernel)
the indices of the sessions/subjects used for training. The function assumes that test indices are all other sessions.
- CinterceptY : float (optional, only specify when using deconfounding)
the estimated intercept for deconfounding
- CbetaY : array-like of shape (n_confounds,) (optional, only specify when using deconfounding)
the estimated beta weights for deconfounding of each confound
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘KernelRidge’)
Model to be used for prediction. This should be the name of a sklearn base estimator (for now either ‘KernelRidge’ or ‘Ridge’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘confounds’: array-like of shape (n_test_sessions,) or (n_test_sessions, n_confounds) containing confounding variables
    - ‘return_models’: whether to also return the model
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- results : dict
containing
    - ‘behav_pred’: predicted phenotype on test sets
if behav was specified:
    - ‘corr’: correlation coefficient between predicted and actual values
    - ‘scores’: the model scores of each fold
if requested:
    - ‘model’: the trained model
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
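The role of `CinterceptY` and `CbetaY` can be sketched as ordinary linear deconfounding: an intercept and beta weights are estimated on the training set, subtracted from the target, and added back to predictions on the test set. The functions below are hypothetical and use a plain least-squares fit. Whether the library estimates or applies these quantities in exactly this way is an assumption.

```python
import numpy as np


def fit_deconfounding(behav, confounds):
    """Least-squares intercept and beta weights of behav on the confounds."""
    behav = np.asarray(behav, dtype=float)
    confounds = np.asarray(confounds, dtype=float).reshape(len(behav), -1)
    design = np.column_stack([np.ones(len(behav)), confounds])
    coef, *_ = np.linalg.lstsq(design, behav, rcond=None)
    return coef[0], coef[1:]


def deconfound(behav, confounds, CinterceptY, CbetaY):
    """Remove the linear confound effect from behav."""
    behav = np.asarray(behav, dtype=float)
    confounds = np.asarray(confounds, dtype=float).reshape(len(behav), -1)
    return behav - (CinterceptY + confounds @ CbetaY)


def reconfound(behavD_pred, confounds, CinterceptY, CbetaY):
    """Map predictions made in deconfounded space back to the original scale."""
    behavD_pred = np.asarray(behavD_pred, dtype=float)
    confounds = np.asarray(confounds, dtype=float).reshape(len(behavD_pred), -1)
    return behavD_pred + CinterceptY + confounds @ CbetaY


age = np.array([20.0, 30.0, 40.0, 50.0])              # illustrative confound
behav = 2.0 + 0.1 * age + np.array([0.5, -0.5, -0.5, 0.5])

CinterceptY, CbetaY = fit_deconfounding(behav, age)
behavD = deconfound(behav, age, CinterceptY, CbetaY)
print(CinterceptY, CbetaY, behavD)
```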
- glhmm.prediction.train_classif(hmm, Y, behav, indices, predictor='Fisherkernel', estimator='SVM', options=None)
Train classification model from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to make a classification, in a nested cross-validated way. By default, X is standardised/centered. Note that all outputs need to be passed on to test_classif to ensure that training and test variables are preprocessed in the same way, while avoiding leakage between training and test set.
Estimators so far include: SVM and Logistic Regression.
Cross-validation strategies so far include: KFold and GroupKFold.
Hyperparameter optimization strategies so far include: grid search, no hyperparameter optimisation.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data of training set
- behav : array-like of shape (n_train_sessions,)
phenotype, behaviour, or other external labels of training set to be predicted
- indices : array-like of shape (n_train_sessions, 2)
The start and end indices of each trial/session in the training set. Note that this function does not work if indices=None
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘SVM’)
Model to be used for classification. This should be the name of a sklearn base estimator (for now either ‘SVM’ or ‘LogisticRegression’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘optim_hyperparam’: char, which hyperparameter optimisation strategy to use (default: ‘GridSearchCV’). If you don’t want to use hyperparameter optimisation, set this to None and specify the hyperparameter (alpha) as an option. When using hyperparameter optimisation, additional relevant options are:
        - ‘CVscheme’: char, which CV scheme to use (default: ‘GroupKFold’ if group structure is specified, otherwise ‘KFold’)
        - ‘nfolds’: int, number of folds k for (outer and inner) k-fold CV loops
        - ‘group_structure’: ndarray of shape (n_train_sessions, n_train_sessions), matrix specifying group structure: positive values if samples are related, zeros otherwise
    - possible hyperparameters for the model, e.g. ‘C’ for SVM
    - ‘return_prob’: bool, whether to also estimate the probabilities
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- model_tuned : estimator
the trained and (if applicable) hyperparameter-optimised scikit-learn estimator
- scaler_x : estimator
the trained standard scaler/kernel centerer of the features/kernel x
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
Notes:
If behav contains NaNs, these subjects/sessions will be removed
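A train/test workflow with the Fisher kernel might look like the sketch below. `subset_sessions` is a hypothetical helper that extracts the training sessions from the full timeseries and rebuilds their `indices`, treating the end index as exclusive (an assumption). `train_then_test` is also hypothetical: it relies only on the documented signatures and return values of `train_classif` and `test_classif`, and it needs glhmm and a trained HMM to run.

```python
import numpy as np


def subset_sessions(Y, indices, sessions):
    """Stack the rows of the chosen sessions and rebuild their indices."""
    blocks = [Y[indices[s, 0]:indices[s, 1]] for s in sessions]
    lengths = np.array([len(b) for b in blocks])
    ends = np.cumsum(lengths)
    new_indices = np.column_stack([ends - lengths, ends])
    return np.concatenate(blocks, axis=0), new_indices


def train_then_test(hmm, Y, behav, indices, train_indices):
    """Hypothetical wrapper; behav and indices are NumPy arrays."""
    from glhmm import prediction

    # Training uses only the training sessions' timeseries and labels
    Y_train, idx_train = subset_sessions(Y, indices, train_indices)
    model_tuned, scaler_x = prediction.train_classif(
        hmm, Y_train, behav[train_indices], idx_train)

    # Kernel-based testing needs the timeseries of ALL sessions plus train_indices
    test_sessions = np.setdiff1d(np.arange(len(indices)), train_indices)
    return prediction.test_classif(
        hmm, Y, indices, model_tuned, scaler_x,
        behav=behav[test_sessions], train_indices=train_indices)


# The helper can be checked without glhmm:
Y = np.arange(20).reshape(10, 2)
indices = np.array([[0, 3], [3, 7], [7, 10]])
Y_train, idx_train = subset_sessions(Y, indices, [0, 2])
print(Y_train.shape, idx_train)
```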
- glhmm.prediction.train_pred(hmm, Y, behav, indices, predictor='Fisherkernel', estimator='KernelRidge', options=None)
Train prediction model from HMM.
This uses either the Fisher kernel (default) or a set of HMM summary metrics to predict a phenotype, in a nested cross-validated way. By default, X and Y are standardised/centered unless deconfounding is used. Note that all outputs except behavD, i.e. model and scalers, need to be passed on to test_pred to ensure that training and test variables are preprocessed in the same way, while avoiding leakage between training and test set.
Estimators so far include: Kernel Ridge Regression and Ridge Regression.
Cross-validation strategies so far include: KFold and GroupKFold.
Hyperparameter optimization strategies so far include: grid search, no hyperparameter optimisation.
Parameters:
- hmm : HMM object
An instance of the HMM class, estimated on the group-level
- Y : array-like of shape (n_samples, n_variables_2)
(group-level) timeseries data of training set
- behav : array-like of shape (n_train_sessions,)
phenotype, behaviour, or other external variable of training set
- indices : array-like of shape (n_train_sessions, 2)
The start and end indices of each trial/session in the training data. Note that this function does not work if indices=None
- predictor : char (optional, default to ‘Fisherkernel’)
What to predict from, either ‘Fisherkernel’ or ‘summary_metrics’
- estimator : char (optional, default to ‘KernelRidge’)
Model to be used for prediction. This should be the name of a sklearn base estimator (for now either ‘KernelRidge’ or ‘Ridge’)
- options : dict (optional, default to None)
- general relevant options are:
    - ‘optim_hyperparam’: char, which hyperparameter optimisation strategy to use (default: ‘GridSearchCV’). If you don’t want to use hyperparameter optimisation, set this to None and specify the hyperparameter (alpha) as an option. When using hyperparameter optimisation, additional relevant options are:
        - ‘CVscheme’: char, which CV scheme to use (default: ‘GroupKFold’ if group structure is specified, otherwise ‘KFold’)
        - ‘nfolds’: int, number of folds k for (outer and inner) k-fold CV loops
        - ‘group_structure’: ndarray of shape (n_train_sessions, n_train_sessions), matrix specifying group structure: positive values if samples are related, zeros otherwise
    - ‘confounds’: array-like of shape (n_train_sessions,) or (n_train_sessions, n_confounds) containing confounding variables
    - possible hyperparameters for the model, e.g. ‘alpha’ for (kernel) ridge regression
- for Fisher kernel, relevant options are:
    - ‘shape’: char, either ‘linear’ or ‘Gaussian’ (TO DO)
    - ‘incl_Pi’: bool, whether to include the gradient w.r.t. the initial state probabilities when computing the Fisher kernel
    - ‘incl_P’: bool, whether to include the gradient w.r.t. the transition probabilities
    - ‘incl_Mu’: bool, whether to include the gradient w.r.t. the state means (note that this only works if means were not set to 0 when training HMM)
    - ‘incl_Sigma’: bool, whether to include the gradient w.r.t. the state covariances
- for summary metrics, relevant options are:
    - ‘metrics’: list of char, containing metrics to be included as features
Returns:
- model_tuned : estimator
the trained and (if applicable) hyperparameter-optimised scikit-learn estimator
- scaler_x : estimator
the trained standard scaler/kernel centerer of the features/kernel x
If not using deconfounding:
- scaler_y : estimator
the trained standard scaler of the variable to be predicted y.
If using deconfounding:
- CinterceptY : float
the estimated intercept for deconfounding
- CbetaY : array-like of shape (n_confounds,)
the estimated beta weights for deconfounding of each confound
- behavD : array-like of shape (n_train_sessions,)
the phenotype/behaviour in deconfounded space
Raises:
- Exception
If the hmm has not been trained or if necessary input is missing
Notes:
If behav contains NaNs, these subjects/sessions will be removed in Y and confounds
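The NaN handling mentioned in the note can be pictured as follows. The sketch drops every session whose `behav` value is NaN, removes the corresponding rows of `Y` and `confounds`, and rebuilds `indices` for the remaining sessions. `drop_nan_sessions` is a hypothetical helper written for illustration, with the end index treated as exclusive. It is not the routine the library uses internally.

```python
import numpy as np


def drop_nan_sessions(Y, behav, indices, confounds=None):
    """Remove sessions with NaN in behav from Y, behav, indices and confounds."""
    behav = np.asarray(behav, dtype=float)
    keep = ~np.isnan(behav)
    # keep only the timeseries rows of sessions with a valid behav value
    blocks = [Y[s:e] for (s, e), k in zip(indices, keep) if k]
    lengths = np.array([len(b) for b in blocks])
    ends = np.cumsum(lengths)
    new_indices = np.column_stack([ends - lengths, ends])
    conf = None if confounds is None else np.asarray(confounds)[keep]
    return np.concatenate(blocks, axis=0), behav[keep], new_indices, conf


Y = np.arange(12).reshape(6, 2)
indices = np.array([[0, 2], [2, 4], [4, 6]])
behav = [1.0, np.nan, 3.0]
confounds = np.array([10.0, 20.0, 30.0])

Y_clean, behav_clean, idx_clean, conf_clean = drop_nan_sessions(
    Y, behav, indices, confounds)
print(Y_clean.shape, behav_clean, idx_clean, conf_clean)
```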