SkyNet module

A Python wrapper for the SkyNet neural network code, which can be downloaded here: http://ccpforge.cse.rl.ac.uk/gf/project/skynet/

SkyNet is an efficient and robust neural network training code for machine learning. It is able to train large and deep feed-forward neural networks, including autoencoders, for use in a wide range of supervised and unsupervised learning applications, such as regression, classification, density estimation, clustering and dimensionality reduction. SkyNet is implemented in C/C++ and fully parallelised using MPI.

SkyNet is written by Philip Graff, Farhan Feroz, Michael P. Hobson and Anthony N. Lasenby.

Reference: http://xxx.lanl.gov/abs/1309.0790

class SkyNet.SkyNetClassifier(id, classification_network=True, input_root='./train_valid/', output_root='./network/', result_root='./predictions/', config_root='./config_files/', layers=(10, 10, 10), activation=(2, 2, 2, 0), prior=True, confidence_rate=0.3, confidence_rate_minimum=0.02, iteration_print_frequency=50, max_iter=2000, whitenin=True, whitenout=True, noise_scaling=0, set_whitened_noise=False, sigma=0.035, fix_seed=False, fixed_seed=0, calculate_evidence=True, historic_maxent=False, recurrent=False, convergence_function=4, validation_data=True, verbose=2, pretrain=False, nepoch=10, line_search=0, mini_batch_fraction=1.0, resume=False, norbias=False, reset_alpha=False, reset_sigma=False, randomise_weights=0.1, n_jobs=1)

Bases: SkyNet.SkyNet

A neural net classifier.

This class calls SkyNet as a classifier.

Parameters:


id : string, compulsory

A base id used as an identifier. All files written by SkyNet will contain id in the file name.

input_root : string, optional (default='./train_valid/')

The folder where the SkyNet wrapper will write, and SkyNet will look for, the training and validation files.

output_root : string, optional (default='./network/')

The folder where SkyNet will write the network file (i.e. the trained weights).

result_root : string, optional (default='./predictions/')

The folder where SkyNet will write prediction files.

config_root : string, optional (default='./config_files/')

The folder where SkyNet will write the config file that it uses to train.

layers : tuple, optional (default=(10,10,10))

The number of hidden layers and the number of nodes per hidden layer. Default is 3 hidden layers with 10 nodes in each layer.

activation : tuple, optional (default=(2,2,2,0))

Which activation function to use per layer: 0 = linear, 1 = sigmoid, 2 = tanh, 3 = rectified linear, 4 = softsign. Must have length len(layers) + 1, because the activation of the final (output) layer also needs to be set.
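The length rule for the activation tuple can be checked before a network is configured. The following is a minimal sketch, not part of the wrapper: check_activation and the ACTIVATIONS table are hypothetical names, and the formulas are the standard textbook definitions of each function, which may differ in detail from what SkyNet implements internally.

```python
import math

# Activation codes as listed in the parameter description.
# The formulas are the standard definitions (an assumption about SkyNet).
ACTIVATIONS = {
    0: lambda x: x,                            # linear
    1: lambda x: 1.0 / (1.0 + math.exp(-x)),   # sigmoid
    2: math.tanh,                              # tanh
    3: lambda x: max(0.0, x),                  # rectified linear
    4: lambda x: x / (1.0 + abs(x)),           # softsign
}


def check_activation(layers, activation):
    """Hypothetical helper: validate an activation tuple against layers."""
    # One activation per hidden layer, plus one for the output layer.
    if len(activation) != len(layers) + 1:
        raise ValueError(
            "activation needs %d entries, got %d"
            % (len(layers) + 1, len(activation))
        )
    unknown = [code for code in activation if code not in ACTIVATIONS]
    if unknown:
        raise ValueError("unknown activation codes: %r" % (unknown,))
    return True


# The documented defaults pass the check: three tanh hidden layers
# followed by a linear output layer.
check_activation((10, 10, 10), (2, 2, 2, 0))
```

Running such a check up front gives a clear Python error instead of a failure later in training.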

prior : boolean, optional (default =True)

Use L2 weight regularization. Strongly advised.

mini_batch_fraction : float, optional (default=1.0)

The fraction of the training data to use in each batch.

validation_data : bool, optional (default = True)

Is there validation data to test against? Strongly advised, to prevent overfitting.

confidence_rate : float, optional (default=0.3)

Initial learning rate (step-size factor); higher values are more aggressive.

confidence_rate_minimum : float, optional (default=0.02)

Minimum confidence rate allowed.

iteration_print_frequency : int, optional (default=50)

How often (in iterations) SkyNet prints feedback.

max_iter : int, optional (default=2000)

Maximum number of training epochs.

n_jobs : integer, optional (default=1)

The number of jobs to run in parallel for ‘fit’.

whitenin : integer, optional (default=1)

Which input transformation to use: 0 = none, 1 = min-max, 2 = normal.

whitenout : integer, optional (default=1)

Which output transformation to use: 0 = none, 1 = min-max, 2 = normal.
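To illustrate what the two whitening codes usually mean, here is a minimal sketch for a single column of data. It assumes that min-max rescales values to the range [0, 1] and that normal shifts to zero mean and scales to unit standard deviation. The function whiten is a hypothetical helper, and SkyNet's own conventions (target range, choice of standard deviation estimator) may differ.

```python
import statistics


def whiten(values, mode):
    """Illustrative whitening of one data column.

    mode follows the documented codes: 0 = none, 1 = min-max, 2 = normal.
    """
    if mode == 0:
        return list(values)
    if mode == 1:
        lo, hi = min(values), max(values)
        span = hi - lo
        if span == 0:
            # A constant column carries no information; map it to zeros.
            return [0.0 for _ in values]
        return [(v - lo) / span for v in values]
    if mode == 2:
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population standard deviation
        if std == 0:
            return [0.0 for _ in values]
        return [(v - mean) / std for v in values]
    raise ValueError("mode must be 0, 1 or 2")


column = [1.0, 2.0, 3.0]
scaled = whiten(column, 1)        # min-max: values now span [0, 1]
standardised = whiten(column, 2)  # normal: zero mean, unit spread
```

Either transformation puts inputs with very different scales on a comparable footing before training.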

convergence_function : integer, optional (default=4)

Which minimization function to use for convergence testing: 1 = log-posterior, 2 = log-likelihood, 3 = correlation, 4 = error squared.

historic_maxent : bool, optional (default=False)

Experimental implementation of MemSys’s historic maxent option.

line_search : int, optional (default = 0)

Perform line search for optimal distance: 0 = none, 1 = golden section, 2 = linbcg lnsrch.
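For readers unfamiliar with option 1, the sketch below shows a generic golden-section search that minimises a one-dimensional function on an interval. It is a textbook version written for illustration only; golden_section_minimize is a hypothetical name, and nothing here is taken from SkyNet's implementation. In a training context the function f would be the objective evaluated at a given distance along the current search direction.

```python
import math


def golden_section_minimize(f, a, b, tol=1e-6):
    """Return an approximate minimiser of a unimodal f on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1 / golden ratio, about 0.618
    # Two interior probe points, placed symmetrically in the interval.
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


# Example: a quadratic whose minimum is at a distance of 2.
step = golden_section_minimize(lambda t: (t - 2.0) ** 2, 0.0, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of f is needed per step.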

noise_scaling : bool, optional (default = False)

Whether the noise level (standard deviation of outputs) is to be estimated.

set_whitened_noise : bool, optional (default =False)

Whether the noise is to be set on whitened data.

sigma : float, optional (default = 0.035)

Initial noise level, set on (un-)whitened data.

fix_seed : bool, optional (default =False)

Use a fixed seed? Useful for debugging and unit tests.

fixed_seed : int, optional (default =0)

Seed to use if fix_seed == True.

resume : bool, optional (default = False)

Resume from a previous job.

reset_alpha : bool, optional (default = False)

Reset the hyperparameter alpha upon resume.

reset_sigma : bool, optional (default = False)

Reset the hyperparameter sigma upon resume.

randomise_weights : float, optional (default = 0.1)

Random factor to add to saved weights upon resume.

verbose : int, optional (default=2)

Verbosity level of feedback sent to stdout by SkyNet (0=min, 3=max).

pretrain : bool, optional (default=False)

Perform pre-training using restricted Boltzmann machines (RBMs).

nepoch : int, optional (default=10)

Number of epochs to use in pre-training.

See also

SkyNetRegressor

References

[R1]	SKYNET: an efficient and robust neural network training tool for machine learning in astronomy http://arxiv.org/abs/1309.0790

Attributes

n_features	(int) The number of features.
train_input_file	(string) Filename of the written training file.
valid_input_file	(string) Filename of the written validation file.
SkyNet_config_file	(string) Filename of SkyNet config file.
network_file	(string) Filename of SkyNet network file. This file contains the trained weights.

predict_proba(X)

Predict class probabilities for X.

The predicted class probabilities of an input sample are computed by the trained neural network.

Parameters:


X : array-like of shape = [n_samples, n_features]

The input samples.

Returns:

p : array of shape = [n_samples,n_classes]

The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
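Because the columns of the returned array follow the order of classes_, the most probable label for each sample is the class at the position of the row maximum. The sketch below shows this with plain Python lists. The probabilities and class names are made-up illustration data, not SkyNet output, and most_probable_class is a hypothetical helper.

```python
def most_probable_class(proba_rows, classes):
    """Map each row of class probabilities to its most probable label."""
    labels = []
    for row in proba_rows:
        # Index of the largest probability in this row.
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels


# Made-up example: two samples, three classes.
proba = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
]
classes = ["star", "galaxy", "quasar"]  # plays the role of classes_

labels = most_probable_class(proba, classes)
# labels == ["galaxy", "star"]
```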

Attributes

output_file

(String) SkyNet writes to this file: result_root + self.id + _predictions.txt.
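The file name is built by plain string concatenation, so result_root needs its trailing path separator. A minimal sketch of the documented pattern follows. The function name predictions_path and the id "run1" are hypothetical.

```python
def predictions_path(result_root, run_id):
    """Build the predictions file name following the documented pattern:
    result_root + id + "_predictions.txt"."""
    return result_root + run_id + "_predictions.txt"


# With the default result_root and a made-up id:
path = predictions_path("./predictions/", "run1")
# path == "./predictions/run1_predictions.txt"

# Without the trailing slash the id is glued onto the folder name:
glued = predictions_path("./predictions", "run1")
# glued == "./predictionsrun1_predictions.txt"
```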

class SkyNet.SkyNetRegressor(id, classification_network=False, input_root='./train_valid/', output_root='./network/', result_root='./predictions/', config_root='./config_files/', layers=(10, 10, 10), activation=(2, 2, 2, 0), prior=True, confidence_rate=0.3, confidence_rate_minimum=0.02, iteration_print_frequency=50, max_iter=2000, whitenin=True, whitenout=True, noise_scaling=0, set_whitened_noise=False, sigma=0.035, fix_seed=False, fixed_seed=0, calculate_evidence=True, historic_maxent=False, recurrent=False, convergence_function=4, validation_data=True, verbose=2, pretrain=False, nepoch=10, line_search=0, mini_batch_fraction=1.0, resume=False, norbias=False, reset_alpha=False, reset_sigma=False, randomise_weights=0.1, n_jobs=1)

Bases: SkyNet.SkyNet

A neural net regressor.

Parameters:


id : string, compulsory

A base id used as an identifier.

input_root : string, optional (default='./train_valid/')

The folder where the SkyNet wrapper will write, and SkyNet will look for, the training and validation files.

output_root : string, optional (default='./network/')

The folder where SkyNet will write the network file (i.e. the trained weights).

result_root : string, optional (default='./predictions/')

The folder where SkyNet will write prediction files.

config_root : string, optional (default='./config_files/')

The folder where SkyNet will write the config file that it uses to train. This parameter is best adjusted in SkyNet.py.

layers : tuple, optional (default=(10,10,10))

The number of hidden layers and the number of nodes per hidden layer. Default is 3 hidden layers with 10 nodes in each layer.

activation : tuple, optional (default=(2,2,2,0))

Which activation function to use per layer: 0 = linear, 1 = sigmoid, 2 = tanh, 3 = rectified linear, 4 = softsign. Must have length len(layers) + 1, because the activation of the final (output) layer also needs to be set.

prior : boolean, optional (default =True)

Use L2 weight regularization. Strongly advised.

mini_batch_fraction : float, optional (default=1.0)

The fraction of the training data to use in each batch.

validation_data : bool, optional (default = True)

Is there validation data to test against? Strongly advised, to prevent overfitting.

confidence_rate : float, optional (default=0.3)

Initial learning rate (step-size factor); higher values are more aggressive.

confidence_rate_minimum : float, optional (default=0.02)

Minimum confidence rate allowed.

iteration_print_frequency : int, optional (default=50)

How often (in iterations) SkyNet prints feedback.

max_iter : int, optional (default=2000)

Maximum number of training epochs.

n_jobs : integer, optional (default=1)

The number of jobs to run in parallel for ‘fit’.

whitenin : integer, optional (default=1)

Which input transformation to use: 0 = none, 1 = min-max, 2 = normal.

whitenout : integer, optional (default=1)

Which output transformation to use: 0 = none, 1 = min-max, 2 = normal.

convergence_function : integer, optional (default=4)

Which minimization function to use for convergence testing: 1 = log-posterior, 2 = log-likelihood, 3 = correlation, 4 = error squared.

historic_maxent : bool, optional (default=False)

Experimental implementation of MemSys’s historic maxent option.

line_search : int, optional (default = 0)

Perform line search for optimal distance: 0 = none, 1 = golden section, 2 = linbcg lnsrch.

noise_scaling : bool, optional (default = False)

Whether the noise level (standard deviation of outputs) is to be estimated.

set_whitened_noise : bool, optional (default =False)

Whether the noise is to be set on whitened data.

sigma : float, optional (default = 0.035)

Initial noise level, set on (un-)whitened data.

fix_seed : bool, optional (default =False)

Use a fixed seed? Useful for debugging and unit tests.

fixed_seed : int, optional (default =0)

Seed to use if fix_seed == True.

resume : bool, optional (default = False)

Resume from a previous job.

reset_alpha : bool, optional (default = False)

Reset the hyperparameter alpha upon resume.

reset_sigma : bool, optional (default = False)

Reset the hyperparameter sigma upon resume.

randomise_weights : float, optional (default = 0.1)

Random factor to add to saved weights upon resume.

verbose : int, optional (default=2)

Verbosity level of feedback sent to stdout by SkyNet (0=min, 3=max).

pretrain : bool, optional (default=False)

Perform pre-training using restricted Boltzmann machines (RBMs).

nepoch : int, optional (default=10)

Number of epochs to use in pre-training.