Parameters: | x - symbolic Tensor (or compatible) |
---|---|
Return type: | same as x |
Returns: | element-wise sigmoid: s(x) = \frac{1}{1 + e^{-x}} |
note: | See ultra_fast_sigmoid() or hard_sigmoid() for faster but less precise versions (speed was compared on 100M float64 elements on a Core2 Duo @ 3.16 GHz).
Precision: sigmoid (with or without amdlibm) > ultra_fast_sigmoid > hard_sigmoid. |
Example:
import theano.tensor as T

x, y, b = T.dvectors('x', 'y', 'b')
W = T.dmatrix('W')
y = T.nnet.sigmoid(T.dot(W, x) + b)
Note
The underlying code will return an exact 0 or 1 if an element of x is too small or too big.
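For instance, compiling sigmoid and evaluating it at extreme inputs shows this saturation (a minimal sketch; assumes NumPy and a working Theano install, and the variable names are illustrative only):
import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
f = theano.function([x], T.nnet.sigmoid(x))
print(f(np.array([-1000., 0., 1000.])))  # -> [ 0.   0.5  1. ]: exact 0 and 1 at the extremes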
Parameters: | x - symbolic Tensor (or compatible) |
---|---|
Return type: | same as x |
Returns: | approximated element-wise sigmoid: a faster approximation of s(x) = \frac{1}{1 + e^{-x}} |
note: | To automatically change all sigmoid() ops to this version, use the Theano optimization local_ultra_fast_sigmoid. This can be done with the Theano flag optimizer_including=local_ultra_fast_sigmoid. This optimization is done late, so it should not affect stabilization optimization. |
Note
The underlying code will return 0.00247262315663 as the minimum value and 0.997527376843 as the maximum value. So it never returns 0 or 1.
Note
Using ultra_fast_sigmoid directly in the graph disables the stabilization optimizations associated with sigmoid. Using the optimization to insert it instead does not disable them.
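A minimal sketch of the flag-based swap described above (the flag name comes from the note; the script name is a placeholder):
# THEANO_FLAGS='optimizer_including=local_ultra_fast_sigmoid' python my_script.py
import theano
import theano.tensor as T

x = T.dmatrix('x')
# The graph is still written with the regular sigmoid; with the flag set,
# the optimization replaces it with ultra_fast_sigmoid at compilation time.
f = theano.function([x], T.nnet.sigmoid(x))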
Parameters: | x - symbolic Tensor (or compatible) |
---|---|
Return type: | same as x |
Returns: | approximated element-wise sigmoid: a piecewise-linear, even faster approximation of s(x) = \frac{1}{1 + e^{-x}} |
note: | To automatically change all sigmoid() ops to this version, use the Theano optimization local_hard_sigmoid. This can be done with the Theano flag optimizer_including=local_hard_sigmoid. This optimization is done late, so it should not affect stabilization optimization. |
Note
The underlying code will return an exact 0 or 1 if an element of x is too small or too big.
Note
Using hard_sigmoid directly in the graph disables the stabilization optimizations associated with sigmoid. Using the optimization to insert it instead does not disable them.
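An illustrative sketch comparing the three variants applied directly (the saturation behaviour referred to in the comments is the one quoted in the notes above):
import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
f = theano.function([x], [T.nnet.sigmoid(x),
                          T.nnet.ultra_fast_sigmoid(x),
                          T.nnet.hard_sigmoid(x)])
exact, ultra, hard = f(np.array([-10., 0., 10.]))
# sigmoid tends toward (but does not reach) 0 and 1 here, ultra_fast_sigmoid
# stays strictly inside (0, 1), and hard_sigmoid clips to exact 0 and 1.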
Parameter: | x - symbolic Tensor (or compatible) |
---|---|
Return type: | same as x |
Returns: | element-wise softplus: softplus(x) = \log(1 + e^{x}) |
Note
The underlying code will return an exact 0 if an element of x is too small.
x,y,b = T.dvectors('x','y','b')
W = T.dmatrix('W')
y = T.nnet.softplus(T.dot(W,x) + b)
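A quick numeric sanity check of softplus(x) = \log(1 + e^{x}) and of the underflow to an exact 0 mentioned in the note (a sketch; assumes NumPy):
import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
f = theano.function([x], T.nnet.softplus(x))
print(f(np.array([-800., 0., 2.])))  # -> approximately [0., 0.6931, 2.1269]; the first entry is an exact 0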
Parameters: | x - symbolic 2D Tensor (or compatible). |
---|---|
Return type: | same as x |
Returns: | a symbolic 2D tensor whose (i, j)-th element is softmax_{ij}(x) = \frac{e^{x_{ij}}}{\sum_k e^{x_{ik}}} |
The softmax function will, when applied to a matrix, compute the softmax values row-wise.
note: this inserts a particular softmax op into the graph. This op does not yet implement the R-op needed for Hessian-free optimization. If you need that, write the equivalent expression exp(x) / exp(x).sum(axis=1, keepdims=True), which does have the R-op implemented; Theano should then optimize it by inserting the softmax op itself. The softmax op is more numerically stable because it uses the following code:
e_x = exp(x - x.max(axis=1, keepdims=True))
out = e_x / e_x.sum(axis=1, keepdims=True)
Example of use:
x,y,b = T.dvectors('x','y','b')
W = T.dmatrix('W')
y = T.nnet.softmax(T.dot(W,x) + b)
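A small sketch checking the row-wise behaviour on a compiled function (assumes NumPy; the input values are illustrative):
import numpy as np
import theano
import theano.tensor as T

m = T.dmatrix('m')
softmax_fn = theano.function([m], T.nnet.softmax(m))
out = softmax_fn(np.array([[1., 2., 3.],
                           [1., 1., 1.]]))
print(out.sum(axis=1))  # each row sums to 1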
Parameters: | output - symbolic Tensor (or compatible) of predicted values in (0, 1); target - symbolic Tensor (or compatible) of target values |
---|---|
Return type: | same as target |
Returns: | a symbolic tensor, where the following is applied element-wise: crossentropy(t, o) = -(t \cdot \log(o) + (1 - t) \cdot \log(1 - o)) |
The following block implements a simple auto-associator with a sigmoid nonlinearity and a reconstruction error which corresponds to the binary cross-entropy (note that this assumes that x will contain values between 0 and 1):
x, y, b, c = T.dvectors('x', 'y', 'b', 'c')
W, V = T.dmatrices('W', 'V')
h = T.nnet.sigmoid(T.dot(W, x) + b)
x_recons = T.nnet.sigmoid(T.dot(V, h) + c)
recon_cost = T.nnet.binary_crossentropy(x_recons, x).mean()
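A numeric sanity check of the element-wise formula above (a sketch; the values are illustrative):
import numpy as np
import theano
import theano.tensor as T

o, t = T.dvectors('o', 't')
bce = theano.function([o, t], T.nnet.binary_crossentropy(o, t))
print(bce(np.array([0.9, 0.2]), np.array([1., 0.])))
# -> approximately [-log(0.9), -log(0.8)] = [0.1054, 0.2231]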
Return the cross-entropy between an approximating distribution and a true distribution. The cross entropy between two probability distributions measures the average number of bits needed to identify an event from a set of possibilities, if a coding scheme is used based on a given probability distribution q, rather than the “true” distribution p. Mathematically, this function computes
H(p, q) = -\sum_x p(x) \log(q(x)), where p = true_dist and q = coding_dist.
Parameters:
- coding_dist - symbolic 2D Tensor (or compatible). Each row represents a distribution.
- true_dist - symbolic 2D Tensor OR symbolic vector of ints. In the case of an integer vector argument, each element represents the position of the ‘1’ in a 1-of-N encoding (aka “one-hot” encoding)
Return type: tensor of rank one less than that of coding_dist
Note
An application of the scenario where true_dist has a 1-of-N representation is classification with softmax outputs. If coding_dist is the output of the softmax and true_dist is the vector of correct labels, then the function computes y_i = - \log(coding_dist[i, true_dist[i]]), i.e. the negative log-probability of the correct class, which is the typical training criterion in classification settings.
x, b = T.dvectors('x', 'b')
W = T.dmatrix('W')
o = T.ivector('o')  # o is either the above-mentioned 1-of-N integer vector or a 2D tensor
y = T.nnet.softmax(T.dot(W, x) + b)
cost = T.nnet.categorical_crossentropy(y, o)
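A sketch of the 1-of-N case from the note above, checking that the result is the negative log-probability of the correct class (assumes NumPy; the values are illustrative):
import numpy as np
import theano
import theano.tensor as T

coding = T.dmatrix('coding')   # rows are probability distributions
labels = T.ivector('labels')   # integer class labels (1-of-N positions)
cce = theano.function([coding, labels],
                      T.nnet.categorical_crossentropy(coding, labels))
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
y = np.array([0, 1], dtype='int32')
print(cce(p, y))  # -> approximately [-log(0.7), -log(0.8)] = [0.3567, 0.2231]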