Wolfram Computation Meets Knowledge

0.00 Index

  1. A
  2. B
  3. C
  4. D
  5. E
  6. F
  7. G
  8. H
  9. I
  10. J
  11. K
  12. L
  13. M
  14. N
  15. O
  16. P
  17. Q
  18. R
  19. S
  20. T
  21. U
  22. V
  23. W
  24. X-Z

A

  • accuracy, 34, 38, 44–46
  • activation function, 272–273
  • active learning, 19
  • ad click prediction, 6
  • adaptive learning rate, 290–291
  • additive smoothing
    •     see Laplace smoothing, 226
  • aleatoric uncertainty
    •     see inherent uncertainty, 109
  • anomaly, 13, 97, 112, 142–143
  • anomaly detection, 13, 97, 142–143, 152–153, 171–172, 178–180
  • array, 193, 277
  • artificial intelligence, 5
  • artificial neural network
    •     see neural network, 271
  • attention layer, 351, 358
  • attention mechanism, 349–354
  • attribute
    •     see variable, 5
  • audio data
    •     see audio processing, 4
  • audio processing, 4, 11, 323–324
  • autoencoder, 154–158
  • automated machine learning, 108
  • automatic differentiation
    •     see backpropagation, 301
  • autoregressive model, 336
    •     see sequence modeling, 4
  • auxiliary task
    •     see pretext task, 22

B

C

D

E

F

  • F1 score, 46
  • facial recognition, 127–129, 192
  • fairness, 113
  • feature, 10
  • feature engineering
    •     see preprocessing, 115
  • feature extraction, 139
  • feature importance, 32, 36–37, 70
  • feature learning
    •     see feature extraction, 139
  • feature space, 145
  • feature type, 34, 70
  • feature vector, 10, 85
  • feature visualization, 318–321
  • feature-space plot, 13, 146–149
  • feed-forward neural network
    •     see multilayer perceptron, 274
  • finite difference
    •     see numerical differentiation, 294
  • fraud detection, 5
  • fully connected layer, 274–275
  • fully connected network
    •     see multilayer perceptron, 274

G

H

  • Hadamard matrix, 190
  • heavy-tailed distribution, 66, 186–187
  • heteroscedastic noise, 64
  • hidden layer, 275
  • hidden Markov model, 265
  • hinge loss, 243
  • homoscedastic noise, 64
  • hyperparameter, 103, 109
  • hyperparameter optimization, 104–108

I

  • iid assumption, 52, 96–97, 111
  • image captioning, 353
  • image colorization, 22
  • image conformation, 191
  • image data
    •     see computer vision, 3
  • image identification
    •     see classification, image, 1
  • image representation, 84–86, 306–307
  • image segmentation, 322–323
  • imbalanced data, 31
  • imputation, 14, 143–144, 153–154, 159–161, 172–177
  • in-distribution example, 96
  • in-distribution generalization, 96
  • inductive bias, 97
  • infinite-width neural network, 258
  • information retrieval, 149150
  • inherent uncertainty, 34, 109–111
  • instance-based models, 83, 86–88
  • integer encoding, 188
  • intercept
    •     see biases, 214
  • intra-attention
    •     see self-attention, 353
  • irreducible error
    •     see Bayes error, 111
  • irreducible uncertainty
    •     see inherent uncertainty, 109
  • irrelevant features, 97, 115
  • Isomap, 140

J

  • Jacobian matrix, 298

K

  • kernel, 244
    •     linear, 247
    •     polynomial, 247
    •     radial basis function (RBF), 244
  • k-fold cross-validation, 107
  • k-means method, 125–127, 177
  • k-nearest neighbors
    •     see nearest neighbors, 87, 223
  • Kneser–Ney smoothing, 263
  • knowledge distillation, 231

L

M

  • machine learning method, 1–3, 7
  • manifold, 139–145, 152–154
  • marginal likelihood, 258
  • Markov chain Monte Carlo (MCMC), 389–392
  • Markov model, 259–265
  • masked self-attention, 359–360
  • matrix multiplication, 275–276, 283
  • maximum a posteriori (MAP), 393
  • maximum likelihood estimation, 216
  • mean cross-entropy
    •     see negative log-likelihood (NLL), 49
  • mean squared error (MSE), 75, 91
  • measure uncertainty, 45, 106, 113–114
  • measurements, 21, 34, 37–38, 43–51, 72–78, 98, 106, 314
  • measures
    •     see measurements, 45
  • medical diagnosis, 5
  • memory-based models
    •     see instance-based models, 83
  • metrics
    •     see measurements, 45
  • Metropolis algorithm, 389–391
  • mini-batch
    •     see batch, 291
  • missing completely at random, 173
  • missing data synthesis
    •     see imputation, 14
  • missing values, 3031
  • MNIST dataset, 151
  • model, 2, 8183
  • model capacity, 100–103, 115–116
  • model deployment, 39–40
  • model evaluation
    •     see model inference, 11
  • model export
    •     see serialization, 39
  • model family, 83, 90
  • model inference, 11
  • model variance, 102
  • model-based reinforcement learning, 16
  • multilayer perceptron, 274–276
  • multimodal distribution, 64, 173–175
  • multinomial logistic regression
    •     see logistic regression, 217

N

O

  • object detection, 3, 11, 321–322
  • objective function
    •     see cost function, 91
  • one-hot encoding, 189
  • online learning, 18–19
  • optimization, 83, 92, 94
  • ordinal regression, 64–65
  • outlier
    •     see anomaly, 13, 97
  • out-of-core learning, 18
  • out-of-distribution example, 96, 112, 172
  • out-of-distribution generalization, 96–97, 348
  • overconfident model, 51
  • overfitting, 63, 97–103

P

Q

  • quadratic loss
    •     see squared error loss, 91
  • quantiles, 187–188

R

S

  • scaled exponential linear unit (SELU), 276
  • search engine
    •     see information retrieval, 149
  • self-attention, 353–360
  • self-normalizing network, 276
  • self-supervised learning, 22
  • self-training, 18, 159, 175
  • semantic distance, 129
  • semantic features, 21, 97, 196
  • semantic hashing, 150
  • semi-supervised learning, 17–18
  • sentiment analysis, 4
  • sequence generation
    •     see sequence modeling, 4
  • sequence modeling, 4, 259–265, 324–348, 352–367
  • sequence-to-sequence (seq2seq), 341–348, 360–367
  • serialization, 39
  • Shapley additive explanations, 32, 70
  • signal processing, 194195
  • similarity-based models
    •     see instance-based models, 83
  • simulation, 1516
  • skip connection, 317
  • smoothing
    •     see regularization, 103
  • social bias, 113
  • softmax function, 218
  • speech recognition, 11, 323–324
  • speech to text, 4
  • spurious features, 97, 115
  • squared error loss, 91
  • squeeze-and-excitation block, 317
  • standardization, 124–125, 184–185
  • steepest descent
    •     see gradient descent, 285
  • stochastic gradient descent, 291–293
  • stochastic process
    •     see random function, 251
  • stop words, 132
  • structured data, 5, 9–10, 14, 30, 61, 65, 68, 89, 98, 123, 148–149, 167, 175, 211
  • subword tokenization, 198
  • super-resolution imaging, 3
  • supervised learning, 911
  • support-vector machine, 242–247
  • symbolic differentiation, 295

T

  • tabular data
    •     see structured data, 30
  • teacher forcing, 336
  • tensor
    •     see array, 193
  • term frequency–inverse document frequency (tf–idf), 199–201
  • term-document matrix, 199
  • test set, 38, 98, 112
  • text data
    •     see natural language processing, 3
  • text generation
    •     see sequence modeling, 4
  • text normalization, 196
  • text search, 149150
  • text tokenization, 197–198
  • text translation, 3, 11, 360
  • training, 11
  • training round
    •     see epoch, 292
  • training set, 38, 98
  • transfer function
    •     see activation function, 272
  • transfer learning, 20–21, 41–45, 128–129, 195–196, 280, 334–335
  • transformer, 348367
  • t-SNE, 131

U

V

  • validation set, 41, 98, 106, 112
  • vanilla recurrent network, 327–329
  • vanishing and exploding gradient, 329
  • variable, 5
  • vector embedding, 189191
  • vector representation
    •     see latent features, 21
  • vision as inverse graphics, 397
  • Vision transformer (ViT), 358
  • vocabulary, 197

W

Copyright © 2022