Utils
- ttml.utils.add_infty_bins(thresholds, replace_biggest=False)
For a list of arrays of thresholds, add np.infty to the end of every array. If replace_biggest=True, replace the last element by np.infty instead.
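As an illustration of the two modes described above, here is a minimal NumPy sketch. It is not ttml's implementation: the name add_infty_bins_sketch is hypothetical, and np.inf is used in place of the np.infty alias.

```python
import numpy as np

def add_infty_bins_sketch(thresholds, replace_biggest=False):
    """Hypothetical stand-in that mimics the documented behavior."""
    out = []
    for t in thresholds:
        t = np.asarray(t, dtype=float)
        if replace_biggest:
            # Overwrite the last (largest) threshold; copy so the input is untouched.
            t = t.copy()
            t[-1] = np.inf
        else:
            # Append one extra, unbounded threshold.
            t = np.append(t, np.inf)
        out.append(t)
    return out

thresholds = [np.array([0.5, 1.5]), np.array([2.0])]
appended = add_infty_bins_sketch(thresholds)
replaced = add_infty_bins_sketch(thresholds, replace_biggest=True)
```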
- ttml.utils.convert_backend_cores(cores, backend)
Convert the backend of a list of cores to target backend.
- ttml.utils.dematricize(A, mode, shape)
Undo matricization of A with respect to mode. Needs the shape of the original tensor.
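The round trip can be sketched in NumPy as follows. This assumes one common unfolding convention (move axis mode to the front, then flatten the remaining axes in their original order); the convention ttml actually uses is not stated here, and both function names are hypothetical.

```python
import numpy as np

def matricize_sketch(T, mode):
    # Assumed convention: axis `mode` becomes the rows, all other axes the columns.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def dematricize_sketch(A, mode, shape):
    # Restore the remaining axes, then move the leading axis back to `mode`.
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(A.reshape([shape[mode]] + rest), 0, mode)

T = np.arange(24).reshape(2, 3, 4)
A = matricize_sketch(T, mode=1)
T_back = dematricize_sketch(A, mode=1, shape=T.shape)
```

The shape argument is needed because the matrix alone does not record how its columns were split across the original axes.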
- ttml.utils.merge_sum(idx, y, backend=None)
Merge the entries of y that share an identical entry in idx by summing them.
Returns the new indices and the merged y. Copied from a Stack Overflow answer by user perimosocordiae.
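One way to get this behavior in plain NumPy is sketched below. It assumes idx is a 2D array with one index tuple per row and returns the unique rows in sorted order; it is an illustration with a hypothetical name, not necessarily the approach or output order used by ttml.

```python
import numpy as np

def merge_sum_sketch(idx, y):
    idx = np.asarray(idx)
    y = np.asarray(y, dtype=float)
    # `inverse` maps every original row to the position of its unique row.
    unique_idx, inverse = np.unique(idx, axis=0, return_inverse=True)
    # Sum the y values that map to the same unique row.
    merged = np.bincount(inverse.ravel(), weights=y, minlength=len(unique_idx))
    return unique_idx, merged

idx = np.array([[0, 1], [2, 0], [0, 1], [2, 0], [1, 1]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
new_idx, new_y = merge_sum_sketch(idx, y)
```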
- ttml.utils.predict_logit(logits, random=False)
Turns logits into 0 / 1 predictions.
If random=True, sample each prediction from a Bernoulli distribution.
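A minimal sketch of both modes, assuming the deterministic rule thresholds the logit at 0 (probability 0.5) and that the Bernoulli probability is the sigmoid of the logit. The function name and the rng parameter are hypothetical and not part of the ttml signature.

```python
import numpy as np

def predict_logit_sketch(logits, random=False, rng=None):
    logits = np.asarray(logits, dtype=float)
    if random:
        rng = np.random.default_rng() if rng is None else rng
        # Sigmoid turns a logit into the probability of predicting 1.
        p = 1.0 / (1.0 + np.exp(-logits))
        return (rng.random(logits.shape) < p).astype(int)
    # Deterministic rule (assumed): positive logit means class 1.
    return (logits > 0).astype(int)

hard = predict_logit_sketch([-2.0, 0.5, 3.0])
sampled = predict_logit_sketch(np.zeros(1000), random=True,
                               rng=np.random.default_rng(0))
```

With all logits equal to 0, roughly half of the sampled predictions come out as 1.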
- ttml.utils.project_sorted(a, v)
Returns the closest entry in the sorted array a for each entry in v.
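Because a is sorted, the nearest entry can be found with a binary search instead of comparing against every element. The sketch below uses np.searchsorted; it assumes a has at least two entries and breaks ties toward the smaller neighbor, which may differ from ttml's choice. The function name is hypothetical.

```python
import numpy as np

def project_sorted_sketch(a, v):
    a = np.asarray(a)
    v = np.asarray(v)
    # Index of the first entry of `a` that is >= v, clipped so that both
    # neighbors (left and right) are valid positions.
    right = np.clip(np.searchsorted(a, v), 1, len(a) - 1)
    left = right - 1
    # Pick whichever neighbor is closer (ties go to the left neighbor).
    use_left = (v - a[left]) <= (a[right] - v)
    return np.where(use_left, a[left], a[right])

a = np.array([0, 1, 4, 10])
v = np.array([-3.0, 0.4, 2.6, 9.0, 50.0])
projected = project_sorted_sketch(a, v)
```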
- ttml.utils.random_idx(tt, N, backend=None)
Generate N random indices for the tensor train tt
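For intuition, an index into a tensor with dimensions dims is a tuple with one integer per mode. The sketch below draws N such tuples uniformly. It takes the dimensions directly rather than a tensor train object, since how tt exposes its dimensions is not documented here; the function name and the seed parameter are hypothetical.

```python
import numpy as np

def random_idx_sketch(dims, N, seed=None):
    rng = np.random.default_rng(seed)
    # Column k is drawn uniformly from range(dims[k]);
    # the upper bounds broadcast across the N rows.
    return rng.integers(0, dims, size=(N, len(dims)))

dims = (2, 3, 4)
idx = random_idx_sketch(dims, N=5, seed=0)
```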
- ttml.utils.random_normal(size, backend='numpy')
Generate float64 samples from the standard normal distribution, with the specified size.
TODO: This is unnecessary since the with_dtype wrapper update in autoray
- ttml.utils.thresholds_from_data(X, num_thresh, min_samples=5, strategy='quantile')
Bin each feature into at most num_thresh bins. Compresses the bins so that each bin contains at least min_samples samples.
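The idea can be sketched for the quantile strategy as follows. The compression rule used here (greedily dropping a candidate threshold when its bin would hold fewer than min_samples samples, and folding an under-filled last bin into its neighbor) is an assumption made for illustration; ttml's actual rule may differ. The function name is hypothetical and the strategy argument is omitted.

```python
import numpy as np

def thresholds_from_data_sketch(X, num_thresh, min_samples=5):
    X = np.asarray(X, dtype=float)
    quantiles = np.linspace(0, 1, num_thresh + 1)[1:]
    thresholds = []
    for col in X.T:
        # Candidate upper bin edges; duplicates collapse for discrete features.
        candidates = np.unique(np.quantile(col, quantiles))
        kept, lower = [], -np.inf
        for t in candidates:
            # Number of samples in the bin (lower, t].
            if np.sum((col > lower) & (col <= t)) >= min_samples:
                kept.append(t)
                lower = t
        # Make sure the last bin still reaches the largest value.
        if not kept:
            kept = [candidates[-1]]
        elif kept[-1] < candidates[-1]:
            kept[-1] = candidates[-1]
        thresholds.append(np.array(kept))
    return thresholds

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
th = thresholds_from_data_sketch(X, num_thresh=10)

# A feature with only 3 samples at the value 1: that bin is merged away.
col = np.array([0.0] * 50 + [1.0] * 3 + [2.0] * 50)
th_small = thresholds_from_data_sketch(col[:, None], num_thresh=4)
```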
- ttml.utils.trim_ranks(dims, ranks)
Return the TT-rank to which the TT can be exactly reduced.
A TT-rank can never be more than the product of the dimensions on the left or right of the rank. Furthermore, any internal edge in the TT cannot have rank higher than the product of any two connected supercores. Ranks are iteratively reduced for each edge until both requirements are satisfied everywhere.
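The iteration can be sketched as below. It assumes dims holds the d mode sizes, ranks holds the d-1 internal ranks, and the boundary ranks are 1. Each rank is capped by the unfolding sizes of its two neighboring cores (left rank times left dimension, and right dimension times right rank); repeating this until nothing changes also enforces the bound by the products of dimensions. The function name is hypothetical and this is not ttml's code.

```python
def trim_ranks_sketch(dims, ranks):
    # padded[k] is the rank on the edge between core k-1 and core k;
    # the two boundary ranks are fixed to 1.
    padded = [1] + list(ranks) + [1]
    changed = True
    while changed:
        changed = False
        for k in range(1, len(padded) - 1):
            bound = min(padded[k - 1] * dims[k - 1],  # rows of left core's unfolding
                        dims[k] * padded[k + 1])      # columns of right core's unfolding
            if padded[k] > bound:
                padded[k] = bound
                changed = True
    return tuple(padded[1:-1])

trimmed = trim_ranks_sketch((2, 3, 4, 2), (10, 10, 10))
```

In this example the middle rank is limited to 6 by the dimensions on its left (2 × 3), and the outer ranks to 2 by the first and last dimensions.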
- ttml.utils.univariate_kmeans(X, n_clusters, prune_factor=10)
Use k-means to find a set of n_clusters points minimizing the average minimum distance of X to the set.
prune_factor determines the minimum number of points each centroid should be associated with. A prune_factor of 10 means that centroids whose cluster size is less than 1/10 of the average cluster size are ignored.
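To make the pruning rule concrete, here is a small one-dimensional Lloyd-style k-means sketch. The quantile initialization, the n_iter parameter, and the function name are assumptions made for this illustration; ttml may rely on a different k-means routine internally.

```python
import numpy as np

def univariate_kmeans_sketch(X, n_clusters, prune_factor=10, n_iter=50):
    X = np.sort(np.asarray(X, dtype=float).ravel())
    # Assumed initialization: evenly spaced quantiles of the data.
    centroids = np.quantile(X, (np.arange(n_clusters) + 0.5) / n_clusters)

    def assign(c):
        # Nearest centroid for every point, and the resulting cluster sizes.
        labels = np.argmin(np.abs(X[:, None] - c[None, :]), axis=1)
        return labels, np.bincount(labels, minlength=n_clusters)

    for _ in range(n_iter):
        labels, counts = assign(centroids)
        sums = np.bincount(labels, weights=X, minlength=n_clusters)
        # Empty clusters keep their previous centroid.
        new = np.where(counts > 0, sums / np.maximum(counts, 1), centroids)
        if np.allclose(new, centroids):
            break
        centroids = new

    _, counts = assign(centroids)
    # Drop centroids whose cluster is smaller than (average size) / prune_factor.
    keep = counts >= counts.mean() / prune_factor
    return centroids[keep]

# 100 points at 0, a single outlier at 5, and 100 points at 10.
X = np.array([0.0] * 100 + [5.0] + [10.0] * 100)
centers = univariate_kmeans_sketch(X, n_clusters=3)
```

The outlier gets its own centroid, but its cluster of size 1 falls below one tenth of the average cluster size (67), so that centroid is pruned.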