Utils

ttml.utils.add_infty_bins(thresholds, replace_biggest=False): For a list of arrays of thresholds, append np.infty to the end of every array. If replace_biggest=True, replace the last element with np.infty instead.
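The behaviour described above can be sketched in plain NumPy. This is an illustrative re-implementation, not the library's code: the name add_infty_bins_sketch is hypothetical, and whether the real function copies or mutates its inputs is an assumption.

```python
import numpy as np


def add_infty_bins_sketch(thresholds, replace_biggest=False):
    """Hypothetical stand-in for ttml.utils.add_infty_bins (illustration only)."""
    out = []
    for t in thresholds:
        t = np.asarray(t, dtype=float)
        if replace_biggest:
            # Overwrite the last threshold on a copy (copying is an assumption).
            t = t.copy()
            t[-1] = np.inf
        else:
            # Append one extra, unbounded bin edge.
            t = np.append(t, np.inf)
        out.append(t)
    return out


# np.inf is used because the np.infty alias was removed in NumPy 2.0.
thresholds = [np.array([0.5, 1.5]), np.array([10.0, 20.0, 30.0])]
appended = add_infty_bins_sketch(thresholds)
replaced = add_infty_bins_sketch(thresholds, replace_biggest=True)
```

With replace_biggest=False every array grows by one entry; with replace_biggest=True the lengths are unchanged.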

ttml.utils.convert_backend(A, backend): Convert the tensor A to the target backend.

ttml.utils.convert_backend_cores(cores, backend): Convert a list of cores to the target backend.

ttml.utils.dematricize(A, mode, shape): Undo the matricization of A with respect to mode. Requires the shape of the original tensor.

ttml.utils.matricize(A, mode): Matricize the tensor A with respect to mode.
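A minimal sketch of matricization and its inverse, assuming the common convention that the chosen mode becomes the rows and all remaining modes are flattened into the columns. The function names are hypothetical, and the ordering convention used by ttml itself may differ.

```python
import numpy as np


def matricize_sketch(A, mode):
    # Move `mode` to the front, then flatten every other axis into columns.
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)


def dematricize_sketch(M, mode, shape):
    # Rebuild the tensor with `mode` in front, then move it back into place.
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape([shape[mode]] + rest), 0, mode)


A = np.arange(24).reshape(2, 3, 4)
M = matricize_sketch(A, mode=1)                    # shape (3, 8)
B = dematricize_sketch(M, mode=1, shape=A.shape)   # round trip back to A
```

The matricized array alone does not determine the sizes of the flattened modes, which is why the inverse needs the original shape.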

ttml.utils.merge_sum(idx, y, backend=None)

Merge the entries of y that share the same entry in idx, summing the merged values.

Returns the new indices and the merged y. Adapted from a Stack Overflow answer by user perimosocordiae.
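One way to express this merge-and-sum pattern in NumPy is sketched below. It assumes a one-dimensional idx and ignores the backend argument; the name merge_sum_sketch is hypothetical and the library's own implementation may be structured differently.

```python
import numpy as np


def merge_sum_sketch(idx, y):
    # `inverse` maps every original position to its slot among the unique indices.
    uniq, inverse = np.unique(idx, return_inverse=True)
    # Sum all y values that landed in the same slot.
    merged = np.bincount(inverse, weights=y, minlength=len(uniq))
    return uniq, merged


idx = np.array([3, 1, 3, 2, 1])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
new_idx, new_y = merge_sum_sketch(idx, y)
# new_idx is [1, 2, 3]; new_y is [7.0, 4.0, 4.0]
```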

ttml.utils.predict_logit(logits, random=False)

Turns logits into 0 / 1 predictions.

If random=True, sample the predictions from a Bernoulli distribution instead.
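The two modes can be illustrated as follows. This sketch assumes that the deterministic prediction thresholds the logit at 0 and that the Bernoulli success probability is the sigmoid of the logit; neither detail is stated above. The rng parameter and the function name are hypothetical additions for reproducibility.

```python
import numpy as np


def predict_logit_sketch(logits, random=False, rng=None):
    logits = np.asarray(logits, dtype=float)
    if not random:
        # Assumed rule: a positive logit means class 1.
        return (logits > 0).astype(int)
    rng = np.random.default_rng() if rng is None else rng
    # Assumed Bernoulli parameter: sigmoid of the logit.
    prob = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(logits.shape) < prob).astype(int)


logits = np.array([-2.0, 0.5, 3.0])
hard = predict_logit_sketch(logits)  # [0, 1, 1]
sampled = predict_logit_sketch(logits, random=True, rng=np.random.default_rng(0))
```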

ttml.utils.project_sorted(a, v): Returns the closest entry in the sorted array a for each entry in v.
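A nearest-entry lookup of this kind is commonly built on a binary search. The sketch below uses np.searchsorted and assumes that a has at least two entries; how ties are broken (here, toward the smaller entry) is also an assumption, and project_sorted_sketch is a hypothetical name.

```python
import numpy as np


def project_sorted_sketch(a, v):
    a = np.asarray(a)
    v = np.asarray(v)
    # Index of the first entry of `a` that is >= v, clipped so both neighbours exist.
    right = np.clip(np.searchsorted(a, v), 1, len(a) - 1)
    left = right - 1
    # Choose whichever neighbour is closer (ties go to the left neighbour).
    pick_left = (v - a[left]) <= (a[right] - v)
    return np.where(pick_left, a[left], a[right])


a = np.array([0.0, 1.0, 4.0, 10.0])
v = np.array([-3.0, 0.4, 2.6, 8.0, 50.0])
closest = project_sorted_sketch(a, v)
# closest is [0.0, 0.0, 4.0, 10.0, 10.0]
```

Values outside the range of a are mapped to the first or last entry.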

ttml.utils.random_idx(tt, N, backend=None): Generate N random indices for the tensor train tt.

ttml.utils.random_normal(size, backend='numpy')

Generate a float64 array of standard normal samples of the specified size.

TODO: This is unnecessary since the with_dtype wrapper update in autoray

ttml.utils.thresholds_from_data(X, num_thresh, min_samples=5, strategy='quantile'): Bin each feature into at most num_thresh bins. Bins are compressed so that each bin contains at least min_samples samples.
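The quantile strategy alone can be sketched as below. This is a simplified illustration under stated assumptions: thresholds are taken as the upper edges of equal-probability bins, duplicates are dropped, and the min_samples compression step is deliberately left out. The name quantile_thresholds_sketch is hypothetical.

```python
import numpy as np


def quantile_thresholds_sketch(X, num_thresh):
    # Upper edges of num_thresh equal-probability bins, per feature.
    qs = np.linspace(0, 1, num_thresh + 1)[1:]
    thresholds = []
    for j in range(X.shape[1]):
        # np.unique drops repeated edges, so a feature may get fewer bins.
        thresholds.append(np.unique(np.quantile(X[:, j], qs)))
    return thresholds


X = np.column_stack([np.arange(100.0), np.repeat([0.0, 1.0], 50)])
thresholds = quantile_thresholds_sketch(X, num_thresh=4)
# Feature 0 gets 4 thresholds; the binary feature 1 gets fewer.
```

This shows why the number of bins is "at most" num_thresh: features with few distinct values produce repeated quantiles.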

ttml.utils.trim_ranks(dims, ranks)

Return the TT-rank to which a TT can be exactly reduced.

A TT-rank can never exceed the product of the dimensions to the left or to the right of the rank. Furthermore, no internal edge in the TT can have a rank higher than the product of any two connected supercores. Ranks are reduced iteratively for each edge until both requirements are satisfied.
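The iteration can be sketched in pure Python. This sketch interprets the second requirement as: for a core of shape (r_left, d, r_right), r_right cannot exceed r_left * d and r_left cannot exceed d * r_right. That reading, and the name trim_ranks_sketch, are assumptions rather than the library's definition.

```python
def trim_ranks_sketch(dims, ranks):
    # Pad with the boundary ranks, which are always 1.
    full = [1] + list(ranks) + [1]
    changed = True
    while changed:
        changed = False
        for k, d in enumerate(dims):
            # Core k has shape (full[k], d, full[k + 1]).
            if full[k + 1] > full[k] * d:
                full[k + 1] = full[k] * d
                changed = True
            if full[k] > d * full[k + 1]:
                full[k] = d * full[k + 1]
                changed = True
    return tuple(full[1:-1])


trimmed_a = trim_ranks_sketch((2, 3, 4), (10, 10))       # (2, 4)
trimmed_b = trim_ranks_sketch((4, 2, 2, 4), (3, 10, 3))  # (3, 6, 3)
```

In the second call the middle rank is capped at 6 by its neighbouring rank of 3, even though the product of the dimensions on either side is 8.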

ttml.utils.univariate_kmeans(X, n_clusters, prune_factor=10)

Use k-means to find a set of n_clusters points minimizing the average minimum distance of X to the set.

prune_factor determines the minimum number of points that must be associated with each centroid. A prune_factor of 10 means that centroids whose cluster size is less than 1/10 of the average cluster size are ignored.
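The pruning rule by itself can be illustrated as follows. The sketch covers only the size test, not the k-means step; it assumes the average is taken over all n_clusters clusters, and keep_mask_sketch is a hypothetical helper.

```python
import numpy as np


def keep_mask_sketch(labels, n_clusters, prune_factor=10):
    # Number of points assigned to each centroid.
    sizes = np.bincount(labels, minlength=n_clusters)
    # Centroids below this size are ignored.
    min_size = sizes.mean() / prune_factor
    return sizes >= min_size


# Cluster sizes 50, 45, 2 and 3: the average is 25, so the cutoff is 2.5.
labels = np.repeat([0, 1, 2, 3], [50, 45, 2, 3])
keep = keep_mask_sketch(labels, n_clusters=4)
# keep is [True, True, False, True]
```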