corpora_best_match.Rd
Returns a tibble giving the distance between the reference document and each corpus in a vector of corpora. The tibble is sorted by increasing distance, so the closest match comes first.
corpora_best_match(refDoc, corpora, metric = "cosine_similarity", model_name = "cb_ns_500_10")
Argument | Description |
---|---|
refDoc | character vector for reference document |
corpora | character vector for corpora |
metric | character vector for metric used to calculate distance, optional (default: "cosine_similarity") |
model_name | character vector, optional (default: "cb_ns_500_10") |
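A tibble with one row per corpus and two columns: corpora (the corpus text) and metric (its distance from the reference document).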
coRPysprofiling::corpora_best_match("kitten meows", c("ice cream is yummy", "cat meowed", "dog barks", "The Hitchhiker's Guide to the Galaxy has become an international multi-media phenomenon"))
#> # A tibble: 4 x 2
#>   corpora                                                                 metric
#>   <chr>                                                                    <dbl>
#> 1 cat meowed                                                               0.344
#> 2 dog barks                                                                0.404
#> 3 ice cream is yummy                                                       0.835
#> 4 The Hitchhiker's Guide to the Galaxy has become an international multi~  1.18
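
To illustrate the ranking idea on its own, here is a minimal base R sketch that does not call coRPysprofiling. It assumes the distance is computed as 1 minus the cosine similarity of two numeric vectors; that is a guess (suggested by the default metric name and by a value above 1 in the output above), not something this page confirms. The helper cosine_distance and the three-dimensional vectors are made up for the illustration and stand in for whatever document embeddings the package derives from model_name.

```r
# Hypothetical helper, not part of coRPysprofiling:
# cosine distance = 1 - cosine similarity, ranging from 0 to 2.
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Made-up "embeddings"; real ones would come from a pretrained model.
ref <- c(1, 0, 0)
corpora_vecs <- list(
  same_direction = c(2, 0, 0),
  orthogonal     = c(0, 1, 0),
  opposite       = c(-1, 0, 0)
)

# Distance from the reference vector to each corpus vector.
distances <- vapply(corpora_vecs, cosine_distance, numeric(1), b = ref)

# Collect the results and sort by increasing distance (closest first).
ranked <- data.frame(
  corpora = names(distances),
  metric = unname(distances),
  stringsAsFactors = FALSE
)
ranked <- ranked[order(ranked$metric), ]
print(ranked)
```

Under this assumption, vectors pointing the same way have distance 0, orthogonal vectors have distance 1, and opposite vectors have distance 2, which is why a value such as 1.18 would be possible.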