Converts the last column of a data frame to a bag of words and returns it along with the other columns of the data frame.

bow(df)

Arguments

df

a data frame whose last column contains raw text

Value

a data frame whose first n-1 columns are the first n-1 columns of the input data frame, followed by the bag-of-words columns built from the input's last column: one column per word, holding the number of times that word occurs in each row's text.
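As a rough illustration of the transformation described above, the following base R sketch builds word-count columns from the last column of a data frame. It is not the implementation of bow(): the name bow_sketch, the tokenization rule (lowercase, split on non-letters), and the short stopword list are all assumptions made for this example, so its vocabulary can differ from what bow() returns.

```r
# Hypothetical helper for illustration only; not the package's bow().
# The default stopword list is a small assumed sample, not the one bow() uses.
bow_sketch <- function(df, stopwords = c("a", "an", "the", "of", "to", "in", "and")) {
  stopifnot(is.data.frame(df), ncol(df) >= 1)
  text <- as.character(df[[ncol(df)]])

  # Lowercase each text, then split on any run of non-letter characters.
  tokens <- strsplit(tolower(text), "[^a-z]+")
  # Drop empty strings and stopwords from each row's tokens.
  tokens <- lapply(tokens, function(w) w[nzchar(w) & !(w %in% stopwords)])

  # Vocabulary: every distinct remaining word, in alphabetical order.
  vocab <- sort(unique(unlist(tokens)))

  # One row per input row, one column per vocabulary word, cells are counts.
  counts <- matrix(0L, nrow = nrow(df), ncol = length(vocab),
                   dimnames = list(NULL, vocab))
  for (i in seq_along(tokens)) {
    tab <- table(tokens[[i]])
    if (length(tab) > 0) {
      counts[i, names(tab)] <- as.integer(tab)
    }
  }

  # First n-1 input columns, followed by the word-count columns.
  cbind(df[-ncol(df)], as.data.frame(counts))
}

# Usage with a small made-up data frame.
demo <- data.frame(id = c("a", "b"),
                   text = c("Red fish, blue fish", "The red door"),
                   stringsAsFactors = FALSE)
bow_sketch(demo)
```

The sketch follows the Value description literally and returns only the first n-1 input columns ahead of the counts.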

Examples

df <- data.frame(
    url = c("https://www.cnn.com/world",
            "https://www.foxnews.com/world",
            "https://www.cbc.ca/news/world"),
    url_id = c("cnn1","foxnews1","cbc1"),
    text = c("Instagram has a faster chance of reaching me than CNN",
             "I would appear on Fox News more easily than I would NPR.",
             "CBC has a very important mandate to bind Canada together
             in both official languages, tell local stories, and make
             sure we have a sense of our strength, our culture, our stories."))

bow(df)
#>                             url   url_id
#> 1     https://www.cnn.com/world     cnn1
#> 2 https://www.foxnews.com/world foxnews1
#> 3 https://www.cbc.ca/news/world     cbc1
#>                                                                                                                                                                                                            text
#> 1                                                                                                                                                         Instagram has a faster chance of reaching me than CNN
#> 2                                                                                                                                                      I would appear on Fox News more easily than I would NPR.
#> 3 CBC has a very important mandate to bind Canada together\n             in both official languages, tell local stories, and make\n             sure we have a sense of our strength, our culture, our stories.
#>   appear bind canada cbc chance cnn culture easily faster fox important
#> 1      0    0      0   0      1   1       0      0      1   0         0
#> 2      1    0      0   0      0   0       0      1      0   1         0
#> 3      0    1      1   1      0   0       1      0      0   0         1
#>   instagram languages local make mandate news npr official reaching sense
#> 1         1         0     0    0       0    0   0        0        1     0
#> 2         0         0     0    0       0    1   1        0        0     0
#> 3         0         1     1    1       1    0   0        1        0     1
#>   stories strength sure tell together
#> 1       0        0    0    0        0
#> 2       0        0    0    0        0
#> 3       2        1    1    1        1
# Abridged layout of the result above (middle word columns elided with ...):
#
#  -------------------------------  ----------  ------------------------------
#   url                              url_id      text
#  -------------------------------  ----------  ------------------------------
#   https://www.cnn.com/world        cnn1        Instagram has a faster ...
#   https://www.foxnews.com/world    foxnews1    I would appear on Fox ...
#   https://www.cbc.ca/news/world    cbc1        CBC has a very important ...
#
#  -------- ------ -------     ------ ------ ----------
#   appear   bind   canada  ...  sure   tell   together
#  -------- ------ -------     ------ ------ ----------
#   0        0      0       ...  0      0      0
#   1        0      0       ...  0      0      0
#   0        1      1       ...  1      1      1