How To Save Words In A Csv File Tokenized From Articles With Sentence Id Number?
I am trying to extract all words from articles stored in a CSV file and write each sentence ID number, together with the words that sentence contains, to a new CSV file. What I have tried so far: import pandas as pd
Solution 1:
You just need to iterate through the words and write a new line for each one.
The result will be a bit unpredictable because some of your "words" are commas: you might want to consider another delimiter, or strip the commas from your word list.
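EDIT: This seems like a slightly cleaner way to do it.
import pandas as pd
from nltk.tokenize import sent_tokenize, word_tokenize

df = pd.read_csv(r"D:\data.csv", nrows=10)
row = 0  # index of the article to process (assumed here; loop over the rows to handle every article)
# Split the article into sentences
sentences = sent_tokenize(df['articles'][row])

# newline='' stops Python from translating the explicit '\r\n' written below
with open('output.csv', 'w', newline='') as f:
    stcNum = 1
    for stc in sentences:
        # Split each sentence into words; only the first word carries the sentence number
        for i, word in enumerate(word_tokenize(stc)):
            prntLine = ','
            if i == 0:
                prntLine = str(stcNum) + prntLine
            prntLine = prntLine + word + '\r\n'
            f.write(prntLine)
        stcNum += 1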
output.csv:
1,The
,ultimate
,productivity
,hack
,is
,saying
,no
,.
2,Not
,doing
,something
,will
,always
,be
,faster
,than
,doing
,it
,.
3,This
,statement
,reminds
,me
,of
,the
,old
,computer
,programming
,saying
,, # <<< Most CSV parsers will see this as 3 empty columns
,“
,Remember
,that
,there
,is
,no
,code
,faster
,than
,no
,code
,.
,”
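One way around the comma problem noted above is to let Python's standard csv module do the quoting instead of building each line by hand. The following is only a minimal sketch, not part of the original answer: the function name write_sentence_words and the sample data are made up for illustration, and the input is assumed to be a list of already tokenized sentences (for example, the result of calling word_tokenize on each sentence).

```python
import csv
import io


def write_sentence_words(tokenized_sentences, out_file):
    """Write one (sentence id, word) row per token; ids start at 1."""
    # csv.writer quotes any field that contains a comma or a quote character
    writer = csv.writer(out_file)
    for stc_num, words in enumerate(tokenized_sentences, start=1):
        for i, word in enumerate(words):
            # Put the sentence id only on the first word, as in the output above
            writer.writerow([stc_num if i == 0 else '', word])


# Hypothetical sample standing in for [word_tokenize(s) for s in sentences]
sample = [['No', 'code', '.'], ['old', 'saying', ',', 'Remember']]
buffer = io.StringIO()
write_sentence_words(sample, buffer)
print(buffer.getvalue())
```

With this approach the comma token is written as a quoted field, so a CSV parser reads it back as a single word rather than as extra empty columns. When writing to a real file instead of a StringIO buffer, open it with newline='' so the csv module controls the line endings.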