regex - How to get sentences from a paragraph with a custom list of words in Python


I am trying to read a paragraph and capture the sentences that contain words matching a dynamic list of words.

The Python pre-processing steps identify a list of words. I want to use this list to identify the sentences in the paragraph that contain at least one of the words from the list. The identified sentences should be appended to a new variable.

Input: "Machine learning is the science of getting computers to act without being explicitly programmed. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI."

List of words: computer, researcher

Output: "Machine learning is the science of getting computers to act without being explicitly programmed. Many researchers also think it is the best way to make progress towards human-level AI."

What is the best way to accomplish this?

Based partially on this answer:

    import nltk

    # Load the pre-trained Punkt sentence tokenizer for English.
    tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

    text = "Machine learning is the science of getting computers to act without being explicitly programmed. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it. Many researchers also think it is the best way to make progress towards human-level AI."
    word_list = ['computer', 'researcher']
    output_list = []

    for sentence in tokenizer.tokenize(text):
        for word in word_list:
            # Substring test, so 'computer' also matches 'computers'.
            if word in sentence:
                output_list.append(sentence)
                break  # add each sentence only once; also useful when word_list is large

You need to run nltk.download() beforehand and download punkt in the Models tab.
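Since the question is tagged regex, here is a minimal sketch of the same idea using only the standard library, in case installing NLTK is not an option. It assumes that sentences end in ".", "!" or "?" followed by whitespace, which is a much cruder rule than the Punkt tokenizer and will mis-split on abbreviations such as "e.g. this". The function name sentences_with_words is hypothetical, chosen for illustration only.

```python
import re


def sentences_with_words(text, words):
    """Return the sentences in `text` that contain at least one entry of `words`.

    Assumption: a sentence ends with '.', '!' or '?' followed by whitespace.
    """
    if not words:
        return []
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    # \b anchors each word at a word start; there is no trailing \b,
    # so 'computer' also matches 'computers'. Matching is case-insensitive.
    alternatives = '|'.join(re.escape(word) for word in words)
    pattern = re.compile(r'\b(?:' + alternatives + r')', re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]


text = (
    "Machine learning is the science of getting computers to act without "
    "being explicitly programmed. Machine learning is so pervasive today "
    "that you probably use it dozens of times a day without knowing it. "
    "Many researchers also think it is the best way to make progress "
    "towards human-level AI."
)
output_list = sentences_with_words(text, ['computer', 'researcher'])
print(' '.join(output_list))
```

Compiling all the words into one alternation means each sentence is scanned once regardless of how long the word list is, which plays the same role as the break in the loop above.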

