Q: Java sentence splitting error

I want to split a paragraph into sentences using Java. Consider the following text.

we decided to go to u.s.a, canada,africa etc... from our office. I have only rs.1 lakh. So i called my dad and asked some money. he said "No.I wont" and disconnected the phone.

I used the Stanford tokenizer. Even though "we decided to go to u.s.a, canada,africa etc... from our office" is a single sentence, the output shows that

we decided to go to u.s.a, canada,africa etc...

is one sentence and

from our office

is another sentence. The rest of the sentences are split correctly.

Please note: if the word is "etc." instead of "etc...", it works correctly.

Is it possible to tell the program that the words following "etc..." are a continuation of the same sentence? I also tried some other sentence-splitting tools, but the result is the same. Please help.

Answer 1:

Use the replace function to replace "..." with something unique, *+&1 for example. Then split the string, and afterwards replace the unique placeholder with "..." again.
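A minimal sketch of this mask, split, restore idea follows. It does not call the Stanford tokenizer: the naiveSplit method is a hypothetical stand-in that breaks after '.', '!' or '?' followed by whitespace, and the class and method names are made up for illustration. In real use, the masked text would be passed to whichever sentence splitter you already have.

```java
import java.util.ArrayList;
import java.util.List;

public class EllipsisSafeSplitter {
    // Placeholder taken from the answer; it must not already occur in the input.
    private static final String PLACEHOLDER = "*+&1";

    // Stand-in splitter (an assumption, not the Stanford API):
    // break after '.', '!' or '?' when followed by whitespace.
    static String[] naiveSplit(String text) {
        return text.split("(?<=[.!?])\\s+");
    }

    static List<String> split(String text) {
        if (text.contains(PLACEHOLDER)) {
            throw new IllegalArgumentException("Input already contains the placeholder");
        }
        // 1. Mask every "..." so the splitter no longer sees sentence-ending dots there.
        String masked = text.replace("...", PLACEHOLDER);

        // 2. Split the masked text, then 3. restore "..." in each resulting sentence.
        List<String> sentences = new ArrayList<>();
        for (String sentence : naiveSplit(masked)) {
            sentences.add(sentence.replace(PLACEHOLDER, "..."));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "we decided to go to u.s.a, canada,africa etc... from our office. "
                + "I have only rs.1 lakh. So i called my dad and asked some money. "
                + "he said \"No.I wont\" and disconnected the phone.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

One trade-off to keep in mind: with this approach an ellipsis never ends a sentence, so text such as "I waited... Nobody came." would stay as one sentence.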

java  regex  artificial-intelligence  stanford-nlp  text-segmentation