Detokenization
Original sentence:
Natural language processing is one of biggest streams in artificial intelligence, and it becomes very popular after seq2seq's invention.
After tokenization, with ▁ marking the start of each original word:
▁Natural ▁language ▁processing ▁is ▁one ▁of ▁biggest ▁streams ▁in ▁artificial ▁intelligence , ▁and ▁it ▁becomes ▁very ▁popular ▁after ▁seq2seq 's ▁invention .
After subword segmentation, with a second ▁ added in the same way:
▁▁Natural ▁▁language ▁▁processing ▁▁is ▁▁one ▁▁of ▁▁biggest ▁▁streams ▁▁in ▁▁artificial ▁▁intelligence ▁, ▁▁and ▁▁it ▁▁becomes ▁▁very ▁▁popular ▁▁after ▁▁se q 2 se q ▁'s ▁▁invention ▁.
With all spaces removed:
▁▁Natural▁▁language▁▁processing▁▁is▁▁one▁▁of▁▁biggest▁▁streams▁▁in▁▁artificial▁▁intelligence▁,▁▁and▁▁it▁▁becomes▁▁very▁▁popular▁▁after▁▁seq2seq▁'s▁▁invention▁.
With every ▁▁ replaced by a space:
Natural language processing is one of biggest streams in artificial intelligence▁, and it becomes very popular after seq2seq▁'s invention▁.
With the remaining ▁ removed, which restores the original sentence:
Natural language processing is one of biggest streams in artificial intelligence, and it becomes very popular after seq2seq's invention.
Tokenization post-processing
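import sys

# Marker prepended to the first piece of every token in the reference text.
STR = '▁'

if __name__ == "__main__":
    # Path to the reference file: the text as it was before this segmentation step.
    ref_fn = sys.argv[1]

    with open(ref_fn, 'r') as f:
        for ref in f:
            ref_tokens = ref.strip().split(' ')
            # The segmented version of the same line is read from stdin.
            input_line = sys.stdin.readline().strip()

            if input_line != "":
                tokens = input_line.split(' ')

                idx = 0
                buf = []

                # We assume that stdin has more tokens than reference input.
                for ref_token in ref_tokens:
                    tmp_buf = []

                    # Collect segmented tokens until they spell out the reference token.
                    while idx < len(tokens):
                        # Skip empty strings produced by consecutive spaces.
                        if tokens[idx].strip() == '':
                            idx += 1
                            continue

                        tmp_buf += [tokens[idx]]
                        idx += 1

                        if ''.join(tmp_buf) == ref_token:
                            break

                    # Mark only the first piece, so the original boundary can be recovered.
                    if len(tmp_buf) > 0:
                        buf += [STR + tmp_buf[0].strip()] + tmp_buf[1:]

                sys.stdout.write(' '.join(buf) + '\n')
            else:
                sys.stdout.write('\n')
Detokenization example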
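The three restoration steps shown in the staged example at the top of this section (remove spaces, turn ▁▁ into a space, remove the remaining ▁) can be sketched in a few lines of Python. This is a minimal sketch, not necessarily the author's own script: the function name `detokenize` and the constant `MARKER` are hypothetical names chosen for illustration, and the input is assumed to carry the double-marker convention used in this section.

```python
MARKER = "\u2581"  # the '▁' character (U+2581) used as the marker in this section


def detokenize(line: str) -> str:
    """Restore a sentence from a doubly marked, subword-segmented line.

    Hypothetical helper: assumes '▁▁' marks an original space and a single
    '▁' marks a split introduced by the tokenizer.
    """
    # Step 1: drop every ordinary space; all of them come from segmentation.
    line = line.strip().replace(" ", "")
    # Step 2: a double marker stands for a space in the original sentence.
    line = line.replace(MARKER * 2, " ")
    # Step 3: a remaining single marker is a tokenizer split; remove it.
    line = line.replace(MARKER, "")
    # The first word also carried a double marker, so trim the leading space.
    return line.strip()


if __name__ == "__main__":
    segmented = (
        "▁▁Natural ▁▁language ▁▁processing ▁▁is ▁▁one ▁▁of ▁▁biggest ▁▁streams "
        "▁▁in ▁▁artificial ▁▁intelligence ▁, ▁▁and ▁▁it ▁▁becomes ▁▁very ▁▁popular "
        "▁▁after ▁▁se q 2 se q ▁'s ▁▁invention ▁."
    )
    print(detokenize(segmented))
```

The order of the steps matters: the double marker has to be replaced before single markers are removed, otherwise the original word boundaries would be lost along with the tokenizer splits.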