
Chinese Character WordCloud


Python Code>

from wordcloud import WordCloud
from matplotlib import pyplot
from collections import Counter
from PIL import Image
import numpy

text = ''
# "data_file_path" is a placeholder for the path to the input text file
with open("data_file_path", encoding="utf-8") as f:
    text = f.read()

tmp = list(text)

hanja = []

ignore = [" ", "\n", ",", ".", "(", ")", "\U000f0703", "\ufeff"]

for item in tmp:
    # skip the listed characters and any other whitespace (tab, \r, full-width space)
    if item not in ignore and not item.isspace():
        hanja.append(item)

count = Counter(hanja)

most = count.most_common(300)

tags = {}
for n, c in most:
    tags[n] = c

img = Image.open("mask_image_path")  # placeholder: path to the background (mask) image
img_array = numpy.array(img)

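# font_path must resolve to a font with CJK glyphs; when a mask is given, its shape overrides width/height
wc = WordCloud(background_color='white', font_path="batang", max_font_size=250,
               width=1200, height=800, scale=2.0, mask=img_array)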

gen = wc.generate_from_frequencies(tags)

pyplot.figure()
pyplot.imshow(gen, interpolation='bilinear')
pyplot.axis("off")
wc.to_file("output_file_name")  # placeholder: name of the output image file
pyplot.close()
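
The character-counting step above can be tried on its own, without a font or mask image. The following is only a minimal sketch: the helper name `char_frequencies` and the sample string are made up for illustration and are not part of the original post. It assumes the same ignore list as the code above and additionally drops any whitespace via `str.isspace()`.

```python
from collections import Counter

# Same characters the post skips; other whitespace is dropped via str.isspace().
IGNORE = {" ", "\n", ",", ".", "(", ")", "\U000f0703", "\ufeff"}


def char_frequencies(text, ignore=IGNORE, top_n=300):
    """Return {character: count} for the top_n most frequent characters.

    Hypothetical helper for illustration; it mirrors the loop and
    Counter.most_common() call used in the post.
    """
    chars = (ch for ch in text if ch not in ignore and not ch.isspace())
    return dict(Counter(chars).most_common(top_n))


if __name__ == "__main__":
    sample = "山高水長, 山水 (山)."  # made-up sample text
    print(char_frequencies(sample, top_n=2))  # {'山': 3, '水': 2}
```

The returned dictionary has the same shape as `tags`, so it could be passed to `wc.generate_from_frequencies()` directly.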


Output>

