WordCloud

하연01 2021. 5. 7. 23:25

pip install wordcloud

from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Read the speech; 'ISO8859' is a Python alias for Latin-1 (ISO-8859-1)
with open('speech.txt', encoding = 'ISO8859') as f:
    text = f.read()

# generate() extracts the terms from text, computes their relative frequencies, and lays out the word cloud
wordcloud = WordCloud().generate(text)
print(type(wordcloud))

# wordcloud.words_ stores the relative frequencies as a dict
print(type(wordcloud.words_))
print(wordcloud.words_)

- speech.txt (the file path) is a file containing a speech by Obama

< Output >

<class 'wordcloud.wordcloud.WordCloud'>
<class 'dict'>
{'us': 1.0, 'will': 0.782608695652174, 'nation': 0.6521739130434783, 'new': 0.4782608695652174, 'America': 0.43478260869565216, 'every': 0.34782608695652173, 'people': 0.34782608695652173, 'must': 0.34782608695652173, 'generation': 0.34782608695652173, 'less': 0.30434782608695654, 'work': 0.30434782608695654, 'world': 0.30434782608695654, 'let': 0.30434782608695654, 'today': 0.2608695652173913, 'now': 0.2608695652173913, 'time': 0.2608695652173913, 'common': 0.2608695652173913, 'day': 0.21739130434782608, 'know': 0.21739130434782608, 'spirit': 0.21739130434782608, 'God': 0.21739130434782608, 'seek': 0.21739130434782608, 'American': 0.21739130434782608, 'words': 0.17391304347826086, 'peace': 0.17391304347826086, 'crisis': 0.17391304347826086, 'far': 0.17391304347826086, 'hard': 0.17391304347826086, 'come': 0.17391304347826086, 'end': 0.17391304347826086, 'long': 0.17391304347826086, 'things': 0.17391304347826086, 'men': 0.17391304347826086, 'women': 0.17391304347826086, 'greater': 0.17391304347826086, 'meet': 0.17391304347826086, 'whether': 0.17391304347826086, 'government': 0.17391304347826086, 'power': 0.17391304347826086, 'father': 0.17391304347826086, 'moment': 0.17391304347826086, 'job': 0.17391304347826086, 'servi
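
The values in the output above are relative frequencies: each count is divided by the count of the most frequent word, so the top word is always 1.0. A minimal sketch of that normalization using only the standard library follows. The helper name relative_frequencies and the raw counts (23, 18, 15) are assumptions chosen to reproduce the ratios shown above; the actual counts in speech.txt are not given, and the library also removes stopwords and merges plurals before counting.

```python
from collections import Counter


def relative_frequencies(words):
    """Divide every word count by the largest count (hypothetical helper)."""
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    # most_common() keeps the result ordered from most to least frequent
    return {word: count / top for word, count in counts.most_common()}


# Assumed counts, picked only so the ratios match the output above
tokens = ["us"] * 23 + ["will"] * 18 + ["nation"] * 15
freqs = relative_frequencies(tokens)
print(freqs)
```

With these assumed counts, 'will' comes out as 18/23 and 'nation' as 15/23, which matches the first entries of wordcloud.words_.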

- Draw a word cloud from the analyzed speech

wordcloud = WordCloud(max_font_size = 70).generate(text)

plt.figure(figsize = (16,9)) # set the figure size

plt.imshow(wordcloud, interpolation = "bilinear") # smooth the rendered image with bilinear interpolation
plt.axis("off") # hide the x and y axes
plt.show()

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

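# Read the speech text (same file as above)
with open('speech.txt', encoding = 'ISO8859') as f:
    text = f.read()

alice_mask = np.array(Image.open("alice.jpg")) # load the image that defines the shape of the cloud

# White background, at most 200 words, drawn inside the shape given by alice_mask;
# STOPWORDS removes common English words before counting
wc = WordCloud(background_color = "white", max_words = 200, mask = alice_mask, stopwords = STOPWORDS)
wc.generate(text)

plt.figure(figsize = (10, 10)) # set the figure size
plt.imshow(wc, interpolation = 'bilinear') # smooth the rendered image with bilinear interpolation
plt.axis("off") # hide the x and y axes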

plt.show()
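
The mask is just a NumPy array: pure-white pixels (255) are treated as masked out, and words are drawn only in the remaining area. When alice.jpg is not at hand, a mask can be generated in code. The following is a minimal sketch that builds a circular mask; the helper name circular_mask and the sizes (300 and 130) are assumptions for illustration.

```python
import numpy as np


def circular_mask(size=300, radius=130):
    """Return a size x size array: 255 outside the circle, 0 inside (hypothetical helper)."""
    center = size // 2
    y, x = np.ogrid[:size, :size]
    # True for every pixel that lies outside the circle
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    # White (255) pixels are masked out; 0 marks the drawable area
    return (255 * outside).astype(np.uint8)


mask = circular_mask()
print(mask.shape, mask[0, 0], mask[150, 150])
```

The resulting array can be passed in place of alice_mask, for example WordCloud(background_color = "white", mask = mask), which should confine the words to the circle.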
