Python - Parsing current weather by city from the Korea Meteorological Administration and saving it to CSV

1. Import modules

import requests
from bs4 import BeautifulSoup as BS


2. Sample URL : http://www.weather.go.kr/weather/observation/currentweather.jsp


3. HTML Parsing Code

import requests
from bs4 import BeautifulSoup as BS

url = 'http://www.weather.go.kr/weather/observation/currentweather.jsp'

response = requests.get(url)

# Stop if the server did not answer with HTTP 200 (OK).
if response.status_code != 200:
    # The message reads "Error %d occurred."
    print("%d 에러가 발생했습니다." % response.status_code)
    quit()

soup = BS(response.content, 'html.parser')
# The current-weather table is identified by its class name.
table = soup.find('table', {'class' : 'table_develop3'})

# Collect [station, temperature, humidity] for every row that has a station link.
data = []
for tr in table.find_all('tr'):
    tds = list(tr.find_all('td'))
    for td in tds:
        if td.find('a'):
            point = td.find('a').text
            temperature = tds[5].text
            humidity = tds[9].text
            data.append([point, temperature, humidity])

# Write the rows as CSV; the header reads "station, temperature, humidity".
with open('weather.csv', 'w', encoding='utf-8') as file:
    file.write('지역, 기온, 습도\n')
    for i in data:
        file.write('{0}, {1}, {2}\n'.format(i[0], i[1], i[2]))


4. Code walkthrough

- Create the BeautifulSoup object

url = 'http://www.weather.go.kr/weather/observation/currentweather.jsp'

response = requests.get(url)
soup = BS(response.content, 'html.parser')

(Attachment: soup.txt)


- Print an error and exit if the response status is not 200 (OK)

if response.status_code != 200:
    # The message reads "Error %d occurred."
    print("%d 에러가 발생했습니다." % response.status_code)
    quit()
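
The same check can also be delegated to requests itself. The sketch below is one possible variant, not the approach used above: `ensure_ok` and `fetch_html` are hypothetical helper names, and the 10-second timeout is an assumed value. Note that `raise_for_status()` only raises for 4xx and 5xx responses, so it is slightly more permissive than the `!= 200` comparison.

```python
import requests


def ensure_ok(response):
    """Return the response unchanged, or raise requests.HTTPError on 4xx/5xx."""
    response.raise_for_status()
    return response


def fetch_html(url, timeout=10):
    """Download a page and return its raw bytes (hypothetical helper).

    The timeout is an assumed value; without one, requests.get can wait
    indefinitely on an unresponsive server.
    """
    response = requests.get(url, timeout=timeout)
    return ensure_ok(response).content
```

Calling `fetch_html(url)` inside a `try` / `except requests.RequestException` block would then cover connection errors and timeouts as well as bad status codes.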


- Inspect the structure of the data table (Chrome DevTools)

- Parse the table whose class is table_develop3

table = soup.find('table', {'class' : 'table_develop3'})

(Attachment: table.txt)


- Inspect where the station / temperature / humidity cells sit inside table_develop3 (Chrome DevTools)



- Build the data list holding station / temperature / humidity, then print it

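data = []
for tr in table.find_all('tr'):
    tds = list(tr.find_all('td'))
    for td in tds:
        # Only the station cell contains a link, so each row is appended once.
        if td.find('a'):
            point = td.find('a').text
            temperature = tds[5].text   # 6th cell: temperature
            humidity = tds[9].text      # 10th cell: humidity
            data.append([point, temperature, humidity])

print(data)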

[['서울', '18.1', '45'], ['백령도', '20.7', '32'], ['인천', '17.3', '59'], ['수원', '18.6', '57'], ['동두천', '19.0', '59'], ['파주', '18.5', '61'], ['강화', '19.2', '50'], ['양평', '16.5', '58'], ['이천', '18.0', '51'], ['북춘천', '17.1', '54'], ['북강릉', '23.0', '20'], ['울릉도', '19.3', '46'], ['속초', '23.9', '19'], ['철원', '18.5', '40'], ['대관령', '14.8', '38'], ['춘천', '16.9', '53'], ['강릉', '22.5', '24'], ['동해', '22.0', '23'], ['원주', '17.9', '41'], ['영월', '18.1', '48'], ['인제', '16.6', '51'], ['홍천', '17.6', '58'], ['태백', '18.5', '23'], ['정선군', '16.8', '41'], ['서산', '18.4', '65'], ['청주', '19.2', '55'], ['대전', '19.5', '57'], ['충주', '17.8', '49'], ['추풍령', '18.9', '33'], ['홍성', '19.4', '59'], ['제천', '17.3', '43'], ['보은', '17.9', '54'], ['천안', '18.0', '59'], ['보령', '19.5', '57'], ['부여', '19.3', '60'], ['금산', '19.4', '45'], ['전주', '20.2', '52'], ['광주', '18.6', '55'], ['목포', '19.3', '51'], ['여수', '18.7', '52'], ['흑산도', '16.8', '68'], ['군산', '19.0', '60'], ['완도', '18.6', '69'], ['고창', '19.5', '55'], ['순천', '18.3', '56'], ['진도(첨찰산)', '17.2', '57'], ['부안', '20.7', '48'], ['임실', '16.5', '51'], ['정읍', '20.2', '54'], ['남원', '17.2', '47'], ['장수', '18.7', '34'], ['고창군', '20.5', '47'], ['영광군', '19.8', '54'], ['순창군', '17.4', '57'], ['보성군', '18.3', '61'], ['강진군', '18.2', '56'], ['장흥', '18.3', '54'], ['해남', '19.4', '52'], ['고흥', '20.1', '41'], ['광양시', '20.4', '42'], ['진도군', '20.4', '52'], ['제주', '24.3', '28'], ['고산', '19.7', '69'], ['성산', '20.8', '29'], ['서귀포', '23.4', '35'], ['안동', '17.6', '50'], ['포항', '20.9', '33'], ['대구', '19.9', '30'], ['울산', '21.4', '35'], ['창원', '20.7', '39'], ['부산', '21.2', '33'], ['울진', '21.2', '31'], ['상주', '19.9', '52'], ['통영', '18.7', '60'], ['진주', '19.2', '51'], ['김해시', '22.5', '38'], ['북창원', '21.5', '37'], ['양산시', '20.7', '44'], ['의령군', '20.1', '40'], ['함양군', '18.2', '58'], ['봉화', '16.6', '46'], ['영주', '16.1', '52'], ['문경', '18.4', '55'], ['청송군', '18.6', '46'], ['영덕', '21.8', '31'], ['의성', '19.1', '40'], ['구미', '19.9', '43'], ['영천', '20.0', '33'], ['경주시', '20.9', '38'], ['거창', 
'17.5', '49'], ['합천', '20.7', '36'], ['밀양', '19.2', '32'], ['산청', '18.8', '45'], ['거제', '22.7', '35'], ['남해', '21.7', '41']]


Process finished with exit code 0
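
The loop above assumes the table exists and that every station row has at least ten cells. A more defensive version might look like the sketch below. It runs against a small made-up HTML sample rather than the live page; `make_row`, `SAMPLE_HTML`, and `extract_rows` are hypothetical names, the 13-cell row width is an assumption, and only the class name and the cell indexes 5 and 9 come from the code above.

```python
from bs4 import BeautifulSoup


def make_row(name, temperature, humidity, width=13):
    """Build one made-up table row; the row width of 13 is an assumption."""
    cells = ["-"] * width
    cells[5] = temperature   # same index the walkthrough uses for temperature
    cells[9] = humidity      # same index the walkthrough uses for humidity
    rest = "".join("<td>{}</td>".format(c) for c in cells[1:])
    return '<tr><td><a href="#">{}</a></td>{}</tr>'.format(name, rest)


SAMPLE_HTML = (
    '<table class="table_develop3">'
    "<tr><th>station</th><th>...</th></tr>"
    + make_row("서울", "18.1", "45")
    + make_row("백령도", "20.7", "32")
    + "</table>"
)


def extract_rows(html, temp_idx=5, hum_idx=9):
    """Return [station, temperature, humidity] rows, skipping unusable rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "table_develop3"})
    if table is None:            # layout changed or the wrong page came back
        return []
    rows = []
    for tr in table.find_all("tr"):
        tds = tr.find_all("td")
        link = tr.find("a")
        # Skip header rows and rows too short to hold both values.
        if link is None or len(tds) <= max(temp_idx, hum_idx):
            continue
        rows.append([link.text.strip(),
                     tds[temp_idx].text.strip(),
                     tds[hum_idx].text.strip()])
    return rows


rows = extract_rows(SAMPLE_HTML)
```

Returning an empty list when the table is missing avoids the `AttributeError` that `table.find_all` would raise on `None`.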


- Save the data in CSV format

# The header reads "station, temperature, humidity".
with open('weather.csv', 'w', encoding='utf-8') as file:
    file.write('지역, 기온, 습도\n')
    for i in data:
        file.write('{0}, {1}, {2}\n'.format(i[0], i[1], i[2]))

(Attachment: weather.csv)
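
Writing the lines by hand puts a space after every comma and would break if a value ever contained a comma. As an alternative, here is a minimal sketch using the standard-library `csv` module. `write_weather_csv` and `save_weather_csv` are hypothetical helper names, and the sample rows are copied from the printed output above.

```python
import csv
import io

HEADER = ["지역", "기온", "습도"]  # station, temperature, humidity


def write_weather_csv(rows, fileobj):
    """Write the header and the data rows to an already opened text file."""
    writer = csv.writer(fileobj)
    writer.writerow(HEADER)
    writer.writerows(rows)


def save_weather_csv(rows, path="weather.csv"):
    """Save rows to disk; newline='' lets the csv module control line endings."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        write_weather_csv(rows, f)


# Demonstration with an in-memory buffer instead of a real file.
sample = [["서울", "18.1", "45"], ["진도(첨찰산)", "17.2", "57"]]
buffer = io.StringIO()
write_weather_csv(sample, buffer)
csv_text = buffer.getvalue()
```

If the file is meant to be opened in Excel on Windows, `encoding="utf-8-sig"` is a common choice so that the Korean text is detected as UTF-8.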
