Pandas - Analyzing Daycare Center Facility Statistics (Excel + Graph)

1. Analysis using public-agency data

- Data source: http://www.index.go.kr/potal/main/EachDtlPageDetail.do?idx_cd=1583

2. Importing modules

from print_df import print_df
from pandas import ExcelFile
from pandas import DataFrame
import matplotlib.pyplot as plt
import datetime as dt

3. Analysis with Pandas

- Read the Excel file

xls_file = ExcelFile('children.xlsx')

- List the sheet names

sheet_names = xls_file.sheet_names
print(sheet_names)

['데이터', '메타정보']

- Convert the first sheet into a DataFrame

xls_file = ExcelFile('children.xlsx')

sheet_names = xls_file.sheet_names

df = xls_file.parse(sheet_names[0])
print_df(df)

+---+--------------+-------+-------+-------+
|   | 종류         |  2015 |  2016 |  2017 |
+---+--------------+-------+-------+-------+
| 0 | 전국(계)     | 42517 | 41084 | 40238 |
| 1 | 국공립       |  2629 |  2859 |  3157 |
| 2 | 사회복지법인 |  1414 |  1402 |  1392 |
| 3 | 법인 단체 등 |   834 |   804 |   771 |
| 4 | 민간         | 14626 | 14316 | 14045 |
| 5 | 가정         | 22074 | 20598 | 19656 |
| 6 | 협동         |   155 |   157 |   164 |
| 7 | 직장         |   785 |   948 |  1053 |
+---+--------------+-------+-------+-------+

(종류 = type of daycare center; 전국(계) = national total; 국공립 = public; 사회복지법인 = social welfare corporation; 법인 단체 등 = corporations, organizations, etc.; 민간 = private; 가정 = home-based; 협동 = cooperative; 직장 = workplace)

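For readers without children.xlsx at hand, the table above can be rebuilt in memory. This is a minimal sketch, assuming the printed values are the full contents of the first sheet; the names `data`, `rest_total`, and `totals_match` are illustrative and not part of the original script. It also checks that the 전국(계) row equals the sum of the other rows.

```python
import pandas as pd

# Values copied from the printed table above; '종류' is the type column.
data = {
    '종류': ['전국(계)', '국공립', '사회복지법인', '법인 단체 등',
             '민간', '가정', '협동', '직장'],
    2015: [42517, 2629, 1414, 834, 14626, 22074, 155, 785],
    2016: [41084, 2859, 1402, 804, 14316, 20598, 157, 948],
    2017: [40238, 3157, 1392, 771, 14045, 19656, 164, 1053],
}
df = pd.DataFrame(data)

# Sum rows 1..7 per year and compare with the national total in row 0.
years = [2015, 2016, 2017]
rest_total = df.loc[1:, years].sum()
total_row = df.loc[0, years]
totals_match = bool((rest_total == total_row).all())
print(totals_match)  # True
```

Because the total row duplicates information already in the other rows, dropping it before computing statistics or plotting avoids double counting.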

- Extract the 종류 (type) column as a list

df_list = list(df['종류'])
print_df(df_list)

['전국(계)', '국공립', '사회복지법인', '법인\xa0단체\xa0등', '민간', '가정', '협동', '직장']

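The printed list shows `\xa0` (a non-breaking space) inside '법인\xa0단체\xa0등'. If that label is later compared with text typed using ordinary spaces, the comparison fails. Below is a minimal sketch of one way to normalize it, assuming a plain space is the desired replacement; `kinds` and `clean_list` are illustrative names.

```python
import pandas as pd

# Same labels as the list printed above, including the non-breaking spaces.
kinds = pd.Series(['전국(계)', '국공립', '사회복지법인', '법인\xa0단체\xa0등',
                   '민간', '가정', '협동', '직장'])

# Replace each non-breaking space (U+00A0) with an ordinary space.
clean_list = kinds.str.replace('\xa0', ' ', regex=False).tolist()
print(clean_list[3] == '법인 단체 등')  # True
```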

- Build an index dictionary (position -> type name)

# Map each positional index to its type name
index_dict = dict(enumerate(df_list))

print_df(index_dict)

{0: '전국(계)', 1: '국공립', 2: '사회복지법인', 3: '법인\xa0단체\xa0등', 4: '민간', 5: '가정', 6: '협동', 7: '직장'}

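- Rename the index and drop unneeded columns / rows

df.drop('종류', axis=1, inplace=True)        # drop the type column (its values become the index)
df.rename(index=index_dict, inplace=True)    # relabel rows 0..7 with the type names
df.drop(['전국(계)'], inplace=True)           # drop the national-total row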

print_df(df)

+--------------+-------+-------+-------+
|              |  2015 |  2016 |  2017 |
+--------------+-------+-------+-------+
| 국공립       |  2629 |  2859 |  3157 |
| 사회복지법인 |  1414 |  1402 |  1392 |
| 법인 단체 등 |   834 |   804 |   771 |
| 민간         | 14626 | 14316 | 14045 |
| 가정         | 22074 | 20598 | 19656 |
| 협동         |   155 |   157 |   164 |
| 직장         |   785 |   948 |  1053 |
+--------------+-------+-------+-------+

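The same table can be reached without building index_dict by hand. The sketch below is an alternative, not the method used in this post: `set_index` promotes the 종류 column to the index, `rename_axis(None)` clears the index name so the header row stays blank as in the output above, and `drop` removes the total row. `by_kind` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    '종류': ['전국(계)', '국공립', '사회복지법인', '법인 단체 등',
             '민간', '가정', '협동', '직장'],
    2015: [42517, 2629, 1414, 834, 14626, 22074, 155, 785],
    2016: [41084, 2859, 1402, 804, 14316, 20598, 157, 948],
    2017: [40238, 3157, 1392, 771, 14045, 19656, 164, 1053],
})

# Promote the type column to the index, clear the index name,
# then drop the national-total row. Each call returns a new frame,
# so the original df is left unchanged (no inplace=True).
by_kind = df.set_index('종류').rename_axis(None).drop('전국(계)')
print(by_kind.shape)  # (7, 3)
```

Chaining calls that return new frames keeps the original data available for re-running a step, at the cost of one extra variable.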

- Create a summary-statistics DataFrame

des = df.describe()
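describe() reports the sample standard deviation (ddof=1) and quartiles computed with linear interpolation, which is why fractional values can appear even for integer counts. Here is a small sketch on the 2015 column of the table above; `s` and `summary` are illustrative names.

```python
import pandas as pd

# 2015 counts for the seven daycare types (national total excluded).
s = pd.Series([2629, 1414, 834, 14626, 22074, 155, 785])

summary = s.describe()

# With 7 sorted values, the 25% point falls halfway between the 2nd and 3rd
# smallest values (785 and 834), so linear interpolation gives their midpoint.
print(summary['25%'])             # 809.5
print(round(summary['mean'], 3))  # 6073.857

# The 'std' entry is the sample standard deviation (ddof=1), like Series.std().
print(abs(summary['std'] - s.std(ddof=1)) < 1e-9)  # True
```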

- Generate an Excel file name based on the current time

now_time = dt.datetime.now().strftime('%y%m%d_%H%M%S')
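In the format string, %y is the two-digit year, %m the month, %d the day, and %H, %M, %S the 24-hour time, all zero-padded. The sketch below uses a fixed timestamp instead of now() so the result is reproducible; the specific date and the names `fixed` and `stamp` are assumptions chosen for illustration.

```python
import datetime as dt

# Fixed timestamp (15 May 2019, 14:04:43) in place of dt.datetime.now().
fixed = dt.datetime(2019, 5, 15, 14, 4, 43)
stamp = fixed.strftime('%y%m%d_%H%M%S')
filename = "children_house_describe_" + stamp + ".xlsx"
print(filename)  # children_house_describe_190515_140443.xlsx
```

Because the fields run from year down to second, file names built this way sort chronologically when sorted as text.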

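- Write the summary statistics to an Excel file

filename = "children_house_describe_" + now_time + ".xlsx"
# The encoding argument was removed from to_excel in pandas 2.0, so it is omitted here.
# sheet name '요약정보' = "summary info", index label '항목' = "item", '2015년' = "year 2015"
des.to_excel(filename, sheet_name='요약정보', na_rep='NaN',
             index=True, index_label='항목', header=['2015년','2016년','2017년'])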

항목	2015년	2016년	2017년
count	7	7	7
mean	6073.857	5869.143	5748.286
std	8690.857	8162.994	7809.773
min	155	157	164
25%	809.5	876	912
50%	1414	1402	1392
75%	8627.5	8587.5	8601
max	22074	20598	19656

children_house_describe_190515_140443.xlsx
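To see what index_label and header do without opening Excel, the same arguments can be passed to to_csv, which accepts na_rep, index_label, and header with the same meaning. This swaps in CSV output purely for inspection and is not part of the original script; the two-row frame reuses the count and min rows from the summary above.

```python
import io
import pandas as pd

# Small stand-in for the describe() result (count and min rows only).
des = pd.DataFrame({2015: [7.0, 155.0], 2016: [7.0, 157.0], 2017: [7.0, 164.0]},
                   index=['count', 'min'])

buf = io.StringIO()
# header replaces the column names; index_label names the index column.
des.to_csv(buf, na_rep='NaN', index=True, index_label='항목',
           header=['2015년', '2016년', '2017년'])
first_line = buf.getvalue().splitlines()[0]
print(first_line)  # 항목,2015년,2016년,2017년
```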

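- Create a vertical bar chart

# Use a font with Korean glyphs so the type labels render correctly
plt.rcParams["font.family"] = 'NanumGothic'
plt.rcParams["font.size"] = 10
plt.rcParams["figure.figsize"] = (14, 10)

# df.plot.bar() creates its own figure, so a separate plt.figure() call is not needed
df.plot.bar()
plt.grid()
plt.title("2015~2017년 전국 어린이집")  # "Daycare centers nationwide, 2015-2017" (matches the data columns)
plt.legend()
plt.ylabel("어린이집 수")  # "Number of daycare centers"
plt.savefig('children_house_describe_.png', dpi=200)
plt.close()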
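The same chart can also be drawn through the Axes object that df.plot.bar() returns, which avoids relying on pyplot's "current figure". This is a sketch under a few assumptions: it uses two rows of the table as sample data, selects the non-interactive Agg backend, and saves to an in-memory buffer rather than a file. Font setup is omitted, so the Korean labels may render as empty boxes unless a Korean font such as NanumGothic is configured.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Two rows from the table, enough to show the grouping of bars.
df = pd.DataFrame(
    {2015: [2629, 785], 2016: [2859, 948], 2017: [3157, 1053]},
    index=['국공립', '직장'],
)

# DataFrame.plot.bar returns the Axes it drew on; figsize sets the figure size.
ax = df.plot.bar(figsize=(14, 10), grid=True)
ax.set_ylabel("어린이집 수")  # "number of daycare centers"

# One bar per (row, column) pair: 2 types x 3 years.
print(len(ax.patches))  # 6

buf = io.BytesIO()
ax.figure.savefig(buf, format='png', dpi=200)  # in-memory buffer instead of a file
plt.close(ax.figure)
```

Each index label becomes one group on the x-axis and each column becomes one bar within the group, so the legend lists the years.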
