Tech이야기~!


[Python] TV Ratings Ranking Top 10, 2011-2020

Enhold 2020. 2. 18. 01:37

 

 

Bar Chart Race: Korean TV Ratings Ranking 2011-2020

Race with Matplotlib

When we have a large dataset, visualizing it is one of the quickest ways to understand it.
When the data arrives in a variety of formats, Python is a powerful tool for preprocessing it.

 

Let’s Code

Matplotlib’s style defaults are designed for many common situations.

I crawled "Korea TV viewing rate" data from a website, covering 2011-2020 (one sample every Monday).
Note that the group (data type) is missing for some entries, which makes coloring by group messy.
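
The crawler itself is not shown in this post. As a rough sketch of the "every Monday" sampling only, the dates to request could be generated like this. The helper name mondays_between is hypothetical, and how each date maps to a page on the source site is left out because the post does not say.

```python
from datetime import date, timedelta


def mondays_between(start, end):
    """Return every Monday from start to end, inclusive (hypothetical helper)."""
    # date.weekday() is 0 for Monday, so this steps forward to the first Monday.
    first = start + timedelta(days=(7 - start.weekday()) % 7)
    days = []
    current = first
    while current <= end:
        days.append(current)
        current += timedelta(weeks=1)
    return days


sample_days = mondays_between(date(2011, 1, 1), date(2011, 1, 31))
print([d.isoformat() for d in sample_days])
```

Each date in the list would then be one crawl request and, later, one candidate frame of the animation.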

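import colorsys
import csv
import itertools
from datetime import datetime
from random import randint
import matplotlib.animation as animation
import matplotlib.colors as mc
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import pandas as pd
import requests
from bs4 import BeautifulSoup
from IPython.display import HTML
from matplotlib import font_manager, rc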

Read the dataset (a CSV file) with pandas. We only need four columns: 'name', 'group', 'year', and 'value'.

url = 'output1.csv'
# CP949 is the Korean Windows encoding the crawled file was saved in
df = pd.read_csv(url, usecols=['name', 'group', 'year', 'value'], encoding='CP949')
df.head(3)
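
To make the expected shape of the file concrete, here is a small sketch that reads an in-memory CSV the same way. The sample rows and the extra rank column are made up for illustration; the real output1.csv is not reproduced in this post.

```python
import io

import pandas as pd

# Made-up sample rows; 'rank' stands in for any extra column in the file.
sample_csv = io.StringIO(
    "rank,name,group,year,value\n"
    "1,Show A,Drama,2011,25.3\n"
    "2,Show B,Variety,2011,18.9\n"
    "1,Show B,Variety,2012,21.4\n"
)

# usecols keeps only the listed columns and drops everything else.
sample_df = pd.read_csv(sample_csv, usecols=['name', 'group', 'year', 'value'])
print(sample_df.head(3))
```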

 

 

Color, Labels

Next, let’s add values, group labels and colors. We’ll use the color lists and group_lk to color and label the bars.

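# colors
def transform_color(color, amount = 0.5):
    # Accept either a named color or a hex string
    try:
        c = mc.cnames[color]
    except KeyError:
        c = color
    # Convert to HLS, then rescale the lightness by amount
    c = colorsys.rgb_to_hls(*mc.to_rgb(c))
    return colorsys.hls_to_rgb(c[0], 1 - amount * (1 - c[1]), c[2])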
    
all_names = df['name'].unique().tolist()
random_hex_colors = []
group_lk = df.set_index('name')['group'].to_dict()

group_lk is a mapping from each name to its group.
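
A small sketch of what this lookup holds, using made-up rows. When a name appears in several rows, the dictionary still ends up with one entry per name, which is fine as long as a program's group does not change over time.

```python
import pandas as pd

# Made-up rows: 'Show B' appears twice, as it would across several years.
sample_df = pd.DataFrame({
    'name': ['Show A', 'Show B', 'Show B'],
    'group': ['Drama', 'Variety', 'Variety'],
})

# Index by name, take the group column, and turn it into a plain dict.
sample_lk = sample_df.set_index('name')['group'].to_dict()
print(sample_lk)
```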

# One random hex color per name
for i in range(len(all_names)):
    random_hex_colors.append('#' + '%06X' % randint(0, 0xFFFFFF))
rgb_colors = [transform_color(i, 1) for i in random_hex_colors]
# Append an alpha channel so the bars are slightly transparent
rgb_colors_opacity = [rgb_colors[x] + (0.825,) for x in range(len(rgb_colors))]
# Darker variant of each color
rgb_colors_dark = [transform_color(i, 1.12) for i in random_hex_colors]
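
To see what the amount argument does, here is a standalone sketch of the same lightness formula using only colorsys. The helper name adjust_lightness and the clamping step are additions for illustration, not part of the code above. An amount of 1 leaves the color unchanged, values below 1 lighten it, and values above 1 darken it.

```python
import colorsys


def adjust_lightness(rgb, amount):
    """Scale the distance from white by amount (hypothetical helper)."""
    h, l, s = colorsys.rgb_to_hls(*rgb)
    new_l = 1 - amount * (1 - l)
    # Clamp so that amount > 1 on a dark color cannot push lightness below 0.
    new_l = max(0.0, min(1.0, new_l))
    return colorsys.hls_to_rgb(h, new_l, s)


red = (1.0, 0.0, 0.0)
print(adjust_lightness(red, 1.0))   # unchanged
print(adjust_lightness(red, 0.5))   # lighter red
print(adjust_lightness(red, 1.12))  # slightly darker red
```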

 

Basic chart

Now, let’s plot a basic bar chart. We start by creating a figure and an axes. Then, we use ax.barh(x, y) to draw a horizontal bar chart.

fig, ax = plt.subplots(figsize = (36, 20))
num_of_elements = 10
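
As a minimal sketch of that ax.barh call on its own, the following draws three bars from made-up data. The Agg backend line is only there so the snippet runs without a display; in a notebook it can be dropped.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, assumed here so no display is needed
import matplotlib.pyplot as plt

# Made-up values for one year, sorted ascending so the largest bar ends up on top.
names = ['Show C', 'Show B', 'Show A']
values = [12.1, 18.9, 25.3]

demo_fig, demo_ax = plt.subplots(figsize=(8, 4))
bars = demo_ax.barh(names, values)  # y positions first, bar lengths second
```

The first item is drawn at the bottom of the axes, which is why the data is sorted in ascending order before plotting.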

Polish Style

We need to style the following items:

  • Text: Update font sizes, color, orientation
  • Axis: Move X-axis to top, add color & subtitle
  • Grid: Add lines behind bars
  • Format: comma separated values and axes tickers
  • Add title, credits, gutter space
  • Remove: box frame, y-axis labels

def draw_barchart(current_year):
    # Keep the top rows for this frame; ascending order puts the largest bar on top
    dff = df[df['year'].eq(current_year)].sort_values(by='value', ascending=True).tail(num_of_elements)
    ax.clear()
    
    normal_colors = dict(zip(df['name'].unique(), rgb_colors_opacity))
    
    ax.barh(dff['name'], dff['value'], color = [normal_colors[x] for x in dff['name']])
    
    dx = dff['value'].max() / 200
    for i, (value, name) in enumerate(zip(dff['value'], dff['name'])):
        ax.text(value-dx, i,name,size=14, weight=600, ha='right', va='bottom')
        ax.text(value-dx, i-.25, group_lk[name], size=10, color='#444444', ha='right', va='baseline')
        ax.text(value+dx, i, f'{value:,.0f}',  size=14, ha='left',  va='center')
    ax.text(1, 0.4, current_year, transform=ax.transAxes, color='#777777', size=46, ha='right', weight=800)
    ax.text(0, 1.06, 'Analysis: metropolitan area, household, unit:%', transform=ax.transAxes, size=12, color='#777777')
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')
    ax.tick_params(axis='x', colors='#777777', labelsize=12)
    ax.set_yticks([])
    ax.margins(0, 0.01)
    ax.grid(which='major', axis='x', linestyle='-')
    ax.set_axisbelow(True)
    ax.text(0, 1.15, 'The most popular TV programs in Korea from 2011 to 2020',
            transform=ax.transAxes, size=24, weight=600, ha='left', va='top')
 
    plt.box(False)
    
# Draw a single frame as a check; any value present in df['year'] works here
draw_barchart(df['year'].max())
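
The first line of draw_barchart does most of the data work: it filters one frame, sorts ascending, and keeps the last rows so the largest value is drawn at the top of the chart. A sketch on made-up data, keeping 2 rows instead of 10:

```python
import pandas as pd

sample_df = pd.DataFrame({
    'name': ['Show A', 'Show B', 'Show C', 'Show A'],
    'year': [2011, 2011, 2011, 2012],
    'value': [25.3, 18.9, 12.1, 22.0],
})

# Filter one frame, sort ascending, keep the last (largest) rows.
top = sample_df[sample_df['year'].eq(2011)].sort_values(by='value', ascending=True).tail(2)
print(top['name'].tolist())
```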

Animate Race

To animate the race, we will use FuncAnimation from matplotlib.animation. Its frames argument takes the sequence of values that draw_barchart is called with, one call per frame; here we run from 2011 to 2020.

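# Load a Korean-capable font (Malgun Gothic on Windows) so Hangul labels render
font_name = font_manager.FontProperties(fname="c:/Windows/Fonts/malgun.ttf").get_name()
rc('font', family=font_name)
fig, ax = plt.subplots(figsize=(15, 8))
# Assumption: frames_list is not defined in this post; here it is taken to be
# the sorted unique values of the 'year' column
frames_list = sorted(df['year'].unique())
animator = animation.FuncAnimation(fig, draw_barchart, frames=frames_list)
HTML(animator.to_jshtml())
# HTML(animator.to_html5_video())
# animator.save('test.mp4')
plt.show()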

 

 

You can save the animator object to a video or GIF, or play it within the notebook.
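
A self-contained sketch of the same FuncAnimation pattern on made-up data, assuming a headless Agg backend. The update function and the data dictionary are hypothetical stand-ins for draw_barchart and the ratings data.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, assumed so the sketch runs anywhere
import matplotlib.animation as animation
import matplotlib.pyplot as plt

# Made-up values per frame key.
data = {
    2011: {'Show A': 25.3, 'Show B': 18.9},
    2012: {'Show A': 22.0, 'Show B': 23.5},
}

demo_fig, demo_ax = plt.subplots(figsize=(6, 3))


def update(year):
    """Redraw the whole axes for one frame, like draw_barchart does."""
    demo_ax.clear()
    items = sorted(data[year].items(), key=lambda kv: kv[1])
    demo_ax.barh([k for k, _ in items], [v for _, v in items])
    demo_ax.set_title(str(year))


# frames supplies the values passed to update, one call per frame.
demo_anim = animation.FuncAnimation(demo_fig, update, frames=sorted(data), interval=500)
html = demo_anim.to_jshtml()  # HTML/JS player as a string, for notebook display
```

In a notebook, wrapping the string in IPython's HTML(...) displays the player, as in the code above.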

Bonus_1: UTF-8 Encoding

There are two ways to get Korean text to render in the chart:
- Install a Korean-capable font for Matplotlib
- Use font_manager to load a font file directly

font_name = font_manager.FontProperties(fname="c:/Windows/Fonts/malgun.ttf").get_name()
rc('font', family=font_name)
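
The hard-coded path above only works on Windows. A hedged sketch of a more portable variant: look through the fonts Matplotlib already knows about and pick the first Korean-capable family that is installed. The candidate list and the helper name pick_font are assumptions; adjust them to the fonts on your machine.

```python
import matplotlib
from matplotlib import font_manager


def pick_font(candidates, installed):
    """Return the first candidate family that is installed, else None."""
    for family in candidates:
        if family in installed:
            return family
    return None


# Families often used for Hangul on Windows, macOS, and Linux (an assumption).
candidates = ['Malgun Gothic', 'AppleGothic', 'NanumGothic']
installed = {f.name for f in font_manager.fontManager.ttflist}

font_name = pick_font(candidates, installed)
if font_name is not None:
    matplotlib.rc('font', family=font_name)
    # Some Hangul fonts lack the Unicode minus glyph, so use ASCII '-' on tick labels.
    matplotlib.rcParams['axes.unicode_minus'] = False
```

If none of the candidates is installed, font_name stays None and Matplotlib keeps its default font.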

GitHub

https://github.com/figureking/Bigdata/blob/master/Bar%20Chart%20Race%20in%20kr%20program.ipynb