Counting Digit Occurrences
==========================

Program Overview
----------------

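A certain system generates authentication codes that are 6-digit integers. We examine how uniformly the digits appearing in these codes are distributed.
Starting from a list of codes that were actually generated, we split each code into its digits and count how often each digit occurs.
First, assign the list of generated codes to the variable ``codes``.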

.. code:: ipython3

    #!python3
    #-*- coding:utf-8 -*-
    # Everything from '#' to the end of the line is a comment (it is not executed as part of the program).
    nlen=6 # each code has 6 digits
    codes=(
        227524,  463251,  702567,  601620,
        129458,  413239,  629380,  526093,
        261547,  666552,  853626,  513298,
        142167,  612906,  995697,  660500,
        954404,  651454,  566439,  730975,
        578967,  603424,  435636,  891667,
        757294,  325567,  131075,  757309, 
        198547,  542718,  511525,  357679,
        245280,
        # 2021/05/14
        40245,  314151, 689785, 246930, 429093,
        546479, 240262, 631261, 559379, 869925, 
        814588, 766660, 588608, 815828, 946364,
        542718, 710428, 725610, 863406, 250107, 
        629444, 865116, 86879, 
        # A code that starts with 0, such as 012345, is written with the leading 0 omitted.
    )

Next, count how many times each digit appears in a single code.
The digits are examined in order, starting from the lowest (rightmost) position.

.. code:: ipython3

    def count_digits(code,nlen=6): # Define the function `count_digits`.
        # It takes at most two arguments; the second argument `nlen` is optional,
        # and its value when omitted (the default) is 6.
        acc={} # The number of occurrences of each digit is accumulated here.
        code=code+10**nlen # Add 10**nlen so that codes whose leading digits are 0 are handled.
        # For example, code=227524 becomes code=1227524.
        while code>=10:
            code, d=divmod(code, 10) # Split into the quotient `code` and the remainder `d`.
            # code=1227524 => code=122752, d=4
            acc[d]=acc.get(d,0)+1
            # If d is already a key of acc, add 1 to its value; otherwise set acc[d] to 0+1.
        return acc
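
As a cross-check on the arithmetic above, the same per-code table can be built from the zero-padded decimal string of the code. This is only a sketch: ``count_digits_str`` is a hypothetical helper that is not part of the original program, and it assumes ``code`` is a non-negative integer with at most ``nlen`` digits.

.. code:: ipython3

    from collections import Counter

    def count_digits_str(code, nlen=6):
        """Hypothetical string-based counterpart of count_digits (illustration only)."""
        # Pad with leading zeros so that 40245 is treated as "040245".
        text = str(code).zfill(nlen)
        # Counter tallies each character; convert the keys back to integer digits.
        return {int(ch): n for ch, n in Counter(text).items()}

    print(count_digits_str(227524))
    print(count_digits_str(40245))

For ``227524`` this gives three 2s and one each of 4, 5, and 7; for ``40245`` the omitted leading zero is counted, so 0 appears twice.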

For every ``code`` in the list ``codes``, count the digit frequencies and add them up.
The resulting frequency table ``acc`` is returned as the value of the function.

.. code:: ipython3

    def count_digits_in_list(codes,nlen):
        acc={}
        for code in codes: # for every code contained in codes
            count=count_digits(code,nlen) # build the table of digit occurrences within this code
            for d in count: # add the per-code digit counts to acc
                acc[d]=acc.get(d,0)+count[d]
        return acc
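
The accumulation loop above can also be written with ``collections.Counter``, whose ``update`` method adds counts key by key. The following is a minimal sketch rather than a replacement for the original function; ``tally_digits`` and ``sample_codes`` are illustrative names, and the three sample codes are copied from ``codes``.

.. code:: ipython3

    from collections import Counter

    def tally_digits(codes, nlen=6):
        """Illustrative Counter-based variant of count_digits_in_list."""
        acc = Counter()
        for code in codes:
            # zfill restores leading zeros that were dropped when the code was written down.
            acc.update(int(ch) for ch in str(code).zfill(nlen))
        return acc

    sample_codes = (227524, 40245, 86879)  # three codes copied from the list
    totals = tally_digits(sample_codes)
    print(sorted(totals.items()))
    # Every code contributes exactly nlen digits to the total.
    print(sum(totals.values()) == len(sample_codes) * 6)

The final check relies on the fact that each code contributes exactly ``nlen`` digits, which is also why the average count per digit equals ``len(codes)*nlen/10``.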

Print the frequency table of the digits in ``codes`` and display it as a histogram.

.. code:: ipython3

    import matplotlib.pyplot as pyplot # boilerplate to prepare for plotting
    import numpy
    
    def main():
        counts=count_digits_in_list(codes,nlen) # build the table of digit occurrences from the list codes
        print("digit:  count") # print the header of the table
        for d in sorted(counts): # for each digit in the table, in sorted order,
            print(d,":",counts[d]) # print the digit (d) and its number of occurrences (counts[d])
        # print("average:",len(codes)*nlen/10) # print the mean number of occurrences per digit
        print("average:", sum(counts.values())/len(counts)) # 6*len(codes)/10
        print("std. dev:{:4.1f}".format(numpy.std(list(counts.values()))))
        pyplot.bar(counts.keys(),counts.values()) # draw the bar chart
        pyplot.draw() # boilerplate for plotting
        pyplot.show() # final boilerplate for plotting; not needed in Jupyter Notebook


.. code:: ipython3

    # boilerplate for running this file as a command
    if __name__ == "__main__":
        main()


.. parsed-literal::

    digit:  count
    0 : 28
    1 : 30
    2 : 37
    3 : 23
    4 : 36
    5 : 45
    6 : 50
    7 : 30
    8 : 27
    9 : 30
    average: 33.6
    std. dev: 8.0



.. image:: CountDigits_files/CountDigits_8_1.png
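
The summary figures in the output above can be reproduced with the standard library alone. The sketch below hard-codes the printed table in a dictionary named ``table`` (an illustrative name) and relies on ``numpy.std`` defaulting to the population standard deviation, which corresponds to ``statistics.pstdev``.

.. code:: ipython3

    import statistics

    # Digit counts copied from the printed table above.
    table = {0: 28, 1: 30, 2: 37, 3: 23, 4: 36,
             5: 45, 6: 50, 7: 30, 8: 27, 9: 30}

    values = list(table.values())
    mean = statistics.mean(values)
    # pstdev divides by N, matching the default of numpy.std (ddof=0).
    sigma = statistics.pstdev(values)
    print("average:", mean)
    print("std. dev:{:4.1f}".format(sigma))

    # Deviation of each digit from the mean, in units of the standard deviation.
    for d in sorted(table):
        print(d, ":", round((table[d] - mean) / sigma, 2))

Expressed this way, digit 6 lies about two standard deviations above the mean and digit 5 about 1.4, while every other digit stays within roughly 1.3 standard deviations of it.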


pandas/DataFrame
----------------

With the DataFrame object, which has recently become widely used in data analysis,
a short program can extract the frequency table from a list of data and also display a histogram.

.. code:: ipython3

    import pandas # import pandas so that DataFrame is available
    
    def split_digits(code,nlen=6):
        acc=[] # store the digits in the order in which they appear
        code=code+10**nlen # so that codes whose leading digits are 0 are handled
        while code>=10:
            code,d=divmod(code,10)
            acc.insert(0,d)
        return acc
    
    def split_codes():
        acc=[]
        for code in codes:
            acc +=split_digits(code)
        return acc

.. code:: ipython3

    # Split the data in codes into single digits and build a list.
    
    df=pandas.DataFrame(split_codes()).value_counts(sort=False)
    print(df)
    print("average:",numpy.average(df))
    df.plot.bar()


.. parsed-literal::

    0    28
    1    30
    2    37
    3    23
    4    36
    5    45
    6    50
    7    30
    8    27
    9    30
    dtype: int64
    average: 33.6




.. parsed-literal::

    <AxesSubplot:xlabel='0'>




.. image:: CountDigits_files/CountDigits_11_2.png


matplotlib.pyplot
-----------------

matplotlib.pyplot can also be used both to display the histogram and to obtain the frequency distribution.

First, create the histogram.

.. code:: ipython3

    counts, digits, chart = pyplot.hist(
        split_codes(), 
        bins=range(0,11)
    )



.. image:: CountDigits_files/CountDigits_13_0.png


The ``matplotlib.pyplot.hist()`` function returns the per-bin counts (``counts``), the bin edge values (``digits``), and the histogram object. The following script therefore prints the label and the count of each bin. Note that ``digits`` has one more element than ``counts``.

.. code:: ipython3

    for p in (zip(digits,counts)):
        print(p)
    print("average:", sum(counts)/len(counts))
    print(len(digits), ">",len(counts))


.. parsed-literal::

    (0, 28.0)
    (1, 30.0)
    (2, 37.0)
    (3, 23.0)
    (4, 36.0)
    (5, 45.0)
    (6, 50.0)
    (7, 30.0)
    (8, 27.0)
    (9, 30.0)
    average: 33.6
    11 > 10
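
To see why ``digits`` has one more element than ``counts``, it may help to bin a few values by hand. The sketch below is not how matplotlib is implemented; it only mimics the documented convention that every bin is half-open ``[left, right)`` except the last, which also includes its right edge. ``hist_by_hand``, ``sample``, and ``by_hand`` are illustrative names.

.. code:: ipython3

    def hist_by_hand(values, edges):
        """Count values per bin; bins are [left, right) except the closed last bin."""
        result = [0] * (len(edges) - 1)
        for v in values:
            for i in range(len(result)):
                is_last = i == len(result) - 1
                if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                    result[i] += 1
                    break
        return result

    edges = list(range(0, 11))          # 11 edges: 0, 1, ..., 10
    sample = [2, 2, 7, 5, 2, 4, 0, 9]   # a few digits for illustration
    by_hand = hist_by_hand(sample, edges)
    print(by_hand)
    print(len(edges), ">", len(by_hand))

Ten bins need eleven edges, so pairing the edges with the counts through ``zip`` silently drops the final edge (10).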


Next, plot the distribution of the digit occurrence counts. It is apparent that the counts for "5" and "6"
stand out.

.. code:: ipython3

    pyplot.hist(
        counts, 
        histtype="stepfilled",
        density=False,
        align="left",
        bins=range(34-3*8,34+3*8,2),
    )
    pyplot.gca().set_xlabel("count")
    pyplot.gca().set_ylabel(u"occurrence");




.. parsed-literal::

    Text(0, 0.5, 'occurrence')




.. image:: CountDigits_files/CountDigits_17_1.png
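
The ``bins`` argument in the call above spans roughly three standard deviations on either side of the mean (mean about 34, standard deviation about 8) in steps of 2. The following sketch, which uses only the standard library and the counts printed earlier, lists the non-empty bins; ``per_digit`` and ``tally`` are illustrative names.

.. code:: ipython3

    from collections import Counter

    per_digit = [28, 30, 37, 23, 36, 45, 50, 30, 27, 30]  # counts for digits 0-9
    edges = list(range(34 - 3 * 8, 34 + 3 * 8, 2))        # 10, 12, ..., 56

    # Map each count to the left edge of the width-2 bin that contains it.
    tally = Counter(edges[0] + 2 * ((c - edges[0]) // 2) for c in per_digit)
    for left in sorted(tally):
        print("[{}, {})".format(left, left + 2), tally[left])

The bins starting at 44 and 50 hold the counts for digits 5 and 6 and sit apart from the cluster between 22 and 38.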