
[๋ฐ์ดํ„ฐ ๋ถ„์„] Python, Pandas Library์˜ mode() ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•ด์„œ ์ตœ๋นˆ๊ฐ’ ๊ตฌํ•˜๊ธฐ

by lxvxxu 2025. 10. 12.

pandas' mode() returns the “mode (the most frequently occurring value)”.

  • Series.mode() → returns a Series containing the mode(s) of that Series
  • DataFrame.mode() → returns a DataFrame containing the mode(s) of each column (or of each row, with axis=1)
  • If several values tie for the highest frequency, all of them are returned (so the result can have more than one row).
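
For the DataFrame case in the list above, here is a minimal sketch. The frames `df` and `nums` and their column names are made up for illustration. It shows the default column-wise call and the row-wise call with `axis=1`.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 2, 2],          # 1 and 2 tie for the highest count
    "b": ["x", "x", "x", "y"],  # "x" is the only mode
})

# Column-wise (default, axis=0): each column gets its own list of modes.
# A column with fewer modes than the others is padded with NaN.
col_modes = df.mode()
print(col_modes)

nums = pd.DataFrame({"x": [1, 2], "y": [1, 3], "z": [5, 3]})

# Row-wise (axis=1): the mode(s) of each row, stored in columns 0, 1, ...
row_modes = nums.mode(axis=1)
print(row_modes)
```

Because column "a" has two modes and column "b" has only one, the second row of "b" is filled with NaN. Keep this in mind before reading a single value out of the result.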

Series usage examples

Basic:

import pandas as pd

s = pd.Series([1, 2, 2, 3, 3, 3, 4])
s.mode()
# 0    3
# dtype: int64

 

When there are multiple modes:

s2 = pd.Series([1, 1, 2, 2, 3])
s2.mode()
# 0    1
# 1    2
# dtype: int64

 

When NaN is present:

s3 = pd.Series([1, 1, None, None, 2])
s3.mode()                 # dropna=True (default): NaN excluded → [1.0]
s3.mode(dropna=False)     # NaN included: 1.0 and NaN tie here, so both are modes

 

mode() takes a dropna argument. It defaults to dropna=True, which means NaN values are excluded when the mode is computed.
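
To see why the two calls differ, it can help to look at the raw counts. The sketch below reuses the `s3` data from the example above and uses `value_counts(dropna=False)` so that NaN is counted as its own category. The variable names are illustrative only.

```python
import pandas as pd

s3 = pd.Series([1, 1, None, None, 2])   # None is stored as NaN (dtype float64)

# Count every value, keeping NaN as its own category.
counts = s3.value_counts(dropna=False)
print(counts)

without_nan = s3.mode()              # dropna=True: NaN is ignored
with_nan = s3.mode(dropna=False)     # NaN competes with the other values

print(without_nan.tolist())
print(with_nan.tolist())
```

Here 1.0 and NaN each appear twice, so with `dropna=False` both come back as modes, while the default call returns only 1.0.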

 

What is NaN?

  • NaN is short for “Not a Number”, a special marker in numeric data meaning “no value / undefined / cannot be computed”.
  • It is a floating-point value defined by the IEEE 754 standard, and Python libraries such as NumPy and pandas commonly use it to mark null (missing) values.

Why does it occur?

  • Undefined calculations such as 0/0 or ∞-∞
  • A failed conversion from a string to a number
  • Reading missing values from a file or database
  • Merging or pivoting data, where unmatched entries leave blank cells

Characteristics (caveats)

  • NaN is not equal even to itself: NaN != NaN → True
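
A practical consequence of this is that `==` cannot be used to find NaN. The sketch below uses an illustrative Series `s` to compare an equality test with the dedicated checks `math.isnan` and `Series.isna`.

```python
import math

import pandas as pd

nan = float("nan")

print(nan != nan)   # NaN compares unequal to itself
print(nan == nan)

s = pd.Series([1.0, nan, 3.0])

# Equality never matches NaN, so it finds nothing.
by_equality = (s == nan).sum()

# The dedicated checks do detect it.
by_isna = s.isna().sum()
print(by_equality, by_isna, math.isnan(nan))
```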