🚨 I realized that Colab cannot handle large files, so I ran everything locally. In Python!

Data Preprocessing, Data Merging


The data has presumably been collected into files whose names indicate what kind of data each one holds, so preprocessing here means merging those files into a single dataset.
First, let's merge the data so that we can try unsupervised learning with PCA.

import os

import pandas as pd
from sklearn.decomposition import PCA

DATA_DIR = '/Volumes/Data'

# Read the Excel file that holds the column information
df_info = pd.read_excel('/Volumes/{column_info_file}.xlsx')
column_names = df_info.columns.tolist()

# Collect each cleaned DataFrame in a list and concatenate once at the end;
# calling pd.concat inside the loop re-copies the accumulated data every time
frames = []

# Loop over every CSV file (sorted, because os.listdir returns an arbitrary order)
for filename in sorted(os.listdir(DATA_DIR)):
    if not filename.endswith('.csv'):
        continue

    # Read the CSV file; it has no header row
    df_data = pd.read_csv(os.path.join(DATA_DIR, filename), header=None)

    # Use the column names from the Excel file as the header
    df_data.columns = column_names
    # Drop columns that are not needed (placeholder: put the real column names here)
    df_data = df_data.drop(columns=['{unneeded_column}'])
    # Drop every column whose name contains "Time Data"
    time_columns = [col for col in df_data.columns if 'Time Data' in str(col)]
    df_data = df_data.drop(columns=time_columns)

    # Preview the data
    print(df_data.head(3))

    frames.append(df_data)

# Combine all files into a single DataFrame
df_all = pd.concat(frames, ignore_index=True)

# Reduce dimensionality with PCA, keeping enough components to explain 99% of the variance
# whiten=True rescales the resulting components to unit variance;
# it does not standardize the input features
pca = PCA(n_components=0.99, whiten=True)
df_all_pca = pca.fit_transform(df_all)

# Print the explained variance ratio of each component
explained_variance_ratio = pca.explained_variance_ratio_
print("PCA explained variance ratio:")
print(explained_variance_ratio)

I applied PCA for unsupervised learning.

PCA explained variance ratio:
[0.78608101 0.09318127 0.0513852  0.03209065 0.01289721 0.00484044
 0.00421308 0.0034297  0.00244788] 
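
The ratios printed above are easier to read as a running total. The sketch below only re-adds the nine numbers from that output; the variable names are my own and nothing here touches the original data.

```python
from itertools import accumulate

# Explained variance ratios copied from the output above
ratios = [0.78608101, 0.09318127, 0.0513852, 0.03209065, 0.01289721,
          0.00484044, 0.00421308, 0.0034297, 0.00244788]

# Running total: how much variance the first k components explain together
cumulative = list(accumulate(ratios))
for k, total in enumerate(cumulative, start=1):
    print(f"{k} components: {total:.4f}")

# Smallest number of components whose running total reaches the 0.99 target
n_needed = next(k for k, total in enumerate(cumulative, start=1) if total >= 0.99)
print(n_needed)
```

The running total passes 0.93 after three components and only reaches 0.99 at the ninth, which matches `n_components=0.99` keeping nine components.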

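A note on `whiten=True`, since it is easy to misread: as far as I understand it, whitening rescales the PCA output so that each component has unit variance, but it does not standardize the input columns. The sketch below uses made-up two-column data (the data, the seed, and the variable names are assumptions for illustration, not the dataset from this post) to show that a column with a much larger scale still dominates the explained variance ratio, and that standardizing first with `StandardScaler` changes that.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two independent synthetic features on very different scales (hypothetical data)
X = np.column_stack([rng.normal(0, 100.0, 500), rng.normal(0, 1.0, 500)])

plain = PCA(n_components=2).fit(X)
white = PCA(n_components=2, whiten=True).fit(X)

# Whitening leaves the explained variance ratios unchanged
print(plain.explained_variance_ratio_)
print(white.explained_variance_ratio_)

# It only rescales the projected components to unit variance
Z = white.transform(X)
print(Z.var(axis=0, ddof=1))

# Standardizing the inputs first is what puts the features on an equal footing
scaled = PCA(n_components=2).fit(StandardScaler().fit_transform(X))
print(scaled.explained_variance_ratio_)
```

Whether the real data needs standardizing depends on the units of its columns, so this is something to check rather than assume.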
Data Visualization


I realized that data visualization should have come first. Running PCA on data I had never visualized, I could not explain why some of the data had such high variance, or why the results I expected did not appear. So I am going to visualize the data first.
Visualizing 16 GB of data as-is is out of the question, so how can I shrink it, and how can the time be efficiently