R: k-means clustering (concepts and examples)

얇은생각 2019. 3. 14. 12:30

Computing distances for clustering

The formula used to compute distances for clustering is shown below. After that, we work through k-means clustering using R's built-in function.

[Image: distance formula for clustering]
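
The formula image itself is not reproduced in this text. As a hedged reconstruction, assuming it showed the usual Euclidean distance, the distance between two n-dimensional points p and q, and the within-cluster sum of squares W that k-means tries to minimize, can be written as follows. The symbols p, q, n, k, C_j, and mu_j are notation chosen for this note, not taken from the original post.

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}
\qquad
W = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2
```

Here C_j is the set of points assigned to cluster j and mu_j is the mean of those points. In the object returned by R's kmeans, W corresponds to the tot.withinss component.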

R function: kmeans

 kmeans(x, centers, iter.max = 10, nstart = 1,
        algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"))

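The arguments of kmeans are as follows.

- x : a numeric data matrix (or an object that can be coerced to one, such as a data frame whose columns are all numeric)

- centers : either the number of clusters k, or a set of distinct initial cluster centers

- iter.max : the maximum number of iterations allowed while searching for the cluster centers

- nstart : when centers is a number, how many random sets of initial centers to try; the best result is kept

- algorithm : which algorithm to use; "Hartigan-Wong" is the default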
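
As a rough illustration of how these arguments fit together, the sketch below calls kmeans on the built-in iris measurements. The dataset, the seed, and the variable names (fit_k, init, fit_init) are assumptions made for this example and are not part of the original post. It passes centers first as a count and then as a matrix of initial centers.

```r
# Illustration only: iris, the seed, and all variable names are assumptions.
set.seed(42)
x <- as.matrix(iris[, 1:4])   # numeric matrix, one row per observation

# centers as a number: k = 3 clusters, best of 10 random starts
fit_k <- kmeans(x, centers = 3, iter.max = 20, nstart = 10,
                algorithm = "Hartigan-Wong")

# centers as a matrix: three rows of x used as the initial centers
init <- x[c(1, 51, 101), ]
fit_init <- kmeans(x, centers = init, iter.max = 20, algorithm = "Lloyd")

fit_k$size          # number of observations in each cluster
fit_init$centers    # final cluster centers, one row per cluster
```

When centers is a matrix, the number of clusters is taken from its number of rows, so nstart is left at its default here.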

kmeans in practice

Here is a worked clustering example using the kmeans algorithm. Refer to the code and its results below.

 require(graphics)
 
 # create a 2-dimensional example 
 x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),           
            matrix(rnorm(100, mean = 1, sd = 0.3),
            ncol = 2))
 
 colnames(x) <- c("x", "y")
 
 cl <- kmeans(x, 2)
 
 cl # show clustering result
# K-means clustering with 2 clusters of sizes 50, 50

# Cluster means:
#             x           y
# 1 -0.01324708 -0.01495209
# 2  1.01950383  0.97088648

# Clustering vector:
#   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#  [42] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#  [83] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

# Within cluster sum of squares by cluster:
# [1] 9.725841 9.927489
#  (between_SS / total_SS =  72.2 %)

# Available components:

# [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
# [6] "betweenss"    "size"         "iter"         "ifault"      
 
 plot(x, col = cl$cluster)
 
 points(cl$centers, col = 1:2, pch = 8, cex=2)
 
 # random starts do help here with too many clusters 
 cl <- kmeans(x, 5, nstart = 25)
 
 plot(x, col = cl$cluster)
 
 points(cl$centers, col = 1:5, pch = 8)
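
The printed result above lists components such as withinss, betweenss, and totss. As a small sketch of how they relate, the block below regenerates data the same way and recomputes the percentage from the printout by hand. The seed is an arbitrary assumption, so the numbers will differ from the output shown in the post.

```r
# Sketch: the seed is an arbitrary choice, so values differ from the post's output.
set.seed(1)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")

cl <- kmeans(x, 2)

# Total sum of squares splits into a within-cluster and a between-cluster part
all.equal(cl$totss, cl$tot.withinss + cl$betweenss)
all.equal(cl$tot.withinss, sum(cl$withinss))

# The "(between_SS / total_SS = ...)" line of the printout
ratio <- cl$betweenss / cl$totss
round(100 * ratio, 1)

# Cluster sizes match a tabulation of the assignment vector
table(cl$cluster)
cl$size
```

A ratio closer to 100 % means that more of the total variation is explained by the separation between clusters rather than by the spread inside them.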

[Image: clustering]
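
The comment in the example notes that random starts help when the data are split into too many clusters. One way to see this, sketched under the assumption of an arbitrary seed and 20 repetitions (neither comes from the original post), is to compare the total within-cluster sum of squares from single-start runs with the result of nstart = 25.

```r
# Sketch: the seed and the number of repetitions are arbitrary assumptions.
set.seed(123)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

# Run kmeans 20 times with a single random start each
single <- replicate(20, kmeans(x, 5, nstart = 1)$tot.withinss)

# Run kmeans once, keeping the best of 25 random starts
multi <- kmeans(x, 5, nstart = 25)$tot.withinss

range(single)   # spread of the objective across single-start runs
multi           # objective for the best of 25 starts
```

Single-start runs can settle in different local optima, so their objective values may vary from run to run; with nstart = 25, kmeans keeps the run with the smallest tot.withinss.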

쵸코쿠키의 연습장
