2024/12 4

[Data Mining] Ch.11 - Mining Contextual and Collective Outliers, Outlier Detection in High-Dimensional Data

1. Basic concept - an outlier is an object that does not follow the normal pattern. 2. Statistical approaches: finding outliers with statistical techniques (box plot, normal distribution, t-distribution, distance-based, local outliers with a realistic k). 3. Proximity-based approaches 4. Reconstruction-based approaches - when the data are reconstructed from a model (succinct representation) that represents normal behavior, normal objects follow the model and outliers do not. 5. Clustering and Classification based approaches - clus..

[Data Mining] Ch11. Outlier - Reconstruction-based, Clustering and Classification-based Outlier Detection

1. Basic concept 2. Statistical approaches 3. Proximity-based approaches 4. Reconstruction-based approaches - Matrix factorization based method 5. Clustering and Classification based approaches - Clustering-based approaches - Classification-based approaches 6. Mining contextual and collective outliers 7. Outlier detection in high-dimensional data. A model (succinct representation) for explaining some data, with the error ..

[Data Mining] Ch11. Outliers - Statistical, Proximity-based Outlier Detection

1. Basic concept 2. Statistical approaches - Parametric methods - Nonparametric methods 3. Proximity-based approaches 4. Reconstruction-based approaches 5. Clustering and Classification based approaches 6. Mining contextual and collective outliers 7. Outlier detection in high-dimensional data. Learn a generative model that describes the distribution of the given data set well, and identify the objects that fall in that model's low-probability regions as outliers. pa..

[Data Mining] Ch11. Outlier - Basic concept

Let's look at what an outlier is and learn the methods for detecting outliers. 1. Basic concept 2. Statistical approaches 3. Proximity-based approaches 4. Reconstruction-based approaches 5. Clustering and Classification based approaches 6. Mining contextual and collective outliers 7. Outlier detection in high-dimensional data. Most transactions are normal, but a few are highly abnormal (not typical). Outlier: deviating from the distribution (expectation) of the data, a small number of d..