International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 6, December 2018, pp. 4829~4835
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i6.pp4829-4835
Journal homepage: http://iaescore.com/journals/index.php/IJECE
Advanced SOM & K Mean Method for Load Curve Clustering
Phan Thi Thanh Binh¹, Trong Nghia Le², Nui Pham Xuan³
¹,³ Department of Electrical and Electronics Engineering, HCMC University of Technology, Vietnam
² Department of Electrical and Electronics Engineering, HCMC University of Technology and Education, Vietnam
Article Info
Article history:
Received Feb 1, 2018
Revised Jun 30, 2018
Accepted Jul 22, 2018
ABSTRACT
Classifying the load curves of a single customer makes it possible to extract the main features that influence electricity consumption, such as seasonal and weekday factors. Utilities can use these features to set tariffs by season or by day of the week. Popular clustering techniques for this task are SOM & K-means and Fuzzy K-means. SOM & K-means is a prominent two-level approach: the data set is first clustered by the SOM, and the SOM is then clustered by K-means. For the first level, two training algorithms were examined: sequential and batch training. For the second level, the results of K-means depend strongly on the initial values of the centers. To overcome this, this paper uses the subtractive clustering approach proposed by Chiu in 1994 to determine the centers. Because the effective radius in Chiu's method influences the number of centers, the paper applies the PSO technique to find the optimum radius. To validate the proposed approach, tests on well-known data samples are carried out. Applications to the daily load curves of one Southern utility are presented.
Keywords:
Cluster analysis
K-mean
PSO
SOM
Subtractive clustering
Copyright © 2018 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Phan Thi Thanh Binh,
Department of Electrical and Electronics Engineering,
HCMC University of Technology,
268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, Vietnam.
Email: pttbinh@hcmut.edu.vn
1. INTRODUCTION
Load curve classification matters because it lets a utility identify the characteristic features of each group within one class of consumers [1]-[2]. The main features that influence electricity consumption, such as seasonal and weekday factors, can be extracted. Utilities can then set tariffs by season or by day of the week. Some utilities charge different electricity prices in winter and summer. Others price working days differently from the weekend with a very clear purpose: to shift load from working days to the weekend. Many utilities design their demand response policy for each group of customers whose load curves have the same shape [3].
Load curve classification is a clustering problem with a large amount of input data, since the daily load curves of months or years must be considered. From a data mining point of view, a way of clustering big data is necessary for extracting useful information. Many authors have concentrated on data clustering based on the K-means algorithm because it is rather easy to implement and can be applied even to large data sets. Jung, et al. combined K-means with principal component analysis to analyze and classify user data efficiently [4]. However, as mentioned in [5]-[7], the results of K-means depend strongly on the initial values of the centers, which affects the clustering results. To overcome this drawback, Bedboudi, et al. combined K-means with a genetic algorithm, while Sahu, et al. used the Adaptive K-means [5], [6]. Chiu presented the subtractive method to remove the influence of center initialization [7]. For load curve clustering, many works, such as [8]-[10], rely on dimensionality reduction to simplify the models or reduce the computation time. In these works, feature selection or construction is the key to clustering.
For example, [10] proposed three ways to construct features that are especially suitable for smart metering: conditional filters on time-resolution based features, calibration and normalization, and using profile errors.
Other works keep the advantages of the K-means algorithm and combine it with a dimensionality reduction algorithm for load curve clustering. The popular technique based on this combination is SOM & K-means. With a large amount of input data, SOM & K-means is a prominent approach for clustering. In [11], this technique follows a two-level approach: first, the data set is clustered by the SOM using the sequential training algorithm. The result is a set of prototype vectors. In the second level, the SOM is clustered by K-means. However, this method inherits the weak points of K-means, so its accuracy is not high. In addition, the sequential training algorithm for the SOM is time-consuming.
To take full advantage of SOM & K-means on big data while overcoming its drawbacks, this paper uses the subtractive clustering method to initialize K-means in the second level. However, choosing the effective radius is a key question of the clustering procedure. We propose applying the PSO technique to find the optimum radius in order to improve the accuracy. The paper also uses another way of training the SOM, the batch training algorithm, to reduce the calculation time. To validate the proposed method, the Fuzzy K-means algorithm is also applied for comparison.
The paper is organized as follows: mathematical definitions of the SOM, K-means, the subtractive method, PSO, and Fuzzy K-means are given in Section 2; the proposed algorithm (denoted as Advanced SOM & K-means) is presented in Section 3; finally, tests on well-known data sets and one case study are presented in Section 4, comparing the results of SOM & K-means, Fuzzy K-means, and the proposed algorithm.
2. SOME MATHEMATICAL DEFINITIONS
2.1. SOM
The SOM consists of a regular, usually two-dimensional (2-D), grid of map units. Data points lying near each other in the input space are mapped onto nearby map units. The SOM can therefore be interpreted as a topology-preserving mapping from the input space onto the 2-D grid of map units.
In our work, two algorithms for training the maps were examined: sequential training and batch training. The neuron whose weight vector is closest to the input vector is called the best-matching unit (BMU), denoted by c. In the batch training algorithm, instead of using a single data vector at a time, the whole data set is presented to the map before any adjustments are made (hence the name “batch”). In each training step, the data set is partitioned according to the Voronoi regions of the map weight vectors, i.e. each data vector belongs to the data set of the closest map unit. After this, the new weight vectors are calculated as follows:
$$m_i(t+1) = \frac{\sum_{j=1}^{n} h_{ic}(t)\, x_j}{\sum_{j=1}^{n} h_{ic}(t)} \tag{1}$$
where: t denotes time; $x_j$ is an input vector; $h_{ic}(t)$ is the neighborhood kernel around the winner unit c; $c = \arg\min_k \lVert x_j - m_k \rVert$ is the index of the BMU of data sample $x_j$, with $m_k$ the synaptic weight vector of map unit k.
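The batch update of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the Gaussian form of the neighborhood kernel, the 3x3 map, the shrinking `sigma` schedule, and the random toy data are all assumptions made for the example.

```python
import numpy as np

def batch_som_step(data, weights, grid, sigma):
    """One batch update of all SOM weight vectors, following Eq. (1)."""
    # Squared distances between every sample x_j and every weight vector m_k.
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    bmu = d2.argmin(axis=1)                  # c: BMU index of each sample
    # Gaussian neighborhood kernel on the map grid (assumed kernel form).
    g2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    h = np.exp(-g2 / (2.0 * sigma ** 2))     # shape (units, units)
    h_ic = h[:, bmu]                         # shape (units, samples)
    # Kernel-weighted mean of the samples; the denominator is > 0 since h > 0.
    return (h_ic @ data) / h_ic.sum(axis=1, keepdims=True)

# Toy usage: a 3x3 map trained on random 2-D points in the unit square.
rng = np.random.default_rng(0)
data = rng.random((200, 2))
grid = np.array([(r, c) for r in range(3) for c in range(3)], dtype=float)
weights = rng.random((9, 2))
for sigma in (2.0, 1.5, 1.0, 0.5):           # shrinking neighborhood radius
    weights = batch_som_step(data, weights, grid, sigma)
```

Each pass uses the whole data set at once, which is why no learning rate appears: every new weight vector is simply a weighted average of the samples.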
2.2. The K-means algorithm
K-means is a well-known clustering algorithm. For a given number of clusters k, it partitions the data set by minimizing the objective function:
$$F = \sum_{i=1}^{k} \sum_{j=1}^{n} \lVert x_j - z_i \rVert^2 \to \min \tag{2}$$
where $\lVert \cdot \rVert$ is the Euclidean distance between $x_j$ and $z_i$; $z_i$ is the center of the i-th cluster; k is the number of cluster centers; n is the number of data points. In the inner sum, only the points $x_j$ assigned to cluster i contribute. The Davies-Bouldin (DB) index is applied for hard clustering [6]. The optimal number of clusters corresponds to the minimum value of the DB index.
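As an illustration of Eq. (2) and of choosing the number of clusters by the DB index, the sketch below runs Lloyd-style K-means from given initial centers and evaluates the standard Davies-Bouldin definition. The function names, the toy two-blob data, and the choice of initial centers are assumptions for the example; the DB routine assumes every cluster is non-empty.

```python
import numpy as np

def kmeans(data, centers, n_iter=100):
    """Lloyd iterations from the given initial centers; returns (centers, labels)."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its members (keep it if it has none).
        new = np.array([data[labels == i].mean(axis=0) if np.any(labels == i)
                        else centers[i] for i in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

def objective(data, centers, labels):
    """F of Eq. (2): summed squared distance of each point to its own center."""
    return float(((data - centers[labels]) ** 2).sum())

def davies_bouldin(data, centers, labels):
    """Standard DB index; assumes every cluster has at least one member."""
    k = len(centers)
    scatter = np.array([np.linalg.norm(data[labels == i] - centers[i], axis=1).mean()
                        for i in range(k)])
    sep = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)            # exclude i == j from the maximum
    return float(((scatter[:, None] + scatter[None, :]) / sep).max(axis=1).mean())

# Toy data: two well-separated blobs, clustered with k = 2 and k = 3.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
c2, l2 = kmeans(data, data[[0, 50]])
c3, l3 = kmeans(data, data[[0, 1, 50]])
db2 = davies_bouldin(data, c2, l2)
db3 = davies_bouldin(data, c3, l3)
print("DB(k=2) =", db2, " DB(k=3) =", db3)
```

On this toy set F keeps falling as k grows, whereas the DB index is lowest at k = 2, which is why the index rather than F is used to pick the number of clusters.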
2.3. The subtractive method
Consider a collection of n data points {x_1, x_2, …, x_n} in an M-dimensional space. If each data point is considered as a possible cluster center, then the potential of data point $x_i$ is:
$$P_i = \sum_{k=1}^{n} e^{-\alpha \lVert x_i - x_k \rVert^2} \tag{3}$$
with $\alpha = 4 / r_a^2$. The constant $r_a$ is effectively the radius defining a neighborhood. The data point with the highest potential is selected as the first cluster center. Let $x_1^*$ be the location of the first cluster center and $P_1^*$ its potential value. The potential of each data point $x_i$ is then revised by the formula:
$$P_i \leftarrow P_i - P_1^* \, e^{-\beta \lVert x_i - x_1^* \rVert^2} \tag{4}$$
with $\beta = 4 / r_b^2$, where $r_b$ is the effective radius of the potential reduction and is equal to 1.25 $r_a$. The data point with the highest remaining potential is selected as the second cluster center. The process continues until the remaining potential of all data points falls below some fraction of the potential of the first cluster center $P_1^*$.
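The selection procedure of Eqs. (3) and (4) can be sketched as follows. The stopping fraction `stop_ratio=0.15` is an assumed value (the text only says "some fraction"), the data are assumed to be scaled to the unit hypercube so that radii in the range 0.15 to 0.8 are meaningful, and the two-blob data set is a toy example.

```python
import numpy as np

def subtractive_centers(data, ra, stop_ratio=0.15, rb_factor=1.25):
    """Pick cluster centers by subtractive clustering, Eqs. (3)-(4)."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (rb_factor * ra) ** 2
    d2 = ((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * d2).sum(axis=1)          # Eq. (3)
    first = potential.max()                              # P_1*
    centers = []
    while True:
        idx = int(potential.argmax())
        if potential[idx] < stop_ratio * first:          # assumed stopping rule
            break
        centers.append(data[idx])
        # Eq. (4): subtract the influence of the newly chosen center.
        potential = potential - potential[idx] * np.exp(-beta * d2[idx])
    return np.array(centers)

# Toy data in the unit square: two compact blobs.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.2, 0.03, (40, 2)), rng.normal(0.8, 0.03, (40, 2))])
centers = subtractive_centers(data, ra=0.5)
many = subtractive_centers(data, ra=0.05)
print(len(centers), "centers for ra = 0.5;", len(many), "centers for ra = 0.05")
```

The loop always terminates because the potential of a chosen point drops to zero and potentials never increase. Comparing the two radii shows how strongly the number of centers depends on the radius.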
2.4. The PSO [12]
PSO is based on the phenomenon of collective intelligence, inspired by the social behavior of bird flocking or fish schooling. The fitness function is evaluated for each particle in the swarm and compared with the fitness of that particle's best previous position, $pbest_i$, and with the fitness of the global best particle among all particles in the swarm, $gbest$. After finding these two best values, the i-th particle evolves by updating its velocity and position according to the following equations:
$$V_i^{k+1} = w V_i^{k} + c_1 \, rand_1 \cdot (pbest_i - s_i^{k}) + c_2 \, rand_2 \cdot (gbest - s_i^{k}) \tag{5}$$
$$s_i^{k+1} = s_i^{k} + V_i^{k+1} \tag{6}$$
where: $s^{k}$ is the current searching point; $s^{k+1}$ is the modified searching point; $V^{k}$ is the current velocity; $V^{k+1}$ is the modified velocity; $rand_1$ and $rand_2$ are random values drawn uniformly from (0,1); $c_1$ and $c_2$ are constants called acceleration coefficients; w is a weighting coefficient (the inertia weight). The values of $c_1$ and $c_2$ control the weight balance of pbest and gbest in deciding the particle's next movement.
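A one-dimensional sketch of the updates in Eqs. (5) and (6) is given below, since the quantity searched for in this paper (the radius) is a scalar. The parameter values `w=0.7`, `c1=c2=1.5`, the swarm size, the clipping of positions to the search interval, and the quadratic test function are assumptions chosen for illustration.

```python
import numpy as np

def pso_minimize(fitness, lo, hi, n_particles=20, n_iter=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a scalar function on [lo, hi] with a basic PSO."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(lo, hi, n_particles)         # positions s_i
    v = np.zeros(n_particles)                    # velocities V_i
    pbest = s.copy()
    pbest_val = np.array([fitness(x) for x in s])
    gbest = pbest[pbest_val.argmin()]
    for _ in range(n_iter):
        r1 = rng.random(n_particles)             # uniform in [0, 1)
        r2 = rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - s) + c2 * r2 * (gbest - s)   # Eq. (5)
        s = np.clip(s + v, lo, hi)                                  # Eq. (6)
        val = np.array([fitness(x) for x in s])
        better = val < pbest_val                 # particles that improved
        pbest[better] = s[better]
        pbest_val[better] = val[better]
        gbest = pbest[pbest_val.argmin()]
    return float(gbest), float(pbest_val.min())

# Toy usage: the minimum of (x - 0.4)^2 on [0.15, 0.8] is at x = 0.4.
best_x, best_f = pso_minimize(lambda x: (x - 0.4) ** 2, 0.15, 0.8)
print(best_x, best_f)
```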
2.5. Fuzzy K-means (FKM) [13]
FKM is a highly flexible clustering method with the following objective function:
$$F = \sum_{i=1}^{K} \sum_{j=1}^{n} w_{ij}^{\alpha} \, d^2(z_i, x_j) \to \min \tag{7}$$
where α is a weighting exponent; $w_{ij}$ is the value of the membership function, and $d(z_i, x_j)$ is the Euclidean distance between $x_j$ and the center $z_i$ of cluster i. Many criteria can be applied to determine the final number of clusters. This paper uses the method in [14], which is based on the Bellman-Zadeh principle.
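The objective in Eq. (7) is commonly minimized by alternating between a membership update and a center update. The sketch below uses those standard fuzzy c-means update rules, which are not spelled out in this paper, so they should be read as an assumption; `alpha=2.0`, the initial centers, and the toy data are illustrative choices as well.

```python
import numpy as np

def memberships(data, centers, alpha):
    """Membership matrix w (K x n) and squared distances for given centers."""
    d2 = ((data[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)                   # guard against zero distance
    inv = d2 ** (-1.0 / (alpha - 1.0))
    return inv / inv.sum(axis=0, keepdims=True), d2

def fkm(data, centers, alpha=2.0, n_iter=50):
    """Alternate membership and center updates; returns (centers, w, F)."""
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        w, _ = memberships(data, centers, alpha)
        wa = w ** alpha
        centers = (wa @ data) / wa.sum(axis=1, keepdims=True)
    w, d2 = memberships(data, centers, alpha)
    F = float(((w ** alpha) * d2).sum())         # objective of Eq. (7)
    return centers, w, F

# Toy data: two blobs; one initial center is taken from each blob.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(10.0, 0.5, (40, 2))])
centers, w, F = fkm(data, data[[0, 40]])
hard_labels = w.argmax(axis=0)                   # defuzzified assignment
```

Each column of `w` sums to one, so every point spreads a unit of membership across the clusters; taking the largest membership gives a hard assignment comparable with K-means.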
3. THE PROPOSED ALGORITHM
The proposed algorithm (denoted as Advanced SOM & K-means) is shown in Figure 1. The batch training approach is used for the SOM, which reduces the training time. Subtractive clustering is applied to find the initial centers for K-means. Traditionally, the radius ra in (3) takes values from 0.15 to 0.8. Our examination shows that the smaller ra is, the larger the number of clusters obtained. The optimum radius is therefore taken to be the one that leads to the smallest value of the DB index. To find a suitable radius ra, this paper applies the PSO algorithm.
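[Figure 1: flowchart. Start → Input data → Data processing → SOM (batch training) → Subtractive clustering with radius ra → K-means → DB index → Min? If No, Particle Swarm Optimization updates ra and subtractive clustering is repeated; if Yes, Data clusters → End.]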
Figure 1. The proposed algorithm
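The loop of Figure 1 can be sketched end to end as below. This is a simplified illustration under stated assumptions, not the authors' code: the array `prototypes` stands in for the SOM prototype vectors (the SOM stage is omitted), the data are synthetic and scaled to the unit square, and `stop_ratio`, the PSO coefficients, and the swarm size are assumed values.

```python
import numpy as np

def subtractive_centers(x, ra, stop_ratio=0.15):
    """Center selection by Eqs. (3)-(4); stop_ratio is an assumed value."""
    d2 = ((x[:, None] - x[None, :]) ** 2).sum(axis=2)
    p = np.exp(-4.0 / ra ** 2 * d2).sum(axis=1)
    first, centers = p.max(), []
    while p.max() >= stop_ratio * first:
        i = int(p.argmax())
        centers.append(x[i])
        p = p - p[i] * np.exp(-4.0 / (1.25 * ra) ** 2 * d2[i])
    return np.array(centers)

def kmeans_labels(x, centers, n_iter=50):
    """K-means started from the given centers; returns the final labels."""
    for _ in range(n_iter):
        labels = ((x[:, None] - centers[None, :]) ** 2).sum(axis=2).argmin(axis=1)
        centers = np.array([x[labels == i].mean(axis=0) if np.any(labels == i)
                            else centers[i] for i in range(len(centers))])
    return labels

def db_index(x, labels):
    """Davies-Bouldin index computed over the non-empty clusters."""
    ids = np.unique(labels)
    cents = np.array([x[labels == i].mean(axis=0) for i in ids])
    scat = np.array([np.linalg.norm(x[labels == i] - c, axis=1).mean()
                     for i, c in zip(ids, cents)])
    sep = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
    np.fill_diagonal(sep, np.inf)
    return float(((scat[:, None] + scat[None, :]) / sep).max(axis=1).mean())

def fitness(x, ra):
    """DB index reached with radius ra (infinite if fewer than 2 clusters)."""
    centers = subtractive_centers(x, ra)
    if len(centers) < 2:
        return np.inf, None
    labels = kmeans_labels(x, centers)
    if len(np.unique(labels)) < 2:
        return np.inf, labels
    return db_index(x, labels), labels

def optimise_radius(x, lo=0.15, hi=0.8, n_particles=8, n_iter=15, seed=0):
    """PSO over the scalar radius ra, minimizing the DB index."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(lo, hi, n_particles)
    v = np.zeros(n_particles)
    pbest = s.copy()
    pval = np.array([fitness(x, r)[0] for r in s])
    for _ in range(n_iter):
        g = pbest[pval.argmin()]
        v = (0.7 * v + 1.5 * rng.random(n_particles) * (pbest - s)
             + 1.5 * rng.random(n_particles) * (g - s))
        s = np.clip(s + v, lo, hi)
        val = np.array([fitness(x, r)[0] for r in s])
        better = val < pval
        pbest[better], pval[better] = s[better], val[better]
    return float(pbest[pval.argmin()])

# Synthetic stand-in for SOM prototypes: three compact groups in the unit square.
rng = np.random.default_rng(1)
blobs = [(0.2, 0.2), (0.8, 0.2), (0.5, 0.8)]
prototypes = np.vstack([rng.normal(c, 0.03, (30, 2)) for c in blobs])
best_ra = optimise_radius(prototypes)
db, labels = fitness(prototypes, best_ra)
n_clusters = len(np.unique(labels))
print("ra =", best_ra, "DB =", db, "clusters =", n_clusters)
```

The number of clusters is never fixed in advance here: it follows from the radius, and the radius is chosen by the DB index.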
4. EXPERIMENTAL STUDIES
4.1. Testing on the well-known data samples:
Three real and well-known data sets (Iris, WBCD, Wine) are used. These data sets appear in many works for testing clustering techniques. The Iris Plants Database [15] contains 150 samples (4 attributes in each sample) in 3 classes: Iris Setosa, Iris Versicolour, and Iris Virginica (50 samples in each class). The Wisconsin Breast Cancer Database [16] was built at the University of Wisconsin Hospitals. It contains 683 tests (10 attributes in each test) in 2 classes: benign (65.5%) and malignant (34.5%). The last one [17] is the data obtained from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wine. Three algorithms, SOM & K-means, FKM, and Advanced SOM & K-means, are applied, and the results are given in Table 1. Only Advanced SOM & K-means finds the correct number of clusters for all three data sets.
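Table 1. Testing results on well-known data samples (number of clusters found by each algorithm)
| Data sample | Correct number of clusters | SOM & K-means | FKM | Advanced SOM & K-means |
|---|---|---|---|---|
| Iris | 3 | 2 | 2 | 3 |
| WBCD | 2 | 3 | 2 | 2 |
| Wine | 3 | 2 | 7 | 3 |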
4.2. Application for load curve clustering
4.2.1. The input data
The input data are the 365 daily load curves of one utility in the South of Vietnam. Each load curve is regarded as a vector of 24 attributes (24 hours). The Euclidean distance between two load curves j and k is defined as:
$$d_{jk} = \sqrt{\sum_{i=1}^{24} (x_{ij} - x_{ik})^2} \tag{8}$$
where $x_{ij}$ is the load at hour i of load curve j.
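As a small worked example of Eq. (8), the sketch below computes all pairwise distances for an array with one row per day and 24 hourly loads per row. The three load curves are hypothetical values in arbitrary units, not data from the utility.

```python
import numpy as np

def load_curve_distances(curves):
    """Pairwise Euclidean distances d_jk of Eq. (8) for an (n_days, 24) array."""
    diff = curves[:, None, :] - curves[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical hourly loads for three days (arbitrary units).
flat = np.full(24, 100.0)
curves = np.vstack([flat, flat + 10.0, flat * 0.5])
d = load_curve_distances(curves)
print(d[0, 1], d[0, 2])   # sqrt(24 * 10^2) and sqrt(24 * 50^2)
```

A uniform shift of 10 units in every hour gives a distance of sqrt(2400), about 49, so the measure reacts to the level of consumption as well as to the shape of the curve.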
4.2.2. Extract the information
By looking into each cluster produced by the clustering process, the main factors that characterize the cluster can be extracted. For example, if the load curves in one cluster belong to the rainy season while those in another cluster belong to the dry season, then a seasonal tariff is warranted. Likewise, if weekends and working days fall into different clusters, a weekend tariff should be formed.
4.2.3. Implementation
For the implementation, the daily load curves of one utility in 2012 were used. The tariff is TOU (time of use) and is the same for all days of the week. All three algorithms give the same number of clusters (2 clusters), called the holiday cluster and the normal day cluster. No separate rainy season and dry season clusters appear. All Sundays and public holidays belong to the holiday cluster. This result is plausible because Ho Chi Minh City has a tropical climate and, in addition, hosts many industrial parks and foreign-invested enterprises, so the difference in load between seasons is not clear. According to Vietnam’s Labor Code, there are 63 Sundays and public holidays in the year; these form the standard holiday cluster shown in Figure 2. This emphasizes the need for different electricity prices for working days and holidays. However, the results show more than 63 days in the holiday cluster: some Saturdays and working days also fall into it, as shown in Table 2.
Figure 2. The load curves of 63 standard public holidays
The results of the three algorithms differ, as shown in Table 2. To assess the accuracy of the three algorithms, the distance from each day on which they disagree to the center of the standard holidays (63 days) and to the center of the normal days is calculated, as shown in Table 3.
The results of FKM and Advanced SOM & K-means coincide except for 4 days (Saturdays: 4-Feb., 11-Feb., 18-Feb., 6-Oct.). According to FKM, these days belong to the holiday cluster. However, Table 3 shows that these Saturdays are closer to the center of the standard normal day cluster than to the center of the standard holiday cluster. These 4 Saturdays should therefore belong to the normal day cluster, which means that FKM is less accurate than Advanced SOM & K-means.
The results of SOM & K-means and Advanced SOM & K-means coincide except for one day (Tuesday, 31-Jan.). This Tuesday is farther from the center of the standard normal day cluster than from the center of the standard holiday cluster, so it should belong to the holiday cluster. The Advanced SOM & K-means algorithm is therefore more accurate than SOM & K-means.
Table 2. Number of weekdays in the holiday cluster for the 3 algorithms
| Weekday | SOM & K-means | FKM | Advanced SOM & K-means |
|---|---|---|---|
| Monday | 7 | 7 | 7 |
| Tuesday | 3 | 4 | 4 |
| Wednesday | 3 | 3 | 3 |
| Thursday | 2 | 2 | 2 |
| Friday | 2 | 2 | 2 |
| Saturday | 4 | 8 | 4 |
| Sunday | 53 | 53 | 53 |
Table 3. Distance of all different days to the standard holiday cluster’s center (SHCC) and the standard normal day cluster’s center (SNDCC)
| Day | Avg. dist. to the SHCC | Avg. dist. to the SNDCC |
|---|---|---|
| 31-Jan-12 | 1030.48 | 1988.70 |
| 4-Feb-12 | 1742.42 | 1052.60 |
| 11-Feb-12 | 1753.69 | 977.21 |
| 18-Feb-12 | 1742.35 | 1014.14 |
| 6-Oct-12 | 1820.48 | 1046.44 |
This shows that the Advanced SOM & K-means algorithm overcomes the weak point of algorithms based on K-means, and that choosing the optimal radius in the subtractive method enhances the accuracy.
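The nearest-center argument used above can be checked directly against the distances reported in Table 3. The snippet below only re-applies that rule to the published numbers; the function and variable names are illustrative.

```python
# Distances copied from Table 3: day -> (distance to SHCC, distance to SNDCC).
table3 = {
    "31-Jan-12": (1030.48, 1988.70),
    "4-Feb-12": (1742.42, 1052.60),
    "11-Feb-12": (1753.69, 977.21),
    "18-Feb-12": (1742.35, 1014.14),
    "6-Oct-12": (1820.48, 1046.44),
}

def nearest_cluster(dist_holiday, dist_normal):
    """Assign a day to the cluster whose standard center is closer."""
    return "holiday" if dist_holiday < dist_normal else "normal"

assignment = {day: nearest_cluster(*dists) for day, dists in table3.items()}
for day, cluster in assignment.items():
    print(day, cluster)
```

Under this rule 31-Jan-12 falls into the holiday cluster and the four Saturdays fall into the normal day cluster, which matches the assignment made by Advanced SOM & K-means in Table 2.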
4.2.4. Comparison of calculation time
Replacing sequential SOM training with the batch training algorithm greatly reduces the training time. In addition, using the subtractive clustering algorithm to obtain the initial centers for K-means leads to a fast solution. The performance tests were run on a computer with 4 GB of memory and a 2.4 GHz Intel Core i3 CPU, with the following results: SOM & K-means, 1599 s; Advanced SOM & K-means, 62 s; FKM, 488 s. Advanced SOM & K-means is thus about 26 times faster than SOM & K-means and about 8 times faster than FKM.
5. CONCLUSION
The data analysis presented in this work has been tested and validated using the real data of one utility and well-known data samples. Among the three algorithms examined in this paper, the proposed Advanced SOM & K-means gives the best results and the shortest calculation time. This algorithm overcomes some disadvantages of traditional SOM & K-means and FKM. The daily consumption behavior of a real utility has been analyzed by clustering, and the results show that different electricity prices are needed for working days and for weekends. This algorithm can also be used for clustering different groups of customers, which is the basis for applying different tariffs to different customer classes. In future work, the possibility of applying this algorithm to detect the time zones of a Time-of-Use tariff will be studied.
ACKNOWLEDGEMENTS
The authors would like to thank the HCMC University of Technology and the HCMC University of Technology and Education for their support.
REFERENCES
[1] G. Chicco, et al., “Customer characterization options for improving the tariff offer,” IEEE Trans. Power Syst,
vol/issue: 18(1), pp 381-387, 2003.
[2] D. Gerbec, et al., “Determination and allocation of typical load profiles to the eligible customers,” in Proc IEEE
Bologna Power Tech, Bologna Italy, 2003.
[3] S. Valero, et al., “Methods for customer and demand response policies selection in new electricity markets,” IET
Gener. Transm. Distrib., vol/issue: 1(1), pp. 104-110, 2007.
[4] S. H. Jung, et al., “Prediction Data Processing Scheme using an Artificial Neural Network and Data Clustering for
Big Data,” International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 6(1), pp. 330-336,
2016.
[5] A. Bedboudi, et al., “An Heterogeneous Population-Based Genetic Algorithm for Data Clustering,” Indonesian
Journal of Electrical Engineering and Informatics (IJEEI), vol/issue: 5(3), pp. 275-284, 2017.
[6] M. Sahu, et al., “Parametric Comparison of K-means and Adaptive K-means Clustering Performance on Different
Images,” International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 7(2), pp. 810-817,
2017.
[7] S. L. Chiu, “Fuzzy model identification based on cluster estimation,” Journal of Intelligent and Fuzzy Systems,
vol/issue: 2(3), 1994.
[8] N. Jin, et al., “Subgroup discovery in smart electricity meter data,” Industrial Informatics, IEEE Transactions on,
vol/issue: 10(2), pp. 1327-1336, 2014.
[9] I. Dent, et al., “Variability of behaviour in electricity load profile clustering; who does things at the same time each
day,” in Advances in Data Mining. Applications and Theoretical Aspects, ser. Lecture Notes in Computer Science,
P. Perner, Ed. Springer International Publishing, vol. 8557, pp. 70-84, 2014.
[10] R. Al-Otaibi, et al., “Feature Construction and Calibration for Clustering Daily Load Curves from Smart Meter
Data,” Industrial Informatics, IEEE Transactions on, vol/issue: 12(2), pp. 645-654, 2016.
[11] S. V. Verdú, et al., “Classification, Filtering, and Identification of Electrical Customer Load Patterns Through the
use of Self-Organizing Maps,” IEEE Transactions on power systems, vol/issue: 21(4), 2006.
[12] M. El-Tarabily, et al., “A PSO – Based on Subtractive Data Clustering Algorithm,” International Journal of
Research in Computer Science, vol/issue: 3(2), pp. 1-9, 2013.
[13] N. R. Pal and J. C. Bezdek, “On Cluster Validity for the Fuzzy c-means model,” IEEE Trans. Fuzzy Syst., vol/issue: 3(3), pp. 370-379, 1995.
[14] P. T. T. Binh, et al., “Determination of Representative Load Curve based on Fuzzy K-Means,” Proc. PEOCO,
2010.
[15] R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Annals of Eugenics, vol. 7, Part II, pp. 179-188, 1936.
[16] O. L. Mangasarian and W. H. Wolberg, “Cancer diagnosis via linear programming,” SIAM News, vol/issue: 23(5),
pp. 1-18, 1990.
[17] Forina, et al., “An Extendible Package for Data Exploration, Classification and Correlation,” Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy.
BIOGRAPHIES OF AUTHORS
Phan Thi Thanh Binh received her Ph.D. degree in electrical engineering from Kiev Polytechnic University, Ukraine, in 1995. Currently, she is an Assoc. Professor and lecturer in the Faculty of Electrical and Electronics Engineering, HCMUT. Her main research interests are power systems stability, power systems operation and control, load forecasting, and data mining.
Trong Nghia Le received his M.Sc. degree in electrical engineering from Ho Chi Minh City University of Technology and Education (HCMUTE), Vietnam, in 2012. Currently, he is a lecturer in the Faculty of Electrical and Electronics Engineering, HCMUTE. His main research interests are load shedding in power systems, power systems stability, load forecasting, and distribution networks.
Nui Pham Xuan received his M.Sc. degree in electrical engineering from Ho Chi Minh City University of Technology, Vietnam, in 2013. Currently, he works at Quality Assurance and Testing Center 3 (QUATEST 3). His main research interest is data mining.

ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx

Advanced SOM & K Mean Method for Load Curve Clustering

All rights reserved.

Corresponding Author:
Phan Thi Thanh Binh,
Department of Electrical and Electronics Engineering, HCMC University of Technology,
268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City, Vietnam.
Email: pttbinh@hcmut.edu.vn

1. INTRODUCTION
Load curve classification matters because it lets a utility identify the characteristic features of each group within one class of consumers [1]-[2]. The main features that influence electricity consumption, such as seasonal factors and weekday factors, can be extracted in this way, and utilities can then set tariffs by season or by day of the week. Some utilities charge different electricity prices in winter and summer. Others price working days differently from weekends with a clear purpose: to shift load from working days to the weekend. Many utilities design their demand response policy for each customer group that shares the same load curve shape [3].

Load curve classification is a clustering problem with a large amount of input data, because daily load curves covering months or years must be considered. From a data mining point of view, clustering such big data is necessary to extract useful information. Many authors have concentrated on data clustering based on the K-means algorithm because it is easy to implement and applies even to large data sets. Jung, et al. combined K-means with principal component analysis to analyze and classify user data efficiently [4]. However, as mentioned in [5]-[7], the results of K-means depend strongly on the initial values of the centers, and this affects the clustering results. To overcome this drawback, Bedboudi, et al. combined K-means with a genetic algorithm, while Sahu, et al. used Adaptive K-means [5], [6]. Chiu presented the subtractive method to remove the influence of center initialization [7].
For load curve clustering, many works rely on dimensionality reduction to simplify the models or to reduce the computation time, for example [8]-[10]. In these works, feature selection or feature construction is the key to clustering. For example, [10] proposed three ways to construct features that are especially suitable for smart metering: conditional filters on time-resolution based features, calibration and normalization, and the use of profile errors. Other works keep the advantages of the K-means algorithm and combine it with a dimensionality reduction algorithm for load curve clustering.

The popular clustering technique based on this combination is SOM & K-means. With a large amount of input data, SOM & K-means is a prominent approach for clustering. In [11] the technique follows a two-level approach: first, the data set is clustered using the SOM with the sequential training algorithm, which yields a set of prototype vectors. In the second level, the SOM prototypes are clustered by K-means. This method inherits the weak points of K-means, so its accuracy is limited. In addition, the sequential training algorithm for the SOM is time-consuming.

To take full advantage of SOM & K-means on big data while overcoming its drawbacks, this paper uses the subtractive clustering method in the second level. Choosing the effective radius is, however, a key question in this clustering procedure. We propose applying the PSO technique to find the optimum radius in order to improve the accuracy. The paper also uses another SOM training scheme, the batch training algorithm, to shorten the calculation time. To validate the proposed method, the Fuzzy K-means algorithm is also applied for comparison.
The work is organized as follows: the mathematical definitions of SOM, K-means, Fuzzy K-means and PSO are given in Section 2; the proposed algorithm (denoted Advanced SOM & K-means) is presented in Section 3; tests on well-known data sets and one case study are presented in Section 4, comparing the results of SOM & K-means, Fuzzy K-means and the proposed algorithm.

2. SOME MATHEMATICAL DEFINITIONS
2.1. SOM
The SOM consists of a regular, usually two-dimensional (2-D), grid of map units. Data points lying near each other in the input space are mapped onto nearby map units. The SOM can be interpreted as a topology-preserving mapping from the input space onto the 2-D grid of map units.

In our work, two algorithms for training the maps were examined: the sequential training algorithm and the batch training algorithm. The neuron whose weight vector is closest to the input vector is called the best-matching unit (BMU), denoted by c. In the batch training algorithm, instead of using a single data vector at a time, the whole data set is presented to the map before any adjustments are made (hence the name "batch"). In each training step, the data set is partitioned according to the Voronoi regions of the map weight vectors, i.e. each data vector belongs to the data set of the closest map unit. After this, the new weight vectors are calculated as follows:

$$ m_i(t+1) = \frac{\sum_{j=1}^{n} h_{ic}(t)\, x_j}{\sum_{j=1}^{n} h_{ic}(t)} \tag{1} $$

where t denotes time; $x_j$ is an input vector; $h_{ic}(t)$ is the neighborhood kernel around the winner unit; $c = \arg\min_k \lVert x_j - m_k \rVert$ is the index of the BMU of data sample $x_j$; and $m_k$ is the synaptic weight vector of unit k.

2.2. The K-means algorithm
The K-means algorithm is a well-known algorithm in the clustering field. For a given number of clusters k, the procedure partitions the data set by minimizing:

$$ F = \sum_{i=1}^{k} \sum_{j=1}^{n} \lVert x_j - z_i \rVert^2 \rightarrow \min \tag{2} $$

where $\lVert \cdot \rVert$ is the Euclidean distance between $x_j$ and $z_i$; $z_i$ is the center of the i-th cluster; k is the number of cluster centers; and n is the number of data points.

The Davies-Bouldin (DB) index is applied for hard clustering [6]. The optimal number of clusters corresponds to the minimum value of the DB index.
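The paper does not spell out the DB index, so the following is only a minimal sketch that assumes the standard Davies-Bouldin definition: the scatter of a cluster is the mean distance of its points to their center, and each cluster contributes its worst ratio of summed scatters to center separation. The function names (`euclid`, `centroid`, `kmeans_objective`, `davies_bouldin`) are illustrative and not taken from the paper. `kmeans_objective` evaluates (2) in the usual hard-clustering reading, where each point is counted only against the center of its own cluster.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans_objective(clusters):
    """Objective (2): squared distance of every point to its own cluster center."""
    total = 0.0
    for pts in clusters:
        z = centroid(pts)
        total += sum(euclid(x, z) ** 2 for x in pts)
    return total


def davies_bouldin(clusters):
    """Davies-Bouldin index of a hard partition (list of point lists).

    Assumes at least two clusters with distinct centers.
    """
    centers = [centroid(pts) for pts in clusters]
    # Scatter S_i: mean distance of the points of cluster i to its center.
    scatter = [
        sum(euclid(x, z) for x in pts) / len(pts)
        for pts, z in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        ratios = [
            (scatter[i] + scatter[j]) / euclid(centers[i], centers[j])
            for j in range(k)
            if j != i
        ]
        worst.append(max(ratios))
    return sum(worst) / k


# Two partitions of the same four points: a compact one and a poor one.
compact = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
poor = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 1.0), (10.0, 1.0)]]
```

A lower index means compact, well-separated clusters, so `davies_bouldin(compact)` is smaller than `davies_bouldin(poor)`; this is the quantity that is minimized when choosing the number of clusters.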
2.3. The subtractive method
Consider a collection of n data points $\{x_1, x_2, \dots, x_n\}$ in an M-dimensional space. If each data point is considered as a possible cluster center, then the potential of data point $x_i$ is:

$$ P_i = \sum_{k=1}^{n} e^{-\alpha \lVert x_i - x_k \rVert^2} \tag{3} $$

with $\alpha = 4 / r_a^2$. The constant $r_a$ is effectively the radius defining a neighborhood. The data point with the highest potential is selected as the first cluster center. Let $x_1^*$ be the location of the first cluster center and $P_1^*$ be its potential value. The potential of each data point $x_i$ is then revised by the formula:

$$ P_i \Leftarrow P_i - P_1^* \, e^{-\beta \lVert x_i - x_1^* \rVert^2} \tag{4} $$

with $\beta = 4 / r_b^2$, where $r_b$ is the effective radius and is equal to $1.25\, r_a$. The data point with the highest remaining potential is selected as the second cluster center. The process continues until the remaining potential of all data points falls below some fraction of the potential of the first cluster center $P_1^*$.

2.4. The PSO [12]
PSO is based on the phenomenon of collective intelligence inspired by the social behavior of bird flocking or fish schooling. The fitness function is evaluated for each particle in the swarm and is compared with the fitness of the best previous position of that particle, $pbest_i$, and with the fitness of the global best particle among all particles in the swarm, $gbest$. After finding these two best values, the i-th particle evolves by updating its velocity and position according to the following equations:

$$ V_i^{k+1} = w V_i^{k} + c_1\, rand_1 \times (pbest_i - s_i^{k}) + c_2\, rand_2 \times (gbest - s_i^{k}) \tag{5} $$

$$ s_i^{k+1} = s_i^{k} + V_i^{k+1} \tag{6} $$

where $s^{k}$ is the current searching point; $s^{k+1}$ is the modified searching point; $V^{k}$ is the current velocity; $V^{k+1}$ is the modified velocity; $rand_1$ and $rand_2$ are random values in (0, 1) following a normal distribution; $c_1$ and $c_2$ are constants called acceleration coefficients; and w is a weighting coefficient.
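As an illustration of updates (5) and (6), here is a minimal one-dimensional sketch. Everything beyond the two update rules is an assumption made for the example: the function name `pso_minimize`, the parameter values (`w=0.7`, `c1=c2=1.5`, 20 particles, 50 iterations), the clamping of positions to the search interval, and the use of uniform random draws as in canonical PSO.

```python
import random


def pso_minimize(fitness, low, high, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a scalar function on [low, high] with a basic PSO loop."""
    rng = random.Random(seed)
    s = [rng.uniform(low, high) for _ in range(n_particles)]  # positions s_i
    v = [0.0] * n_particles                                   # velocities V_i
    pbest = s[:]                                  # best position of each particle
    pbest_fit = [fitness(x) for x in s]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]     # best position of the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()   # assumption: uniform draws
            # Eq. (5): inertia + pull toward pbest_i + pull toward gbest.
            v[i] = (w * v[i]
                    + c1 * r1 * (pbest[i] - s[i])
                    + c2 * r2 * (gbest - s[i]))
            # Eq. (6), followed by clamping to the interval (an addition).
            s[i] = min(max(s[i] + v[i], low), high)
            f = fitness(s[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = s[i], f
                if f < gbest_fit:
                    gbest, gbest_fit = s[i], f
    return gbest, gbest_fit


# Toy usage: search the interval 0.15..0.8 for the minimizer of a quadratic.
best_r, best_fit = pso_minimize(lambda r: (r - 0.5) ** 2, 0.15, 0.8)
```

In the proposed method the scalar being searched would be the radius and the fitness would be the DB index of the resulting clustering; the quadratic above is only a stand-in so that the sketch runs on its own.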
The values of $c_1$ and $c_2$ control the weight balance of $pbest$ and $gbest$ in deciding the particle's next movement.

2.5. Fuzzy K-means (FKM) [13]
FKM is a highly flexible clustering method with the following objective function:

$$ F = \sum_{i=1}^{K} \sum_{j=1}^{n} w_{ij}^{\alpha} \, d^2(z_i, x_j) \rightarrow \min \tag{7} $$

where $\alpha$ is a weighting exponent; $w_{ij}$ is the value of the membership function; and $d(z_i, x_j)$ is the Euclidean distance between $x_j$ and the center $z_i$ of cluster i. Many criteria can be applied to determine the final number of clusters. This paper uses the method in [14], which is based on the Bellman-Zadeh principle.

3. THE PROPOSED ALGORITHM
The proposed algorithm (denoted Advanced SOM & K-means) is shown in Figure 1. Batch training is used for the SOM, which shortens the training time. Subtractive clustering is then applied to find the initial centers for K-means. Traditionally, the radius $r_a$ in (3) takes values from 0.15 to 0.8. Our examination shows that the smaller $r_a$ is, the larger the number of clusters obtained. The optimum radius is therefore taken to be the one that leads to the smallest value of the DB index. To find a suitable radius $r_a$, this paper applies the PSO algorithm.
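The subtractive step of Section 2.3, and the dependence of the number of centers on the radius noted above, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function name `subtractive_centers`, the stopping fraction `stop_ratio=0.15` and the toy data are assumptions, and the data are taken to be already scaled to the unit square.

```python
import math


def subtractive_centers(points, ra, rb_factor=1.25, stop_ratio=0.15):
    """Pick cluster centers by subtractive clustering, Eqs. (3) and (4).

    ra         : neighborhood radius r_a
    rb_factor  : r_b = rb_factor * r_a (1.25, as stated in the text)
    stop_ratio : stop once the best remaining potential drops below this
                 fraction of the first center's potential (assumed value)
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (rb_factor * ra) ** 2

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Eq. (3): potential of every data point.
    potential = [
        sum(math.exp(-alpha * sqdist(x, y)) for y in points) for x in points
    ]
    first = max(potential)
    centers = []
    while True:
        best = max(range(len(points)), key=lambda i: potential[i])
        if potential[best] < stop_ratio * first:
            break
        center, p_star = points[best], potential[best]
        centers.append(center)
        # Eq. (4): subtract the influence of the newly chosen center.
        potential = [
            p - p_star * math.exp(-beta * sqdist(x, center))
            for p, x in zip(potential, points)
        ]
    return centers


# Two tight groups of points in the unit square (toy data).
toy = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12),
       (0.90, 0.90), (0.88, 0.90), (0.90, 0.88)]
wide = subtractive_centers(toy, ra=0.5)
narrow = subtractive_centers(toy, ra=0.02)
```

With the wide radius each group yields a single center, while the very narrow radius splits the groups into more centers. The returned centers would then seed K-means, and the DB index of the result would be the fitness that PSO minimizes over the radius.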
[Figure 1. The proposed algorithm. Flowchart boxes: Start; Input data; Data processing; SOM (batch training); Subtractive clustering; K-means; DB index; decision "Min?" (Yes: Data clusters, End; No: Particle Swarm Optimization, ra).]

4. EXPERIMENTAL STUDIES
4.1. Testing on the well-known data samples
Three real and well-known data sets (Iris, WBCD, Wine) are used. These data sets appear in many works for testing clustering techniques. The Iris Plants Database [15] contains 150 samples (4 attributes in each sample) in 3 classes: Iris Setosa, Iris Versicolour and Iris Virginica (50 samples for each class). The Wisconsin Breast Cancer Database [16] was built at the University of Wisconsin Hospitals. It contains 683 tests (10 attributes in each test) in 2 classes: benign (65.5%) and malignant (34.5%). The last one [17] contains data obtained from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wine. Three algorithms, SOM & K-means, FKM and Advanced SOM & K-means, are applied, and the results are given in Table 1. Only Advanced SOM & K-means finds the correct number of clusters on all three data sets.

Table 1. Testing results on well-known data samples (number of clusters found by each algorithm)

| Data sample | Correct number of clusters | SOM & K-means | FKM | Advanced SOM & K-means |
|---|---|---|---|---|
| Iris | 3 | 2 | 2 | 3 |
| WBCD | 2 | 3 | 2 | 2 |
| Wine | 3 | 2 | 7 | 3 |

4.2. Application for load curve clustering
4.2.1. The input data
The 365 daily load curves of one utility in the South of Vietnam are the input data. Each load curve is regarded as a vector of 24 attributes (24 hours). The Euclidean distance between two load curves j and k is defined as:

$$ d_{jk} = \sqrt{\sum_{i=1}^{24} (x_{ij} - x_{ik})^2} \tag{8} $$

where $x_{ij}$ is the load at hour i of load curve j.
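For concreteness, distance (8) can be written as a short function. The name `load_curve_distance` and the sample values are illustrative assumptions, not data from the study.

```python
import math


def load_curve_distance(curve_j, curve_k):
    """Euclidean distance (8) between two daily load curves of 24 hourly values."""
    if len(curve_j) != 24 or len(curve_k) != 24:
        raise ValueError("each daily load curve needs 24 hourly values")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_j, curve_k)))


# Two flat hypothetical curves that differ by 3 units in every hour.
flat_a = [100.0] * 24
flat_b = [103.0] * 24
d_ab = load_curve_distance(flat_a, flat_b)  # sqrt(24 * 3**2) = sqrt(216)
```

Because every hour contributes equally, a constant offset of 3 units over 24 hours gives a distance of sqrt(216), about 14.7 units.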
4.2.2. Extract the information
By looking into each cluster after the clustering process, the main factors that characterize the cluster can be extracted. For example, if the load curves in one cluster belong to the rainy season while those in another cluster belong to the dry season, a seasonal tariff is warranted. If weekends and working days form different clusters, a weekend tariff should be introduced.

4.2.3. Implementation
For the implementation, the daily load curves of one utility in 2012 were used. The tariff is time-of-use (TOU) and is the same for every day of the week. All three algorithms return the same number of clusters (two), called the holiday cluster and the normal day cluster. No rainy-season or dry-season clusters appear. All Sundays and public holidays belong to the holiday cluster. This result is plausible: Ho Chi Minh City has a tropical climate and hosts many industrial parks and foreign-invested enterprises, so the seasonal difference in load is not pronounced.

The standard holiday cluster consists of all Sundays and public holidays according to Vietnam's Labor Code, 63 days in total (Figure 2). This underlines the need for different electricity prices for working days and holidays. However, the results show more than 63 days in the holiday cluster: some Saturdays and working days also fall into it (Table 2).

Figure 2. The load curves of 63 standard public holidays

The results of the three algorithms differ (Table 2). To assess the accuracy of the three algorithms, the distance from each disputed day to the center of the standard holidays (63 days) and to the center of the normal days is calculated (Table 3).
The results of FKM and Advanced SOM & K-means coincide except for 4 days (Saturdays: 4-Feb., 11-Feb., 18-Feb., 6-Oct.). According to FKM, these days belong to the holiday cluster. In Table 3, however, these Saturdays are closer to the center of the standard normal day cluster than to the center of the standard holiday cluster, so they should belong to the normal day cluster. FKM is therefore less accurate than Advanced SOM & K-means.

The results of SOM & K-means and Advanced SOM & K-means coincide except for one day (Tuesday, 31-Jan.). This Tuesday is farther from the center of the standard normal day cluster than from the center of the standard holiday cluster, so it should belong to the holiday cluster. The Advanced SOM & K-means algorithm is therefore more accurate than SOM & K-means.

Table 2. Number of weekdays in Holiday cluster for 3 algorithms

| Weekday | SOM & K-means | FKM | Advanced SOM & K-means |
|---|---|---|---|
| Monday | 7 | 7 | 7 |
| Tuesday | 3 | 4 | 4 |
| Wednesday | 3 | 3 | 3 |
| Thursday | 2 | 2 | 2 |
| Friday | 2 | 2 | 2 |
| Saturday | 4 | 8 | 4 |
| Sunday | 53 | 53 | 53 |
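The nearest-center argument above can be checked directly against the distances reported in Table 3. The sketch below only restates that comparison; the names `table3`, `nearer_cluster` and `assignments` are illustrative.

```python
# (day, avg. distance to SHCC, avg. distance to SNDCC), as reported in Table 3.
table3 = [
    ("31-Jan-12", 1030.48, 1988.70),
    ("4-Feb-12", 1742.42, 1052.60),
    ("11-Feb-12", 1753.69, 977.21),
    ("18-Feb-12", 1742.35, 1014.14),
    ("6-Oct-12", 1820.48, 1046.44),
]


def nearer_cluster(dist_holiday, dist_normal):
    """Assign a day to the cluster whose standard center is closer."""
    return "holiday" if dist_holiday < dist_normal else "normal"


assignments = {day: nearer_cluster(dh, dn) for day, dh, dn in table3}
for day, cluster in assignments.items():
    print(day, cluster)
```

Under this rule 31-Jan-12 falls in the holiday cluster and the four Saturdays fall in the normal day cluster, which matches the assignment of Advanced SOM & K-means described in the text.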
Table 3. Distance of all different days to standard holiday cluster's center (SHCC) and standard normal day cluster's center (SNDCC)

| Day | Avg. dist. to the SHCC | Avg. dist. to the SNDCC |
|---|---|---|
| 31-Jan-12 | 1030.48 | 1988.70 |
| 4-Feb-12 | 1742.42 | 1052.60 |
| 11-Feb-12 | 1753.69 | 977.21 |
| 18-Feb-12 | 1742.35 | 1014.14 |
| 6-Oct-12 | 1820.48 | 1046.44 |

This shows that the Advanced SOM & K-means algorithm overcomes the weak point of the algorithms based on K-means, and that choosing the optimal radius in the subtractive method improves the accuracy.

4.2.4. Comparison of calculation time
Replacing sequential SOM training with the batch training algorithm greatly reduces the training time. In addition, using the subtractive clustering algorithm to obtain the initial centers for K-means leads to a fast solution. The performance tests were run on a computer with 4 GB of memory and a 2.4 GHz Intel Core i3 CPU, with the following results: SOM & K-means, 1599 s; Advanced SOM & K-means, 62 s; FKM, 488 s.

5. CONCLUSION
The data analysis presented in this work has been tested and validated using real data from one utility and well-known data samples. Among the three algorithms examined in this paper, the proposed Advanced SOM & K-means gives the best result and the shortest calculation time. This algorithm overcomes some disadvantages of traditional SOM & K-means and FKM. The daily consumption behavior of a real utility has been analyzed by clustering, and the results show that different electricity prices are needed for working days and for weekends. The algorithm can also be used to cluster different groups of customers, which is the basis for applying different tariffs to different customer classes. In future work, the possibility of applying this algorithm to detect the time zones of a Time-of-Use tariff will be studied.
ACKNOWLEDGEMENTS
The authors would like to thank the HCMC University of Technology and the HCMC University of Technology and Education for their support.

REFERENCES
[1] G. Chicco, et al., “Customer characterization options for improving the tariff offer,” IEEE Trans. Power Syst., vol/issue: 18(1), pp. 381-387, 2003.
[2] D. Gerbec, et al., “Determination and allocation of typical load profiles to the eligible customers,” in Proc. IEEE Bologna Power Tech, Bologna, Italy, 2003.
[3] S. Valero, et al., “Methods for customer and demand response policies selection in new electricity markets,” IET Gener. Transm. Distrib., vol/issue: 1(1), pp. 104-110, 2007.
[4] S. H. Jung, et al., “Prediction Data Processing Scheme using an Artificial Neural Network and Data Clustering for Big Data,” International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 6(1), pp. 330-336, 2016.
[5] A. Bedboudi, et al., “An Heterogeneous Population-Based Genetic Algorithm for Data Clustering,” Indonesian Journal of Electrical Engineering and Informatics (IJEEI), vol/issue: 5(3), pp. 275-284, 2017.
[6] M. Sahu, et al., “Parametric Comparison of K-means and Adaptive K-means Clustering Performance on Different Images,” International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 7(2), pp. 810-817, 2017.
[7] S. L. Chiu, “Fuzzy model identification based on cluster estimation,” Journal of Intelligent and Fuzzy Systems, vol/issue: 2(3), 1994.
[8] N. Jin, et al., “Subgroup discovery in smart electricity meter data,” IEEE Transactions on Industrial Informatics, vol/issue: 10(2), pp. 1327-1336, 2014.
[9] I. Dent, et al., “Variability of behaviour in electricity load profile clustering; who does things at the same time each day,” in Advances in Data Mining. Applications and Theoretical Aspects, ser. Lecture Notes in Computer Science, P. Perner, Ed. Springer International Publishing, vol. 8557, pp. 70-84, 2014.
[10] R. Al-Otaibi, et al., “Feature Construction and Calibration for Clustering Daily Load Curves from Smart Meter Data,” IEEE Transactions on Industrial Informatics, vol/issue: 12(2), pp. 645-654, 2016.
[11] S. V. Verdú, et al., “Classification, Filtering, and Identification of Electrical Customer Load Patterns Through the use of Self-Organizing Maps,” IEEE Transactions on Power Systems, vol/issue: 21(4), 2006.
[12] M. El-Tarabily, et al., “A PSO-Based Subtractive Data Clustering Algorithm,” International Journal of Research in Computer Science, vol/issue: 3(2), pp. 1-9, 2013.
[13] N. R. Pal and J. C. Bezdek, “On Cluster Validity for the Fuzzy c-means model,” IEEE Trans. Fuzzy Syst., vol/issue: 3(3), pp. 370-379, 1995.
[14] P. T. T. Binh, et al., “Determination of Representative Load Curve based on Fuzzy K-Means,” Proc. PEOCO, 2010.
[15] R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Annals of Eugenics, vol. 7, Part II, pp. 179-188, 1936.
[16] O. L. Mangasarian and W. H. Wolberg, “Cancer diagnosis via linear programming,” SIAM News, vol/issue: 23(5), pp. 1-18, 1990.
[17] Forina, et al., “An Extendible Package for Data Exploration, Classification and Correlation,” Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy.

BIOGRAPHIES OF AUTHORS
Phan Thi Thanh Binh received her Ph.D. degree in electrical engineering from Kiev Polytechnique University, Ukraine, in 1995. Currently, she is an Assoc. Professor and lecturer in the Faculty of Electrical and Electronics Engineering, HCMUT. Her main research interests are power systems stability, power systems operation and control, load forecasting, and data mining.

Trong Nghia Le received his M.Sc. degree in electrical engineering from Ho Chi Minh City University of Technology and Education (HCMUTE), Vietnam, in 2012. Currently, he is a lecturer in the Faculty of Electrical and Electronics Engineering, HCMUTE. His main research interests are load shedding in power systems, power systems stability, load forecasting, and distribution networks.

Nui Pham Xuan received his M.Sc. degree in electrical engineering from Ho Chi Minh City University of Technology, Vietnam, in 2013. Currently, he works at Quality Assurance and Testing Center 3 (QUATEST 3). His main research interest is data mining.