This part briefly introduces fuzzy rough set theory (FRST) and its application to data analysis. Because researchers have proposed many FRST variants, this introduction covers only some basic concepts of FRST based on (Radzikowska and Kerre, 2002).
Just like in RST (see A.Introduction-RoughSets), a data set is represented as a table called an information system \mathcal{A} = (U, A), where U is a non-empty finite set of objects forming the universe of discourse (note: it refers to all instances/experiments/rows in a data set) and A is a non-empty finite set of attributes, such that a : U \to V_{a} for every a \in A. The set V_{a} is the set of values that attribute a may take. Information systems that involve a decision attribute, containing the class or decision value of each object, are called decision systems (or decision tables). More formally, a decision system is a pair \mathcal{A} = (U, A \cup \{d\}), where d \notin A is the decision attribute. The elements of A are called conditional attributes. Unlike RST, however, FRST has several ways to express indiscernibility.
A fuzzy indiscernibility relation (FIR) is any fuzzy relation that determines the degree to which two objects are indiscernible. We consider some special cases of FIR.
fuzzy tolerance relation: a relation that is reflexive and symmetric, where
reflexive: R(x,x) = 1
symmetric: R(x,y) = R(y,x)
similarity relation (also called fuzzy equivalence relation): a relation that is reflexive, symmetric, and also transitive, where transitivity is defined as
min(R(x,y), R(y,z)) ≤ R(x,z)
\mathcal{T}-similarity relation (also called fuzzy \mathcal{T}-equivalence relation): a fuzzy tolerance relation that is also \mathcal{T}-transitive, i.e.,
\mathcal{T}(R(x,y), R(y,z)) ≤ R(x,z), for a given triangular norm \mathcal{T}.
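The properties listed above can be checked mechanically on a small relation matrix. The following is a minimal Python sketch, not code from the package: the function names, the numerical tolerance `tol`, and the example matrix `R` are all illustrative assumptions.

```python
from itertools import product

def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)

def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

def is_reflexive(R, tol=1e-9):
    # R(x,x) = 1 for every object x
    return all(abs(R[i][i] - 1.0) <= tol for i in range(len(R)))

def is_symmetric(R, tol=1e-9):
    # R(x,y) = R(y,x) for every pair of objects
    n = range(len(R))
    return all(abs(R[i][j] - R[j][i]) <= tol for i, j in product(n, n))

def is_t_transitive(R, tnorm, tol=1e-9):
    # T(R(x,y), R(y,z)) <= R(x,z) for every triple of objects
    n = range(len(R))
    return all(tnorm(R[i][j], R[j][k]) <= R[i][k] + tol
               for i, j, k in product(n, n, n))

# Hypothetical relation on three objects (values chosen for illustration).
R = [
    [1.0,   0.75,  0.5],
    [0.75,  1.0,   0.625],
    [0.5,   0.625, 1.0],
]

print("tolerance relation:", is_reflexive(R) and is_symmetric(R))
print("min-transitive:", is_t_transitive(R, t_min))
print("Lukasiewicz-transitive:", is_t_transitive(R, t_lukasiewicz))
```

In this example the matrix is a fuzzy tolerance relation that is \mathcal{T}-transitive for the Lukasiewicz t-norm but not for the minimum, which illustrates that a \mathcal{T}-similarity relation need not be a similarity relation in the min-transitive sense.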
The following equations define tolerance relations R_a on a quantitative attribute a, as proposed by (Jensen and Shen, 2009).
eq.1: R_a(x,y) = 1 - \frac{|a(x) - a(y)|}{|a_{max} - a_{min}|}
eq.2: R_a(x,y) = \exp(-\frac{(a(x) - a(y))^2}{2 σ_a^2})
eq.3: R_a(x,y) = \max(\min(\frac{a(y) - a(x) + σ_a}{σ_a}, \frac{a(x) - a(y) + σ_a}{σ_a}), 0)
where σ_{a}^2 is the variance of attribute a, and a_{min} and a_{max} are the minimal and maximal values of the data supplied by the user.
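To make eq.1 to eq.3 concrete, here is a minimal Python sketch that evaluates each relation for a pair of attribute values. It is not the package's implementation: the function names and the sample values are hypothetical, and using the sample standard deviation for σ_a is an assumption, since the text only says "variance".

```python
import math
from statistics import stdev

def relation_eq1(ax, ay, a_min, a_max):
    """eq.1: one minus the absolute difference scaled by the attribute range."""
    return 1.0 - abs(ax - ay) / abs(a_max - a_min)

def relation_eq2(ax, ay, sigma):
    """eq.2: Gaussian-shaped similarity with standard deviation sigma."""
    return math.exp(-((ax - ay) ** 2) / (2.0 * sigma ** 2))

def relation_eq3(ax, ay, sigma):
    """eq.3: triangular similarity that reaches 0 at a distance of sigma."""
    return max(min((ay - ax + sigma) / sigma,
                   (ax - ay + sigma) / sigma), 0.0)

# Hypothetical values of one quantitative attribute a.
values = [1.0, 2.0, 4.0, 5.0]
a_min, a_max = min(values), max(values)
sigma = stdev(values)  # assumption: sample standard deviation

x, y = values[0], values[1]
print("eq.1:", relation_eq1(x, y, a_min, a_max))
print("eq.2:", relation_eq2(x, y, sigma))
print("eq.3:", relation_eq3(x, y, sigma))
```

All three functions return 1 when a(x) = a(y) and are symmetric in their first two arguments, so each induces a fuzzy tolerance relation in the sense defined earlier.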
Additionally, other relations have been implemented in BC.IND.relation.FRST.
For a qualitative (i.e., nominal) attribute a, the classical manner of discerning objects is used, i.e., R_a(x,y) = 1 if a(x) = a(y) and R_a(x,y) = 0 otherwise. We can then define, for any subset B of A, the fuzzy B-indiscernibility relation by
R_B(x,y) = \mathcal{T}_{a \in B}(R_a(x,y)),
where \mathcal{T} is a t-norm operator applied across all attributes a \in B, for instance the minimum, product, or Lukasiewicz t-norm. In general, \mathcal{T} can be replaced by any aggregation operator, e.g., the average.
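The aggregation over the attributes in B can be sketched as a fold of a t-norm over the per-attribute degrees. The Python snippet below is only an illustration under stated assumptions: the function names and the example degrees are hypothetical, and starting the fold at 1 relies on 1 being the neutral element of every t-norm.

```python
from functools import reduce

def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)

def t_product(a, b):
    """Product t-norm."""
    return a * b

def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm."""
    return max(0.0, a + b - 1.0)

def crisp_relation(ax, ay):
    """Relation for a nominal attribute: 1 if the values match, else 0."""
    return 1.0 if ax == ay else 0.0

def indiscernibility(degrees, tnorm):
    """Aggregate the per-attribute degrees R_a(x,y), a in B, with a t-norm."""
    return reduce(tnorm, degrees, 1.0)  # 1 is neutral for any t-norm

# Hypothetical degrees R_a(x,y) for two quantitative attributes and one
# nominal attribute on which x and y agree.
degrees = [0.75, 0.5, crisp_relation("red", "red")]

print("minimum:    ", indiscernibility(degrees, t_min))
print("product:    ", indiscernibility(degrees, t_product))
print("Lukasiewicz:", indiscernibility(degrees, t_lukasiewicz))
```

Because every t-norm is bounded above by the minimum, the minimum gives the largest aggregated degree of the three; a single nominal attribute on which x and y differ forces R_B(x,y) = 0 for any t-norm.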
In the context of FRST, according to (Radzikowska and Kerre, 2002), the lower and upper approximations are generalized by means of an implicator \mathcal{I} and a t-norm \mathcal{T}. The fuzzy B-lower and B-upper approximations of a fuzzy set A in U are defined as
(R_B \downarrow A)(y) = inf_{x \in U} \mathcal{I}(R_B(x,y), A(x))
(R_B \uparrow A)(y) = sup_{x \in U} \mathcal{T}(R_B(x,y), A(x))
The underlying meaning is that R_B \downarrow A is the set of elements necessarily satisfying the concept (strong membership), while R_B \uparrow A is the set of elements possibly belonging to the concept (weak membership). Many other ways to define the approximations can be found in BC.LU.approximation.FRST. These were mainly designed to deal with noise in the data and to make the approximations more robust.
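The implicator/t-norm definitions of the lower and upper approximations can be sketched in a few lines of Python. This is an illustrative sketch only: the choice of the Lukasiewicz implicator and t-norm, the function names, and the example relation and set are assumptions, not the package's defaults.

```python
def implicator_lukasiewicz(a, b):
    """Lukasiewicz implicator I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def t_lukasiewicz(a, b):
    """Lukasiewicz t-norm T(a, b) = max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)

def lower_approximation(R, A, implicator=implicator_lukasiewicz):
    """(R down A)(y) = inf over x of I(R(x,y), A(x))."""
    n = len(R)
    return [min(implicator(R[x][y], A[x]) for x in range(n))
            for y in range(n)]

def upper_approximation(R, A, tnorm=t_lukasiewicz):
    """(R up A)(y) = sup over x of T(R(x,y), A(x))."""
    n = len(R)
    return [max(tnorm(R[x][y], A[x]) for x in range(n))
            for y in range(n)]

# Hypothetical B-indiscernibility relation on three objects.
R_B = [
    [1.0,  0.75, 0.25],
    [0.75, 1.0,  0.5],
    [0.25, 0.5,  1.0],
]
# Hypothetical concept: the first two objects belong fully, the third not.
A = [1.0, 1.0, 0.0]

print("lower:", lower_approximation(R_B, A))
print("upper:", upper_approximation(R_B, A))
```

For a reflexive relation such as this one, each lower membership is at most A(y) and each upper membership is at least A(y), which matches the reading of strong versus weak membership.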
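Based on fuzzy B-indiscernibility relations, we define the fuzzy B-positive region, for y \in U, by
POS_B(y) = (\cup_{x \in U} R_B \downarrow R_d x)(y)
where R_d x denotes the fuzzy tolerance class of x with respect to the decision attribute d, i.e., (R_d x)(y) = R_d(x,y).
We can define the degree of dependency of d on B, γ_{B}, by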
γ_{B} = \frac{|POS_{B}|}{|U|} = \frac{∑_{x \in U} POS_{B}(x)}{|U|}
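As a worked illustration of the positive region and the degree of dependency, the following Python sketch evaluates both on a three-object example. It makes several assumptions that are not taken from the package: the Lukasiewicz implicator, the maximum as the fuzzy union, a crisp (nominal) decision attribute, and hypothetical function names and data.

```python
def implicator_lukasiewicz(a, b):
    """Lukasiewicz implicator I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def decision_relation(decisions):
    """Crisp R_d for a nominal decision: 1 if the classes match, else 0."""
    return [[1.0 if di == dj else 0.0 for dj in decisions]
            for di in decisions]

def positive_region(R_B, R_d, implicator=implicator_lukasiewicz):
    """POS_B(y) = max over x of (R_B down R_d x)(y)."""
    n = len(R_B)
    pos = []
    for y in range(n):
        best = 0.0
        for x in range(n):
            # Lower approximation of the decision class of x, evaluated at y.
            lower = min(implicator(R_B[z][y], R_d[x][z]) for z in range(n))
            best = max(best, lower)  # fuzzy union taken as the maximum
        pos.append(best)
    return pos

def dependency_degree(pos):
    """gamma_B = (sum of POS_B(x) over U) / |U|."""
    return sum(pos) / len(pos)

# Hypothetical B-indiscernibility relation and decision values.
R_B = [
    [1.0,  0.75, 0.25],
    [0.75, 1.0,  0.5],
    [0.25, 0.5,  1.0],
]
R_d = decision_relation(["yes", "yes", "no"])

pos = positive_region(R_B, R_d)
print("POS_B:", pos)
print("gamma_B:", dependency_degree(pos))
```

The third object is fairly similar to the second one but has a different decision, which is why neither of them reaches full membership in the positive region.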
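A decision reduct is a set B \subseteq A such that γ_{B} = γ_{A} and γ_{B'} < γ_{B} for every B' \subset B, i.e., B preserves the degree of dependency and no attribute can be removed from B without lowering it.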
As we know from rough set concepts (see A.Introduction-RoughSets), we can calculate the decision reducts by constructing the decision-relative discernibility matrix. Based on (Tsang et al, 2008), the discernibility matrix can be defined as follows.
The discernibility matrix is an n \times n matrix (c_{ij}) where, for i,j = 1, …, n,
1) c_{ij} = \{a \in A : 1 - R_{a}(x_i, x_j) ≥ λ_i\} if λ_j < λ_i,
2) c_{ij} = \emptyset, otherwise,
with λ_i = (R_A \downarrow R_{d}x_{i})(x_i) and λ_j = (R_A \downarrow R_{d}x_{j})(x_{j}).
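The construction above can be traced on a tiny example. The Python sketch below follows the two cases of the definition literally; it is not the package's implementation. The attribute names `a1` and `a2`, the relation values, the Lukasiewicz implicator, the minimum as the aggregation for R_A, and the crisp decision relation are all assumptions made for illustration.

```python
def implicator_lukasiewicz(a, b):
    """Lukasiewicz implicator I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def aggregate_min(relations):
    """R_A(x,y) as the minimum of R_a(x,y) over all attributes a."""
    mats = list(relations.values())
    n = len(mats[0])
    return [[min(m[i][j] for m in mats) for j in range(n)] for i in range(n)]

def lambdas(R_A, R_d, implicator=implicator_lukasiewicz):
    """lambda_i = (R_A down R_d x_i)(x_i) for every object x_i."""
    n = len(R_A)
    return [min(implicator(R_A[z][i], R_d[i][z]) for z in range(n))
            for i in range(n)]

def discernibility_matrix(relations, R_d):
    """c_ij = {a : 1 - R_a(x_i,x_j) >= lambda_i} if lambda_j < lambda_i,
    and the empty set otherwise."""
    lam = lambdas(aggregate_min(relations), R_d)
    n = len(lam)
    C = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if lam[j] < lam[i]:
                C[i][j] = {a for a, R in relations.items()
                           if 1.0 - R[i][j] >= lam[i]}
    return lam, C

# Hypothetical per-attribute tolerance relations on three objects.
relations = {
    "a1": [[1.0, 0.75, 0.25], [0.75, 1.0, 0.5], [0.25, 0.5, 1.0]],
    "a2": [[1.0, 0.875, 0.75], [0.875, 1.0, 0.625], [0.75, 0.625, 1.0]],
}
# Crisp decision relation for decisions ("yes", "yes", "no").
R_d = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

lam, C = discernibility_matrix(relations, R_d)
print("lambda:", lam)
for row in C:
    print(row)
```

In this example the only non-empty entry is the one for the first and third objects, which have different decisions and are discerned by `a1` alone.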
Other approaches to constructing the discernibility matrix are described in BC.discernibility.mat.FRST.
Other implementations of the FRST concepts can be found in BC.IND.relation.FRST, BC.LU.approximation.FRST, and BC.positive.reg.FRST.
A. M. Radzikowska and E. E. Kerre, "A Comparative Study of Fuzzy Rough Sets", Fuzzy Sets and Systems, vol. 126, p. 137 - 156 (2002).
D. Dubois and H. Prade, "Rough Fuzzy Sets and Fuzzy Rough Sets", International Journal of General Systems, vol. 17, p. 191 - 209 (1990).
E. C. C. Tsang, D. G. Chen, D. S. Yeung, X. Z. Wang, and J. W. T. Lee, "Attributes Reduction Using Fuzzy Rough Sets", IEEE Trans. Fuzzy Syst., vol. 16, no. 5, p. 1130 - 1141 (2008).
L. A. Zadeh, "Fuzzy Sets", Information and Control, vol. 8, p. 338 - 353 (1965).
R. Jensen and Q. Shen, "New Approaches to Fuzzy-Rough Feature Selection", IEEE Trans. on Fuzzy Systems, vol. 17, no. 4, p. 824 - 838 (2009).
Z. Pawlak, "Rough Sets", International Journal of Computer and Information Sciences, vol. 11, no. 5, p. 341 - 356 (1982).