\documentclass[10pt,a4paper,twoside]{article}
\usepackage{amsfonts}
%\usepackage{MDIS}
%\addtolength{\oddsidemargin}{1.3cm}
%\addtolength{\evensidemargin}{-0.1cm}
\setlength{\topmargin}{-1.5cm}
%\setlength{\textheight}{20cm} \setlength{\textwidth}{13cm}
\setlength{\textheight}{8.875in} \setlength{\textwidth}{6.3in}
\setlength{\oddsidemargin}{.077in}
\setlength{\evensidemargin}{.077in} \thispagestyle{empty}
\pagestyle{empty}
\begin{document}
%\tableofcontents
\def\SHORTTITLE {Complex polynomial SVM kernel}%
\vspace*{3cm}
%\antet
\markboth{\hfill Dana Simian}{\hfill \SHORTTITLE}
\begin{center}
{\Large \bf A model for a complex polynomial SVM kernel}
\par\vspace*{0.5cm}
{\bf Dana Simian}
\end{center}
\vspace*{1cm}
\tolerance 10000
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{definition}{Definition}
\newtheorem{example}{Example}
\newtheorem{xca}{Exercise}
\newtheorem{remark}{Remark}
\newtheorem{proposition}{Proposition}
\newtheorem{corollary}{Corollary}
%--------------------------------------
\begin{abstract}
The aim of this paper is to present a model for a complex
polynomial SVM kernel. We briefly review Support Vector Machines
for binary classification and the kernel trick, and then build a
multiple kernel obtained by combining simple polynomial kernels
with two parameters, the degree $d$ and the coefficient $r$. We
also describe a chromosome representation of the multiple kernel
that allows its parameters to be optimized.
\end{abstract}
\section{Introduction}
\label{s1} Support Vector Machines (SVMs) represent a class of
learning machines introduced by Vapnik (\cite{v1}). We give a
brief presentation of SVMs for binary classification in section
\ref{s2}; we limit the discussion to the binary case, since many
methods exist for generalizing a binary classifier to several
classes. In recent years, SVMs have become a very popular tool for
machine learning tasks and have been successfully applied to
classification, regression and novelty detection, in fields such
as particle identification, face identification, text
categorization, bioinformatics and database marketing.\\
\hspace*{0.6cm}Generally, the task of classification is to find a
rule which, based on external observations, assigns an object to
one of several classes. A classification task supposes the
existence of training and testing data given in the form of data
instances. Each instance in the training set contains one target
value, named the class label, and several attributes, named
features. The goal of an SVM is to produce a model which predicts
the target value of the data instances in the testing set, for
which only the attributes are given. SVM solutions are
characterized by a convex optimization problem. If the data is
separable we obtain an optimal separating hyperplane with a
maximal margin (\cite{c1}). In order to avoid the difficulties
raised by non-separable data, the idea of kernel substitution is
used. Kernel methods transform the data into a feature space $F$
that usually has a huge dimension (\cite{v1}). With a suitable
choice of kernel, the data can become separable in the feature
space despite being non-separable by a hyperplane in the original
input space. The basic properties of a kernel function are derived
from Mercer's theorem (\cite{c1}). Under certain conditions,
kernel functions can be interpreted as representing the inner
product of data objects implicitly mapped into a nonlinear feature
space. The ``kernel trick'' is to calculate this inner product in
the feature space without knowing the mapping function explicitly.
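As a small numerical illustration of the kernel trick (a sketch, not taken from the paper): for the polynomial kernel $K(x,y)=(\langle x,y\rangle+r)^d$ with $d=2$ and $r=0$ on $\mathbb{R}^2$, the kernel value equals an ordinary inner product after the explicit feature map $\varphi(x)=(x_1^2,\sqrt{2}\,x_1x_2,x_2^2)$, so the map never has to be applied in practice.

```python
import math

def poly_kernel(x, y, d=2, r=0.0):
    # Polynomial kernel K(x, y) = (<x, y> + r)^d
    return (sum(a * b for a, b in zip(x, y)) + r) ** d

def phi(x):
    # Explicit feature map for d = 2, r = 0 on R^2:
    # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

x, y = (1.0, 2.0), (3.0, -1.0)
lhs = poly_kernel(x, y)                                  # (1*3 + 2*(-1))^2 = 1
rhs = sum(a * b for a, b in zip(phi(x), phi(y)))         # same value via phi
print(lhs, rhs)
```

The kernel evaluation costs one inner product in $\mathbb{R}^2$, while the explicit map works in a three-dimensional space; for large $d$ and input dimension the gap becomes enormous, which is the point of the trick.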
\section{Support Vector Machines and kernels}
\label{s2} \hspace*{0.6cm}An SVM algorithm can solve the problem of
binary or multiclass classification. Many real-life data sets
involve multiclass classification. In this section we consider only
the problem of binary classification, because there are many known
methods to generalize a binary classifier to an $n$-class
classifier (\cite{c1}, \cite{v1}).\\
\hspace*{0.6cm}Let the data points $x_i\in \mathbb{R}^d$, $i\in
\{1,\ldots,m\}$, and their labels $y_i\in \{-1,1\}$ be given. We
look for a function $f$ which associates to each input data $x$ its
correct label $y=f(x)$. This function is named the decision
function and represents a hyperplane which divides the input data
into two regions:
\begin{equation}
\label{e1} f(x)=\mathrm{sign} (\langle w,x\rangle +b),
\end{equation}
where $w=\sum_{i=1}^m \alpha_i y_i x_i$. If the data set is
separable, then the conditions for classification without training
error are $y_i(\langle w,x_i\rangle +b)>0$. To maximize the
margin, the task is
\[\min_{w,b} \left( \frac{1}{2}\| w\|^2 \right),\]
subject to the constraints
\[y_i(\langle w,x_i\rangle +b)\ge 1,\ \forall i\in\{1,\ldots,m\}.\]
The Wolfe dual problem requires the maximization with respect to
$\alpha_i$, of the function \begin{equation}\label{e3}
W(\alpha)=\sum_{i=1}^m \alpha_i -\frac{1}{2}\sum_{i,j=1}^m
\alpha_i\alpha_jy_iy_j\langle x_i,x_j\rangle
\end{equation}
subject to the constraints
\begin{equation}
\label{e4} \alpha_i\ge 0,\quad \sum_{i=1}^m \alpha_i y_i=0,
\end{equation}
where the $\alpha_i$ are Lagrange multipliers.
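For completeness (this is the standard kernel substitution, not a result specific to this paper): replacing the inner product by a kernel function $K$ turns the dual objective (\ref{e3}) and the decision function (\ref{e1}) into

```latex
\[
W(\alpha)=\sum_{i=1}^m \alpha_i
-\frac{1}{2}\sum_{i,j=1}^m \alpha_i\alpha_j y_i y_j K(x_i,x_j),
\qquad
f(x)=\mathrm{sign}\Big(\sum_{i=1}^m \alpha_i y_i K(x_i,x)+b\Big),
\]
```

so the mapping into the feature space never has to be computed explicitly.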
\section{Main results. Our model}
\label{s4}
\hspace*{0.6cm}We want to build and analyze a multiple kernel,
starting from the simple polynomial kernel, with two parameters:
the degree $d$ and the coefficient $r$.
\subsection{Representation of the multiple kernel}
Our chromosome is composed of 78 genes: 2 genes for each operation,
2 genes for each kernel's type, 4 genes for the degree parameter $d_i$
and 12 genes for the coefficient $r_i$. If the associated kernel is not polynomial,
these last 16 genes are used to represent the real value of the parameter $\gamma_i$, in place
of $d_i$ and $r_i$.\\
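As an illustration only: the gene counts above are consistent with four simple kernels combined by three binary operations ($3\times 2 + 4\times(2+4+12)=78$ genes). The grouping, gene order, value ranges and the decoding below are assumptions made for this sketch, not the paper's specification.

```python
# Hypothetical decoder for a 78-gene binary chromosome, assuming:
# 3 operations x 2 genes, then 4 kernel groups, each with 2 genes
# (kernel type), 4 genes (degree d_i) and 12 genes (coefficient r_i);
# for a non-polynomial kernel the same 16 genes encode gamma_i instead.

def bits_to_int(bits):
    # Interpret a list of 0/1 genes as an unsigned integer.
    return int("".join(str(b) for b in bits), 2)

def decode(chromosome):
    assert len(chromosome) == 78
    ops = [bits_to_int(chromosome[2 * i:2 * i + 2]) for i in range(3)]
    kernels = []
    pos = 6
    for _ in range(4):
        ktype = bits_to_int(chromosome[pos:pos + 2])
        if ktype == 0:  # assume type code 0 means polynomial
            d = bits_to_int(chromosome[pos + 2:pos + 6])           # 4 genes
            r = bits_to_int(chromosome[pos + 6:pos + 18]) / 4096   # 12 genes -> [0, 1)
            kernels.append(("poly", d, r))
        else:
            gamma = bits_to_int(chromosome[pos + 2:pos + 18]) / 65536  # 16 genes
            kernels.append(("other", gamma))
        pos += 18
    return ops, kernels

genome = [0] * 78
ops, kernels = decode(genome)
print(ops, kernels)
```

A fixed-length binary encoding like this keeps standard crossover and mutation operators applicable, at the price of quantizing the real-valued parameters.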
{\bf Acknowledgement:} This paper was funded by the research grant
CNCSIS 33/2007.
\begin{thebibliography}{99}
\bibitem{c1}
C. Campbell, {\em An Introduction to Kernel Methods}, in: Radial Basis Function Networks: Design and Applications,
Springer Verlag, Berlin, 2000. \vspace{-7pt}
\bibitem{d1}
L. Diosan, M. Oltean, A. Rogozan, J. P. Pecuchet, Improving SVM
performance using a linear combination of kernels, {\em Adaptive
and Natural Computing Algorithms, ICANNGA'07}, volume 4432 of LNCS,
2007, 218--227. \vspace{-7pt}
\bibitem{v1}
V. Vapnik, {\em The Nature of Statistical Learning Theory}, Springer,
1995.
\end{thebibliography}
\vspace*{1cm} {\footnotesize
\begin{tabular*}{16cm}{p{8.2cm}p{8.2cm}}
Dana Simian& \\
University Lucian Blaga of Sibiu & \\
Faculty of Sciences & \\
5-7 I. Ratiu str., Sibiu, 550021 & \\
ROMANIA & \\
E-mail: \ {\it dana.simian@ulbsibiu.ro}&
\end{tabular*}}
\end{document}