Some links might not be
active. This may be because the document has not been archived yet,
is under revision, cannot be disclosed, or
is available only in hardcopy form. Please notify us if
you need something that appears to be unavailable. Please note that
"CEID" stands for "Computer Engineering and
Informatics Department" and
"TR" for "Technical Report". Some reports are
available from http://arxiv.org.
Finally, please note that copyright restrictions on some of these
publications might limit downloads to users from non-profit
and educational institutions. Information is also available from Google Scholar.
Some software from our group
TMG: A toolbox for text mining (TM) tasks, specifically
i) indexing, ii) retrieval, iii) dimensionality reduction, iv) clustering,
and v) classification. Most of TMG is written in MATLAB, though a large
segment of the indexing phase is written in Perl. TMG is
especially suited for TM applications where the data is
high-dimensional but extremely sparse, as it uses the sparse
matrix infrastructure of MATLAB. TMG was initially built as a preprocessing
tool for creating term-document matrices (TDMs) from
unstructured text, and it was reportedly used with success by
several researchers and instructors. The new
version of TMG (May '07) offers a much wider range of tools.
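To illustrate what a sparse term-document matrix looks like, here is a minimal sketch in Python rather than MATLAB. It is not TMG code: the function name `build_tdm`, the whitespace tokenizer, the stopword handling, and the dictionary-of-nonzeros storage are all assumptions chosen for brevity.

```python
from collections import Counter


def build_tdm(docs, stopwords=frozenset()):
    """Build a sparse term-document matrix of raw term counts.

    Hypothetical helper, not part of TMG. Returns (terms, tdm), where
    tdm maps (term_index, doc_index) to a count and stores only nonzeros.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [
        [w for w in doc.lower().split() if w not in stopwords]
        for doc in docs
    ]
    # Rows of the matrix are the sorted distinct terms.
    terms = sorted({w for tokens in tokenized for w in tokens})
    row_of = {term: i for i, term in enumerate(terms)}

    tdm = {}
    for j, tokens in enumerate(tokenized):
        for term, count in Counter(tokens).items():
            tdm[(row_of[term], j)] = count
    return terms, tdm


docs = [
    "sparse matrix methods",
    "text mining with sparse data",
    "matrix factorization",
]
terms, tdm = build_tdm(docs, stopwords={"with"})
```

Only the nonzero entries are stored, which is the property that makes sparse storage attractive when the vocabulary is large and each document uses a small part of it.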
Jylab: A portable and
flexible scientific computing environment that runs on any platform
providing a recent JVM and enables the development of scientific
applications over distributed computing platforms. Jylab
conveniently packages Jython (<http://www.jython.org/>),
for flexible Python language scripting, with a core set of open source
libraries implementing numerical linear algebra (NLA) routines and
communication models. Recently, a package was implemented in Jylab
that enables access to and use of the Grid infrastructure.
NNDSVD: MATLAB functions to initialize approximate nonnegative
matrix factorization algorithms. The basic algorithm contains no
randomization and is based on approximations of positive sections of
the partial SVD factors of the data matrix, utilizing an algebraic
property of unit rank matrices. The method is also suitable when
seeking sparse factors. The approximants furnished by NNDSVD appear to
lead to much faster error reduction than random initialization,
though the eventual error is of similar quality.
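The unit rank property can be made concrete. For a singular triplet (sigma, u, v), the positive section of u v^T equals u+ v+^T + u- v-^T, where u+ and u- are the positive and negative parts of u. The sketch below is in Python, not the distributed MATLAB code, and assumes the singular triplet has already been computed. It keeps the dominant of the two terms to form one nonnegative column/row pair. The function name and the handling of the degenerate all-zero case are assumptions.

```python
import math


def positive_section_factor(sigma, u, v):
    """Nonnegative pair (w, h) from one singular triplet (sigma, u, v).

    Hypothetical helper. The outer product w h^T equals sigma times the
    dominant term of the positive section of u v^T.
    """
    def norm(z):
        return math.sqrt(sum(x * x for x in z))

    # Split each singular vector into its positive and negative parts.
    up = [max(x, 0.0) for x in u]
    un = [max(-x, 0.0) for x in u]
    vp = [max(x, 0.0) for x in v]
    vn = [max(-x, 0.0) for x in v]

    # Weights of the two rank-one terms u+ v+^T and u- v-^T.
    mp = norm(up) * norm(vp)
    mn = norm(un) * norm(vn)

    if mp >= mn:
        a, b, m = up, vp, mp
    else:
        a, b, m = un, vn, mn

    if m == 0.0:
        # Degenerate case: the positive section of u v^T is zero.
        return [0.0] * len(u), [0.0] * len(v)

    scale = math.sqrt(sigma * m)
    na, nb = norm(a), norm(b)
    w = [scale * x / na for x in a]
    h = [scale * x / nb for x in b]
    return w, h


w, h = positive_section_factor(2.0, [0.6, -0.8], [0.6, -0.8])
```

In this example the negative parts dominate, so `w` and `h` are supported on the second coordinate only. A full initialization would repeat this step for each retained singular triplet. The step involves no random numbers, which matches the description above.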
IRLANB: MATLAB functions to compute the smallest singular triplets
of large sparse matrices in a matrix-free manner. The algorithms used are
based on Lanczos bidiagonalization, implicit restarting,
harmonic Ritz values, deflation, and refinement. The method has been
used with success in applications such as the computation of matrix
pseudospectra and clustering.
Selected publications / reports / presentations