What kind of dataset do you require?
You can search online for datasets; there are many available, such as the collection at http://www.cs.umb.edu/~smimarog/textmining/datasets/index.html.
Several tools can help you with clustering and classification, such as Weka, RapidMiner and HeuristicLab. To use one of them, read up on the input format it accepts and then prepare your input file in that format. For instance, Weka accepts input in formats such as CSV and ARFF. You will need to study the format and work from there. I hope this helps. If you need further help, just write to me.
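As a rough illustration of the input-file step, here is a minimal Python sketch that builds a document-by-term matrix of raw term counts from plain text and formats it as ARFF for Weka. The function names (tokenize, document_term_matrix, to_arff), the relation name and the two sample sentences are my own assumptions for the example, not part of any tool, and the tokenizer is deliberately naive (lowercase alphabetic words only, no stop-word removal or stemming).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, rows); rows[i][j] is the count of term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


def to_arff(vocabulary, rows, relation="documents"):
    """Format the matrix as ARFF text with one numeric attribute per term.

    Tokens here contain only letters, so attribute names need no quoting;
    names with spaces or special characters would have to be quoted.
    """
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {term} numeric" for term in vocabulary]
    lines += ["", "@data"]
    lines += [",".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"


# Hypothetical sample documents; replace with your own plain text.
docs = [
    "Text clustering groups similar text.",
    "Classification assigns labels to text.",
]
vocabulary, rows = document_term_matrix(docs)
arff_text = to_arff(vocabulary, rows)

# To save the result for Weka, write arff_text to a file ending in .arff.
print(arff_text)
```

For a CSV file instead, the same rows can be written with a header line containing the vocabulary. Raw counts are only one possible weighting; TF-IDF or binary presence may suit some algorithms better.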
From: Musstanser Tinauli <musstanser@gmail.com>
To: "pakgrid@yahoogroups.com" <pakgrid@yahoogroups.com>
Sent: Wednesday, 5 September 2012, 17:51
Subject: Re: [pakgrid] Help Needed regarding document by term matrix
Hi, I am no expert, but perhaps you could use texts from newspapers? For one, I know Dawn is accessible online.
Best, Musstanser.
Sent from my iPhone.
Dear All,
I am an MS student doing research in text clustering and classification. Currently I need some kind of tool to build a real-time dataset from text so that I can test my algorithms. Could anyone kindly help me in this regard? I want to build a document-by-term matrix from plain text.
Thanks in advance.
Regards,
Asif