Some big and small datasets available on the Internet

A non-exhaustive list of big and small datasets, or of sites that carry further lists of datasets, is given below. Most of the datasets are free.

1. Data and Story Library: DASL (pronounced “dazzle”) is an online library of datafiles and stories that illustrate the use of basic statistics methods. The authors hope to provide data from a wide variety of topics so that statistics teachers can find real-world examples that will be interesting to their students. DASL’s powerful search engine can be used to locate the story or datafile of interest.
2. Infochimps NASDAQ: Data sets for NASDAQ and other stock exchanges from 1970-2010. Each data set contains open, high, low and volume values.
3. Time Series Data Library: The Time Series Data Library was created by Rob Hyndman, Professor of Statistics at Monash University, Australia.
4. DataMarket: Find, understand and share data. An open data portal with thousands of datasets from leading global providers. You can upload your own data, and it is free.
5. Sean Lahman: A team of researchers has integrated baseball playing statistics from the 2012 season. The updated version contains complete batting and pitching statistics back to 1871, plus fielding statistics, standings, team stats, managerial records, post-season data, and more. The database can be used on any platform, but please be aware that it is not a standalone application. It is a database that requires Microsoft Access or some other relational database software to be useful.
6. Google Flu Trends: Each week, millions of users around the world search for health information online. As one might expect, there are more flu-related searches during flu season, more allergy-related searches during allergy season, and more sunburn-related searches during the summer. One can explore all of these phenomena using Google Insights for Search.
7. Infochimps Data Marketplace: Big datasets of all sorts, whether social, geographical or other kinds of data.
8. The Project Gutenberg Etext of Human Genome Project. A huge number of etexts can be used, for example, for word counts.
9. KDnuggets: Data Mining Community’s Top Resource. The site lists a number of data repositories.
10. San Francisco data: Contains varied data on crime and other subjects.
11. A good collection of data set packages for practicing data mining is listed here.
12. A good list of datasets on a number of subjects is here.
13. Indian Government open data portal.
14. World Bank data is here.
15. Financial Data sources: A list of financial data sources can be found here.
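
Item 8 mentions using etexts for word counts. A minimal sketch of such a count in Python is given below. It assumes the etext has already been downloaded as plain text; the function names `count_words` and `top_words`, the sample string, and the file name in the comment are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def count_words(text):
    """Return a Counter mapping each lowercase word to its frequency."""
    # Treat runs of letters, optionally joined by apostrophes, as words.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)


def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    return count_words(text).most_common(n)


if __name__ == "__main__":
    # Hypothetical stand-in for an etext. For a real file one might use:
    #   with open("etext.txt", encoding="utf-8") as f:
    #       sample = f.read()
    sample = "The data set is small. The data is free, and the site is open."
    for word, count in top_words(sample, 3):
        print(word, count)
```

The regular expression only recognises unaccented Latin letters, so texts in other languages or with accented characters would need a different pattern.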
