Data science is all about finding interesting insights and stories in data, so data is crucial to everything you do. The success of your data science project depends largely on the quality of your dataset. That’s why in this article I am going to share free public datasets for your data science project.
Thankfully, there are various online data sources where you can get free, open datasets. You just need to download them and use them in your project.
So without further ado, let’s get started.
Free Public Datasets for Your Data Science Project
1. Data.gov
Data.gov is the US government’s open data repository, which you can use for research and for data science projects such as data visualizations, mobile applications, etc. You can use some of the datasets directly, without even registering on the site, but some datasets require you to accept a licensing agreement before downloading. You can search for a specific dataset or browse datasets in the topic section.
2. Google Cloud Public Datasets
Google has a cloud hosting service named Google Cloud Platform (GCP), where you can explore public datasets using BigQuery. To uncover patterns and insights more easily, you can create data visualizations and interactive dashboards with Google Data Studio (now called Looker Studio).
You just need to create an account on GCP to access the data. The data providers on GCP include GitHub, the United States Census Bureau, NASA, Bitcoin, the US Department of Transportation, etc.
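To give a feel for what exploring a public dataset with BigQuery can look like from Python, here is a minimal sketch. It assumes that the `google-cloud-bigquery` package is installed, that your GCP credentials are already set up, and that the public table `bigquery-public-data.usa_names.usa_1910_2013` is visible to your project. The helper names `build_top_names_query` and `run_query` are made up for this illustration.

```python
def build_top_names_query(limit=10):
    """Build a SQL query for the most common names in a public table.

    The table name is an assumption; swap in any public table you
    find in the BigQuery console.
    """
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total "
        "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
        "GROUP BY name "
        "ORDER BY total DESC "
        f"LIMIT {limit}"
    )


def run_query(sql):
    """Run a query and return the rows as a list of dicts.

    Needs the google-cloud-bigquery package and valid GCP credentials.
    """
    # Imported here so the query builder works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]
```

Calling `run_query(build_top_names_query(5))` should then return five rows, each with a `name` and a `total` key, provided the table is still published under that name.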
3. Kaggle
Kaggle is one of the best-known platforms for data science, and you can download approximately 68,000 public datasets from it for free. You need to create an account, and then you can look for any specific dataset using the search bar. You can also contribute your own datasets, and other community members can vote on them and run Kernels/scripts on them.
On Kaggle, you can also take part in various competitions and download the competition datasets.
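If you prefer to download from a script rather than the browser, the following sketch wraps the Kaggle command-line tool. It assumes you have installed the `kaggle` package and placed your API token in `~/.kaggle/kaggle.json`; the function names and the `data` folder are just illustrative choices.

```python
import subprocess


def kaggle_download_command(dataset_slug, dest_dir="data"):
    """Return the Kaggle CLI command that downloads and unzips a dataset.

    dataset_slug looks like "<owner>/<dataset-name>", as shown in the
    dataset's URL on Kaggle.
    """
    if dataset_slug.count("/") != 1:
        raise ValueError("expected a slug of the form '<owner>/<dataset-name>'")
    return ["kaggle", "datasets", "download", "-d", dataset_slug,
            "-p", dest_dir, "--unzip"]


def download_dataset(dataset_slug, dest_dir="data"):
    """Run the download; raises CalledProcessError if the CLI fails."""
    subprocess.run(kaggle_download_command(dataset_slug, dest_dir), check=True)
```

Competition data works in a similar way with `kaggle competitions download -c <competition-name>`, once you have accepted the competition rules on the site.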
4. UCI Machine Learning Repository
The UCI Repository has public datasets available for machine learning and data science. The best thing about the UCI Repository is that datasets are tagged with categories such as classification, regression, recommender systems, etc.
These categories make it easier to find what you need. The datasets on UCI are contributed by various people. So if you are a machine learning practitioner, you should check out the UCI Machine Learning Repository.
You don’t need to register on the site. You can download the datasets directly from the UCI Machine Learning Repository.
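Because no login is needed, a UCI file can be read straight from its URL. The sketch below uses the classic Iris dataset as an example. The URL and the column names are assumptions (the file itself has no header row), so check the dataset’s page for the current download link.

```python
import csv
import io
import urllib.request

# Assumed location of the Iris file; verify it on the dataset's UCI page.
IRIS_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# Column names chosen for this example; the file has no header row.
IRIS_COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def parse_iris(text):
    """Parse headerless Iris CSV text into a list of dicts."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines, e.g. at the end of the file
            continue
        *measurements, species = record
        values = [float(m) for m in measurements] + [species]
        rows.append(dict(zip(IRIS_COLUMNS, values)))
    return rows


def fetch_iris(url=IRIS_URL):
    """Download the file and parse it (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return parse_iris(response.read().decode("utf-8"))
```

Keeping the parsing separate from the download means you can try `parse_iris` on a few pasted lines before touching the network.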
5. AWS Public Datasets
Amazon has a huge number of datasets available in its open data registry. You can easily download the datasets and use them for your project. You can also analyze the data on the Amazon Elastic Compute Cloud (Amazon EC2).
The full Enron email dataset, Google Books n-grams, NASA NEX datasets, and the Million Song Dataset are some of the popular datasets available on Amazon.
6. Quandl
Quandl (now part of Nasdaq Data Link) is a popular platform for financial, economic, and alternative data. Some of the datasets are free, but some require purchase. You can use Quandl datasets for stock price prediction or for forecasting economic indicators.
Stock exchange data from India is freely available on Quandl. If you search carefully, you will find some good free datasets.
7. The World Bank
The World Bank is a global development organization that provides open datasets. On the World Bank site, you will find several dataset resources such as DataBank, the Open Data Catalog, the Microdata Library, etc.
You can also find datasets by region and country.
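Country-level data can also be pulled programmatically. The sketch below targets the World Bank’s public Indicators API and assumes its usual JSON layout of a two-element list (metadata first, then the records) and an annual indicator, so that each record’s `date` is a plain year. The function names are illustrative, and `SP.POP.TOTL` (total population) is only an example indicator code.

```python
import json
import urllib.request

API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(country_code, indicator, per_page=100):
    """Build the request URL for one country and one indicator."""
    return (f"{API_ROOT}/country/{country_code}/indicator/{indicator}"
            f"?format=json&per_page={per_page}")


def parse_indicator(payload):
    """Turn a response into {year: value}, skipping missing values.

    Assumes the layout [metadata, list of records]; anything else
    (for example an error message) yields an empty dict.
    """
    if len(payload) < 2 or payload[1] is None:
        return {}
    return {int(r["date"]): r["value"]
            for r in payload[1] if r["value"] is not None}


def fetch_indicator(country_code, indicator):
    """Download and parse one indicator series (needs network access)."""
    with urllib.request.urlopen(indicator_url(country_code, indicator)) as resp:
        return parse_indicator(json.load(resp))
```

For example, `fetch_indicator("IND", "SP.POP.TOTL")` should give India’s population by year, assuming the API still responds in this shape.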
8. Indian Government OpenDataset
data.gov.in is a website run by the Government of India, where you can find free datasets from various sectors such as climate, health care, transport, education, the economy, etc.
9. Earthdata
Earthdata is run by NASA and provides datasets related to the Earth and space. So if you are looking for that kind of data, Earthdata is the perfect place for you. You will find datasets covering the Earth’s atmosphere, oceans, cryosphere, geomagnetism, and solar flares.
Earthdata is organized into sections such as find data, use data, visualize data, etc.
10. Awesome Public Datasets
This is a GitHub repository that lists datasets from various domains such as Agriculture, Biology, Climate & Weather, Complex Networks, Computer Networks, Economics, Education, Finance, etc. Most of the datasets are free.
And here the list ends. So these are 10 free public datasets for your data science project. I would suggest you bookmark this article for future reference. Now it’s time to wrap up.
Conclusion
In this article, I tried to cover the 10 Free Public Datasets for Your Data Science Project. If you have any doubts or questions, feel free to ask me in the comment section.
All the Best!
Enjoy Learning!
Thank YOU!
Thought of the Day…
‘It’s what you learn after you know it all that counts.’
– John Wooden
Written By Aqsa Zafar
Founder of MLTUT, Machine Learning Ph.D. scholar at Dayananda Sagar University. Research on social media depression detection. Create tutorials on ML and data science for diverse applications. Passionate about sharing knowledge through website and social media.