Reddit post dataset. By performing data cleaning, exploratory data analysis (EDA), and visualization, the project aims to provide actionable insights into Reddit user behavior and community interactions.

It includes 100 recent posts, all comments (including sub-comments) on those posts, user details for authors involved in the discussion, and additional posts by those users.

Dataset Structure
Data Instances
A data point is a post or a comment.

Reddit posts collected from nineteen top subreddits.

In our time working with companies in the machine learning field, we at iMerit have found many datasets shared on Reddit to be tremendously useful when training a machine learning model.

Dataset containing Reddit posts and comments from various subreddits.

The data were identified from a large corpus using a multi-stage filtering pipeline combining keyword retrieval, LLM-based validation, and human ...

Dataset Description: The ConvoKit Subreddit Corpus is a collection of user comments from various subreddits on Reddit, gathered over time to facilitate research in conversational analysis and sociolinguistics.

This project explores a dataset of Reddit posts to uncover insights into user engagement, popular topics, and trends across various subreddits.
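
The layout described above (a data point is a post or a comment, and comments can carry sub-comments) could be modeled roughly as follows. This is only a minimal sketch using the Python standard library: the class names `Post` and `Comment`, their fields, the helper functions, and the sample records are all hypothetical and do not come from any of the datasets mentioned. It shows one simple engagement measure, the number of comments at any depth per subreddit.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Comment:
    # Hypothetical fields; real datasets will name these differently.
    author: str
    body: str
    replies: List["Comment"] = field(default_factory=list)  # sub-comments


@dataclass
class Post:
    # Hypothetical fields; real datasets will name these differently.
    subreddit: str
    author: str
    title: str
    comments: List[Comment] = field(default_factory=list)


def walk_comments(comments: List[Comment]) -> Iterator[Comment]:
    """Yield every comment depth-first, including sub-comments."""
    for comment in comments:
        yield comment
        yield from walk_comments(comment.replies)


def comments_per_subreddit(posts: List[Post]) -> Counter:
    """Count all comments (at any depth) under the posts of each subreddit."""
    counts: Counter = Counter()
    for post in posts:
        counts[post.subreddit] += sum(1 for _ in walk_comments(post.comments))
    return counts


# Made-up sample records, for illustration only.
sample_posts = [
    Post("python", "alice", "Typing tips", [
        Comment("bob", "Nice", [Comment("alice", "Thanks")]),
        Comment("carol", "+1"),
    ]),
    Post("datasets", "dave", "New corpus", [Comment("erin", "Link?")]),
    Post("python", "frank", "No replies yet"),
]

engagement = comments_per_subreddit(sample_posts)
print(engagement.most_common())
```

Keeping replies nested inside each comment mirrors the thread structure, while `walk_comments` flattens it on demand, so per-subreddit or per-author counts can be computed without storing a second, flat copy of the data.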