Reddit Web Scraper

Role / Services

Software Development

Credits

David Nwachukwu

Location & year

The United Kingdom © 2022

As a software engineer, I love working on solo projects that challenge me and help me acquire new skills. One of my latest projects involved web scraping Reddit for content and turning it into short videos. This project enabled me to use a wide range of technical skills, including web scraping, text-to-speech conversion, video editing, and more.

I wrote the project in Python, using the following libraries: ‘selenium’, ‘praw’, ‘pandas’, ‘gtts’, ‘moviepy’, ‘mutagen’, and ‘pillow’. Together, these libraries allowed me to automate web browsing, access the Reddit API, create text-to-speech audio files, edit videos, and manipulate images.

After importing the necessary libraries, I configured the Chrome browser through the ‘Options’ class. I also created a ‘Reddit’ object from the ‘praw’ library, which allowed me to access the Reddit API and retrieve the hot posts from any subreddit of my choosing.
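The setup described above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the project's exact code: the function names, the environment variable names (`REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USER_AGENT`), the window size, and the fields kept in each record are all hypothetical.

```python
import os

REDDIT_BASE_URL = "https://www.reddit.com"


def build_chrome_options(headless=True, width=1280, height=2000):
    """Return Chrome options for the Selenium driver (sizes are assumptions)."""
    # Imported lazily so the pure helper below works without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    if headless:
        options.add_argument("--headless")
    options.add_argument(f"--window-size={width},{height}")
    return options


def post_to_record(post):
    """Keep only the fields the later steps need from a praw submission."""
    return {
        "id": post.id,
        "title": post.title,
        "body": post.selftext,
        # praw exposes `permalink` as a site-relative path.
        "url": REDDIT_BASE_URL + post.permalink,
    }


def fetch_hot_posts(subreddit_name, limit=5):
    """Return the hot posts of a subreddit as plain dictionaries."""
    import praw

    # Credentials are read from hypothetical environment variables.
    reddit = praw.Reddit(
        client_id=os.environ["REDDIT_CLIENT_ID"],
        client_secret=os.environ["REDDIT_CLIENT_SECRET"],
        user_agent=os.environ.get("REDDIT_USER_AGENT", "reddit-video-scraper/0.1"),
    )
    hot_posts = reddit.subreddit(subreddit_name).hot(limit=limit)
    return [post_to_record(post) for post in hot_posts]
```

With Selenium installed, the options object would typically be passed to the driver as `webdriver.Chrome(options=build_chrome_options())`.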

I used the ‘selenium’ library to automate web browsing and interact with the Reddit website. I first navigated to the desired subreddit and retrieved the hot posts using the ‘Reddit’ object. Then, for each post, I opened the post's link in a new browser window and saved a screenshot of the page. I used the ‘PIL’ library (the import name of the ‘pillow’ package) to crop the screenshot to the desired dimensions and save it as an image file.
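A minimal sketch of the screenshot-and-crop step might look like this. The helper names, the file paths, and the crop policy (horizontally centered, anchored at the top of the page, default size 1080 by 1500) are assumptions chosen for illustration, since the actual dimensions are not given here.

```python
def top_center_crop_box(image_size, target_size):
    """Return a (left, top, right, bottom) box, centered horizontally and
    anchored at the top, clamped so it never exceeds the image."""
    width, height = image_size
    target_w, target_h = target_size
    left = max((width - target_w) // 2, 0)
    right = min(left + target_w, width)
    bottom = min(target_h, height)
    return (left, 0, right, bottom)


def screenshot_post(driver, url, raw_path):
    """Open a post in a new browser window, screenshot it, then return
    to the original window."""
    original = driver.current_window_handle
    driver.switch_to.new_window("window")  # Selenium 4 API
    try:
        driver.get(url)
        driver.save_screenshot(raw_path)
    finally:
        driver.close()
        driver.switch_to.window(original)


def crop_screenshot(raw_path, cropped_path, target_size=(1080, 1500)):
    """Crop the saved screenshot with Pillow and write it to a new file."""
    # Imported lazily so the box calculation above has no dependencies.
    from PIL import Image

    with Image.open(raw_path) as image:
        box = top_center_crop_box(image.size, target_size)
        image.crop(box).save(cropped_path)
    return box
```

Keeping the box calculation separate from the browser and image calls makes the crop policy easy to adjust and check on its own.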

Next, I used the ‘gtts’ library to convert the post's title and body text into an English audio file. I then used the ‘mutagen’ library to determine the length of the audio file in seconds.
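The narration step can be sketched as below. The function names and the way the title and body are joined are assumptions for illustration; the calls to `gTTS(...).save(...)` and `mutagen.mp3.MP3(...).info.length` are the standard ways to write an MP3 and read its duration with those libraries.

```python
def build_narration_text(title, body=""):
    """Join title and body into one string to be read aloud."""
    title = title.strip()
    body = (body or "").strip()
    if not body:
        return title
    # Add a full stop only when the title does not already end a sentence.
    separator = " " if title.endswith((".", "!", "?")) else ". "
    return f"{title}{separator}{body}"


def synthesize_speech(text, audio_path):
    """Write an English MP3 narration of `text` to `audio_path`."""
    # Imported lazily; gTTS needs network access when saving.
    from gtts import gTTS

    gTTS(text=text, lang="en").save(audio_path)


def audio_length_seconds(audio_path):
    """Return the duration of an MP3 file in seconds (a float)."""
    from mutagen.mp3 import MP3

    return MP3(audio_path).info.length
```

The duration returned by `audio_length_seconds` is what a later step would use to decide how long the screenshot stays on screen.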

Finally, I used the ‘moviepy’ library to create a short video clip by combining the audio and image files. I used the ‘VideoFileClip’ and ‘AudioFileClip’ classes to load the video and audio files, respectively, and the ‘ImageClip’ class to create an image clip from the cropped image file. I then used the ‘CompositeVideoClip’ class to combine the image clip with the audio clip and produce the final video.
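A rough sketch of the assembly step is shown below, assuming the moviepy 1.x API (`moviepy.editor`, `set_duration`, `set_audio`), which was current in 2022; moviepy 2.x renamed these methods. The function name, the `duration` parameter, and the frame rate are hypothetical, and the sketch covers only the image-plus-audio case, not any additional video file.

```python
def build_post_video(image_path, audio_path, output_path, duration, fps=24):
    """Show one image for `duration` seconds with the narration as audio.

    `duration` is expected to be the audio length in seconds, for example
    the value read with mutagen.
    """
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")

    # Imported lazily; this import path is specific to moviepy 1.x.
    from moviepy.editor import AudioFileClip, CompositeVideoClip, ImageClip

    audio = AudioFileClip(audio_path)
    # A still image has no inherent length, so it must be given one.
    image = ImageClip(image_path).set_duration(duration)
    # CompositeVideoClip layers the visual clips; the audio is attached to it.
    video = CompositeVideoClip([image]).set_audio(audio)
    try:
        video.write_videofile(output_path, fps=fps)
    finally:
        audio.close()
        video.close()
    return output_path
```

A call such as `build_post_video("post.png", "post.mp3", "post.mp4", duration=12.4)` would then produce one clip per post; the file names here are placeholders.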

One of the biggest challenges I faced during this project was dealing with the Reddit login process. To overcome it, I used the ‘selenium’ library to automate the login and store the session cookies. Reusing those cookies allowed me to skip the login step on later runs and scrape the content without any issues.
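One way to persist and reuse session cookies with Selenium is sketched here. The function names, the JSON file format, and the expiry filtering are assumptions for illustration, not necessarily how the project stored its cookies; `get_cookies` and `add_cookie` are the standard WebDriver methods.

```python
import json
import time
from pathlib import Path


def save_cookies(driver, path):
    """Write the driver's current cookies to a JSON file after logging in."""
    Path(path).write_text(json.dumps(driver.get_cookies()))


def load_cookies(driver, path, now=None):
    """Add saved, unexpired cookies to the driver; return how many were added.

    The browser must already be on the cookie's domain, for example after
    `driver.get("https://www.reddit.com")`, or Selenium rejects the cookie.
    """
    now = time.time() if now is None else now
    restored = 0
    for cookie in json.loads(Path(path).read_text()):
        expiry = cookie.get("expiry")
        if expiry is not None and expiry <= now:
            continue  # skip cookies that have already expired
        driver.add_cookie(cookie)
        restored += 1
    return restored
```

After `load_cookies` runs, refreshing the page lets the site pick up the restored session.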

In conclusion, this project allowed me to acquire a wide range of technical skills, from web scraping to text-to-speech conversion and video editing. It also gave me the opportunity to work on a solo project and improve my problem-solving and programming skills.

