
Complete Tutorial On Twint: Twitter Scraping Without Twitter’s API



Web scraping lets us download data from websites to a local system. It is a form of data mining that retrieves content from online portals over the Hypertext Transfer Protocol (HTTP) so the data can be used as required. Many companies use it for data harvesting and for building search engine bots.

Python has a wide range of packages that help with web scraping, such as Beautiful Soup and Selenium, and libraries such as AutoScraper that automate parts of the process. Each of these exposes its own API for scraping data and storing it in a data frame on the local machine.

Twint is an open-source Python library for Twitter scraping: it extracts data from Twitter without using the Twitter API. A few features set Twint apart from other Twitter scraping tools:

  • The Twitter API returns only a user's most recent 3,200 tweets, whereas Twint has no such cap and can download almost all of them.
  • Easy to use and very fast.
  • No sign-in or sign-up is required to fetch data.

Twint can scrape tweets by parameters such as hashtags, usernames, and topics. It can even extract information such as phone numbers and email IDs from tweets.
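Twint's own matching logic is not shown in this article, so the following is only a rough sketch of the idea: simple regular expressions pull email addresses and phone-like digit runs out of tweet text. The patterns and the helper name extract_contacts are illustrative assumptions, not Twint's implementation.

```python
import re

# Deliberately simple patterns for illustration; real-world matching needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def extract_contacts(text):
    """Return the email addresses and phone-like strings found in text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": [m.strip() for m in PHONE_RE.findall(text)],
    }

# Made-up tweet text used as sample input.
sample = "Reach us at info@example.com or call +91 98765 43210 today"
contacts = extract_contacts(sample)
print(contacts)
```

Patterns this loose will also match things that are not contact details, such as long ID numbers, so any matches should be treated as candidates to review.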

In this article, we will explore Twint and the functionality it offers for scraping data from Twitter.

Implementation: 

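We will start by installing Twint with pip install twint. The nest_asyncio package imported in the next step can be installed the same way, with pip install nest_asyncio.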

  1. Importing required libraries

Since we will scrape Twitter with Twint, we import twint. We also import nest_asyncio, which lets Twint run inside a notebook, where an event loop is already running, without runtime errors. We apply nest_asyncio in this same step.

import twint

import nest_asyncio

# Patch asyncio so that it can be used inside a notebook's running event loop
nest_asyncio.apply()
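The reason for this patch is that notebooks such as Jupyter already run an asyncio event loop, and asyncio refuses to start a second one from inside it. The sketch below reproduces that error with the standard library alone. The names fetch and notebook_cell are hypothetical stand-ins for a scraping call and a notebook cell.

```python
import asyncio

async def fetch():
    """Hypothetical stand-in for an asynchronous scraping call."""
    return "done"

async def notebook_cell():
    """Simulates code running where an event loop is already active."""
    coro = fetch()
    try:
        # Starting a second loop from inside a running one is not allowed.
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # tidy up the coroutine that was never awaited
        return str(exc)
    return "no error"

message = asyncio.run(notebook_cell())
print(message)
```

Calling nest_asyncio.apply() patches asyncio so that this kind of nested call is permitted, which is why it is applied before running Twint in a notebook.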

  2. Configuring Twint

Before scraping, we need to create a Twint configuration object. We then set options on it and pass it to Twint whenever we run a task.

t = twint.Config()
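A configuration object keeps every attribute that has been set on it until that attribute is changed, so reusing one object for several scraping tasks can carry old settings into new ones. The sketch below illustrates this with FakeConfig, a hypothetical stand-in that is not part of Twint and only mimics a few of the fields used in this article. Whether a leftover setting affects a given Twint run depends on which attributes that run reads.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeConfig:
    """Hypothetical stand-in for twint.Config with a few of the fields used here."""
    Username: Optional[str] = None
    Search: Optional[str] = None
    Limit: Optional[int] = None

# Reusing one object: the Username from an earlier task is still set later on.
shared = FakeConfig()
shared.Username = "Analyticsindiam"
shared.Search = "analytics"
print(shared)

# A fresh object per task starts from defaults, so nothing carries over.
search_only = FakeConfig(Search="analytics", Limit=10)
print(search_only)
```

Creating a new configuration object for each task, or resetting attributes explicitly, is one way to keep tasks independent.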

Now let us start scraping different types of data from twitter.

  3. Scraping Data
    1. Followers on Twitter

Here we will see how to download the names of a particular user's followers by using their username. I am using my own Twitter username.

t.Username = "Himansh70809561"

twint.run.Followers(t)

Followers

The output is a list of my followers on Twitter, because I used my own username. In the same way, you can pass any other username and download that user's followers.

    2. Storing info to Dataframe

We can also store the information in a data frame. Let us see how to store the followers' details in one.

t.Limit = 30

t.Username = 'Analyticsindiam'

t.Pandas = True

twint.run.Followers(t)

follow_df = twint.storage.panda.User_df

Followers in Dataframe

Here the first 30 followers are stored in a data frame. The Limit value can be set to any desired number of followers.

    3. Extracting tweets with a particular word

Here we will extract tweets that contain a particular word, which we define in the Search option.

t.Search = "analytics"

t.Store_object = True

t.Limit = 10

twint.run.Search(t)

# With Store_object enabled, Twint collects the tweet objects in twint.output.tweets_list
tlist = twint.output.tweets_list

print(tlist)

Tweets with particular word

The output contains tweets from different users, each shown with the username, the tweet text, and the date on which the tweet was published.
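One detail worth knowing when reading the output: Twint's README describes the limit as working in increments of 20, so a small Limit such as 10 may still return a full batch. The sketch below is a minimal illustration of rounding up to a batch and trimming the result afterwards. The batch size of 20 is an assumption taken from that README note, and expected_fetch_count and trim are hypothetical helper names.

```python
import math

BATCH_SIZE = 20  # assumption: based on the README's "increments of 20" note

def expected_fetch_count(limit, batch_size=BATCH_SIZE):
    """Round limit up to the next whole batch."""
    return math.ceil(limit / batch_size) * batch_size

def trim(results, limit):
    """Keep at most `limit` items from a list of results."""
    return results[:limit]

# Stand-in data in place of real tweet objects.
fetched = [f"tweet {i}" for i in range(expected_fetch_count(10))]
wanted = trim(fetched, 10)
print(len(fetched), len(wanted))
```

Slicing the collected list in this way gives exactly the number of tweets wanted, regardless of how many were fetched.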

    4. Tweets of a particular User

We can also extract the tweets of a particular user by putting their username in the search query.

t.Search = "from:@Analyticsindiam"

t.Store_object = True

t.Limit = 10 

twint.run.Search(t)

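# With Store_object enabled, Twint collects the tweet objects in twint.output.tweets_list
tlist = twint.output.tweets_list

print(tlist)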

Tweets from a particular user

Here we can see some recent tweets from Analytics India Magazine, along with the username and the date on which each was published.
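The search string used above relies on Twitter's from: search operator. As a rough sketch, a small helper can assemble such strings from optional parts. The name build_query and its parameters are hypothetical, and the since: and until: operators are assumed to take dates in YYYY-MM-DD form, as in Twitter's advanced search.

```python
def build_query(words=None, from_user=None, since=None, until=None):
    """Compose a Twitter search string from optional parts."""
    parts = []
    if words:
        parts.append(words)
    if from_user:
        # Drop a leading "@" so the operator reads from:username.
        parts.append("from:" + from_user.lstrip("@"))
    if since:
        parts.append("since:" + since)
    if until:
        parts.append("until:" + until)
    return " ".join(parts)

query = build_query(words="analytics", from_user="@Analyticsindiam", since="2021-01-01")
print(query)
```

The resulting string could then be assigned to t.Search in place of a hand-written query.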

These are some of the ways to scrape data from Twitter using Twint. Twint's contributors are actively working to improve it.

Conclusion:

In this article, we saw how to use Twint to extract data from Twitter. We started by scraping a user's followers and then saw how to store them in a data frame. We also saw how to extract tweets containing a particular string and tweets from a particular user. Twint is easy to use and fast, and it receives frequent updates.
