    Scraping Google Search With Python and Beautifulsoup4

    Learn how to scrape Google and optimize searches using Google search operators.

    Introduction

    Web scraping is a technique for extracting usable data from websites for a specific purpose. Websites contain a large amount of data, and web scraping is one way to collect it.

    In this post, we will scrape URLs from Google search results and learn how to optimize searches using Google search operators.

    Image: meme. Source: Automating Social Media Contests with Web Scraping | by Jason Yip | Towards Data Science

    Getting Started

    Clone this repository by executing the following command:

    $ git clone https://github.com/jagadyudha/google-scraper
    $ cd google-scraper
    

    Install the required libraries by executing the following command:

    # Python3
    $ pip3 install -r requirements.txt
    
    # or
    
    # Python2
    $ pip install -r requirements.txt
    

    Before running the project, let's look at how it works:

    1. Build the Google search URL
    2. Fetch the HTML from that URL
    3. Find the div tags with class g
    4. Find the a tag inside each div found in step 3
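
The steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names `build_search_url` and `extract_links` and the `SAMPLE_HTML` string are hypothetical. Step 2 (fetching) is left out so the sketch runs offline. Google's markup changes over time, so the class g selector may need adjusting.

```python
from typing import List
from urllib.parse import quote_plus

from bs4 import BeautifulSoup


def build_search_url(query: str) -> str:
    """Step 1: build the Google search URL for a query (hypothetical helper)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def extract_links(html: str) -> List[str]:
    """Steps 3 and 4: find each div with class g, then the first a tag inside it."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for block in soup.find_all("div", class_="g"):
        anchor = block.find("a", href=True)
        if anchor is not None:
            links.append(anchor["href"])
    return links


# Hand-written HTML standing in for a downloaded results page (step 2).
SAMPLE_HTML = """
<div class="g"><a href="https://example.com/a">First</a></div>
<div class="g"><span>No link here</span></div>
<div class="other"><a href="https://example.com/skip">Skipped</a></div>
<div class="g"><a href="https://example.com/b">Second</a></div>
"""

if __name__ == "__main__":
    print(build_search_url("learning Python"))
    print(extract_links(SAMPLE_HTML))
```

Only divs whose class list contains g contribute a link, and a div without an a tag is skipped instead of raising an error.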

    How to Run the Project

    Running the code is simple. Execute the following command:

    # Python3
    $ python3 main.py
    
    # or
    
    # Python2
    $ python main.py
    

    After executing the command above, two prompts are displayed: input pages and input data. Input pages is the number of Google result pages you want to scrape, and input data is the keyword you want to search for.
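
As a rough sketch of how those two inputs could turn into requests: Google paginates results with a `start` query parameter. The sketch assumes the default of 10 results per page. The `page_urls` helper and the `RESULTS_PER_PAGE` constant are hypothetical names, not taken from the repository.

```python
from urllib.parse import urlencode

RESULTS_PER_PAGE = 10  # assumption: Google's default page size


def page_urls(keyword, pages):
    """Return one search URL per result page for the given keyword."""
    urls = []
    for page in range(pages):
        # Page 0 starts at result 0, page 1 at result 10, and so on.
        params = {"q": keyword, "start": page * RESULTS_PER_PAGE}
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls


if __name__ == "__main__":
    # Fixed values stand in for the two prompts (input pages, input data).
    for url in page_urls("learning Python", 3):
        print(url)
```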

    Optimize Search With Google Search Operators

    Google search operators are special keywords that narrow a search to specific information, giving accurate results even when the information is hard to track down.

    We can also use Google search operators in this project. For example, let's find PDF files with the keyword learning Python.

    To do that, search with the keyword filetype:pdf intext:learning Python
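
If you build such keywords often, a small helper can assemble them. The sketch below is illustrative only: `compose_query` is a hypothetical function, and it covers just the `filetype:`, `intext:`, and `site:` operators. `quote_plus` shows how the keyword looks once it is URL-encoded for the search URL.

```python
from urllib.parse import quote_plus


def compose_query(keywords, filetype=None, intext=None, site=None):
    """Join optional search operators and plain keywords into one query string."""
    parts = []
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intext:
        parts.append(f"intext:{intext}")
    if site:
        parts.append(f"site:{site}")
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


if __name__ == "__main__":
    query = compose_query("Python", filetype="pdf", intext="learning")
    print(query)              # the keyword typed at the input data prompt
    print(quote_plus(query))  # the same keyword, URL-encoded
```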

    Unfortunately, using Google search operators too often can trigger CAPTCHAs.
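
This does not solve the CAPTCHA, but a scraper can at least notice when it has been blocked and slow down. The sketch below makes several assumptions that are not part of this project: a block shows up as HTTP status 429, a redirect to a URL containing `/sorry/`, or a page mentioning "unusual traffic". The `looks_blocked` and `backoff_delays` helpers are hypothetical.

```python
def looks_blocked(status_code, final_url, body):
    """Guess whether a response is a CAPTCHA or block page (assumed signals)."""
    if status_code == 429:
        return True
    if "/sorry/" in final_url:
        return True
    return "unusual traffic" in body.lower()


def backoff_delays(attempts, base=5.0, cap=300.0):
    """Exponential backoff: base, 2*base, 4*base, ... seconds, capped at cap."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


if __name__ == "__main__":
    print(looks_blocked(200, "https://www.google.com/search?q=test", "<html></html>"))
    print(looks_blocked(429, "https://www.google.com/sorry/index", ""))
    print(backoff_delays(4))
```

A scraping loop could call `looks_blocked` after each request and, when it returns True, sleep for the next delay before retrying or stop altogether.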

    Google Search Operators Cheat Sheet

    The example above shows only one way to use Google search operators. For more details, check out the following cheat sheet: Google Search Operators Cheat Sheet (notion.site)

    Conclusion

    With this tool, we can scrape URLs automatically instead of copying them one by one. However, frequent use still triggers CAPTCHAs, and that problem remains unsolved.

