Utilizing the Scratch API to download comments from a Scratch Studio


Studios in Scratch are collections of projects centered around a specific “theme”.

If we want to analyze comments from a large Scratch studio, the only way currently built into Scratch is to click the “load more” button many times and then search the top-level comments with CTRL+F. This is inefficient when you want to analyze the whole data set or simply search all the comments for specific keywords. Although the API is considered deprecated, it can still be used to access the comments in HTML format. To access a specific page of comments, use the following URL, replacing InsertPageNumberHere with the page number.

https://scratch.mit.edu/site-api/comments/gallery/146521/?page=InsertPageNumberHere
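
As a small sketch, the URL can be assembled from a studio number and a page number. The helper name `comments_page_url` is made up for illustration; only the URL pattern itself comes from this post.

```python
def comments_page_url(studio_id, page):
    """Build the comments URL for one page of a studio (URL pattern taken from the post)."""
    return "https://scratch.mit.edu/site-api/comments/gallery/{}/?page={}".format(studio_id, page)


# Print the URL for the first page of the studio used in this post.
print(comments_page_url(146521, 1))
```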

The following Python 3 program collects all the comments from a given studio and writes them, in the order they are downloaded, to a CSV file. This format can be opened in most spreadsheet programs. You will need to change the link in the requests.get() call so that it contains the number associated with your studio.

How to find the number associated with a specific studio
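
One way to get the number programmatically is sketched below. It assumes the studio page address has the form https://scratch.mit.edu/studios/NUMBER/, which is an assumption about the site and not something the script in this post relies on. The function name `studio_id_from_url` is hypothetical.

```python
import re


def studio_id_from_url(url):
    """Return the studio number found in a studio page URL, or None if there is none.

    Assumes the address looks like https://scratch.mit.edu/studios/<number>/.
    """
    match = re.search(r"/studios/(\d+)", url)
    return int(match.group(1)) if match else None


print(studio_id_from_url("https://scratch.mit.edu/studios/146521/"))
```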

        
There is also a code segment labelled “Analysis Portion”, which produces some high-level information about the studio, as shown in the spreadsheet below. The output is unsorted, so you will need to use the sorting tools in your spreadsheet program, most likely Excel or Google Sheets.
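
If you would rather sort in Python than in a spreadsheet, the following is a minimal sketch using only the standard library. The function name `sort_analysis` and the sample rows are made up for illustration; the column name matches the one written by the analysis code.

```python
import csv
import io


def sort_analysis(csv_text, column, descending=True):
    """Return the rows of an analysis CSV sorted by a numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: float(row[column]), reverse=descending)


# Made-up sample rows using one of the analysis file's column names.
sample = "user,Number of Comments\nalice,3\nbob,12\ncarol,7\n"
for row in sort_analysis(sample, "Number of Comments"):
    print(row["user"], row["Number of Comments"])
```

To sort the real output, read the file's text first, for example with `open("studioAnalysis.csv", newline="", encoding="utf-8").read()`, and pass it in as `csv_text`.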



Example of the top-level analysis for the S.D.S. Studio "Growing"
Enjoy utilizing the program for your interests. Please note that it may not work for some exceptionally large studios due to Scratch's built-in rate limit. 
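
One possible way to be gentler on the rate limit is to pause and retry when a request is refused. The sketch below is not part of the original program. It assumes a refused request comes back with status code 429, which is the usual HTTP convention but has not been confirmed for Scratch. The name `fetch_with_retries` and its parameters are hypothetical.

```python
import time


def fetch_with_retries(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until the status code is not 429, doubling the wait each time.

    fetch is any zero-argument callable that returns an object with a
    status_code attribute, for example: lambda: requests.get(url).
    Assumption: a rate-limited request is reported as status 429.
    """
    response = None
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 429:
            break
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, then 4x, ... before trying again.
            sleep(base_delay * 2 ** attempt)
    return response
```

Inside the download loop this could replace the direct call, for example `html_doc = fetch_with_retries(lambda: requests.get(url))`, where `url` is the page address.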


"""
Purpose: Track user contributions within the specific studios

Created by makethebrainhappy
"""

import requests
from bs4 import BeautifulSoup
import pandas as pd
import collections

def main():
    #Data Collection Portion
    users = []
    comments = []
    timestamps = []
    pages = 1
    while True:
        html_doc = requests.get("https://scratch.mit.edu/site-api/comments/gallery/146521/?page="+str(pages))
        if html_doc.status_code == 404:
            #A 404 means we have gone past the last page of comments
            print('Not Found.')
            break
        elif html_doc.status_code != 200:
            #Stop on any other status; otherwise the loop would never end
            print('Stopping, unexpected status code:', html_doc.status_code)
            break
        print('Success!')
        soup = BeautifulSoup(html_doc.content, 'html.parser')
        #print(soup.prettify())
        #Read user, text and timestamp in one pass so the three lists stay aligned
        for com in soup.find_all("div", class_="comment"):
            users.append(com.select("div.name a")[0].string)
            comments.append(com.select("div.content")[0].get_text(" ",strip=True))
            timestamps.append(com.select("span.time")[0].get_text(" ",strip=True))
        pages = pages + 1
    d = {"user":users,"comment":comments,"timestamp":timestamps}
    df = pd.DataFrame(data=d)
    df.to_csv("welcomingCommitteeComments.csv",encoding="utf-8")
    
    #Analysis Portion
    newUsers = collections.Counter(users) #number of comments per user
    lenComments = []
    for i in comments:
        lenComments.append(len(i))
    newDict = {} #total characters written per user
    for j in range(0,len(lenComments)):
        newDict[users[j]] = 0
    for j in range(0,len(lenComments)):
        newDict[users[j]] = newDict[users[j]]+lenComments[j]
    #Sort both by user name so the two sets of values line up
    newUsers = collections.OrderedDict(sorted(newUsers.items()))
    newDict = collections.OrderedDict(sorted(newDict.items()))
    #In Python 3, values() cannot be indexed, so convert to lists first
    counts = list(newUsers.values())
    totals = list(newDict.values())
    avg = []
    for j in range(0,len(counts)):
        avg.append(totals[j]/counts[j])
    d = {"user":list(newUsers.keys()),"Number of Comments":counts,"Total Characters in Comments":totals,"Average Characters per Comment":avg}
    df = pd.DataFrame(data=d)
    df.to_csv("studioAnalysis.csv")

main()
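
The per-user statistics from the analysis portion can also be computed in a single pass. This is an alternative sketch, not the code the program above runs. The function name `summarize` is made up, and the example data is invented.

```python
from collections import defaultdict


def summarize(users, comments):
    """Return {user: (comment_count, total_characters, average_characters)}.

    users and comments are parallel lists, as built by the data collection loop.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for user, comment in zip(users, comments):
        counts[user] += 1
        totals[user] += len(comment)
    return {user: (counts[user], totals[user], totals[user] / counts[user])
            for user in sorted(counts)}


# Invented example: two comments by "ann" and one by "bob".
print(summarize(["ann", "bob", "ann"], ["hi", "hello", "bye!"]))
```

Keeping the count and the total in dictionaries keyed by user avoids relying on two separately sorted collections staying in the same order.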

Credit to apple502j for helping me with BeautifulSoup.

1 comment:

  1. In the age of digital technology, the importance of utilizing APIs to create custom solutions and powerful applications is becoming increasingly evident. The Scratch API is no exception. With the Scratch API, developers can build projects that use the Scratch programming language and the Scratch graphical interface.

    One of the most useful applications of the Scratch API is downloading comments from a Scratch Studio. For those unfamiliar with Scratch, a Scratch Studio is a project hub where users can find, share, and collaborate on projects. Studios are typically organized into various categories, such as game design, art, music, etc.

    Using the Scratch API, developers can easily download comments from a Scratch Studio and analyze them for various purposes. This can be useful for getting insights into user behavior, such as popularity of a particular project or the types of comments people are leaving. It can also help developers better understand the user base of their own projects.

    The first step in downloading comments from a Scratch Studio is to open the Scratch API page. Here, you can enter the Studio ID number and get an access token. Once you have the token, you can use it to make a GET request to the Scratch API, which will return the comments associated with the Studio.

    The comments returned by the Scratch API are in a JSON format. This makes it easy to parse the data and use it in whatever way you see fit. For example, you could use the comments to create a sentiment analysis tool to get insights into how people feel about a particular project, or you could use them to generate statistics on the most popular projects in a particular category.

    In conclusion, the Scratch API is a powerful tool that makes it easy to download comments from a Scratch Studio. This can be used in a variety of ways, from sentiment analysis to generating statistics. With the Scratch API, developers can gain valuable insights into user behavior, helping them create better projects and build better user experiences.
