This is actually kind of amazing – some friends of mine (let’s call them ‘the musketeers’) entered a competition; they had to record a video, and it would be judged in two different competitions. The main event had formal judges and large prizes, while the side event was a “people’s choice” award, where people could go to the competition website and vote for their favourite videos. This one had some smaller gift-voucher-type prizes, and is (to us at least) much more interesting.

As you can imagine, after the voting opened, the wily musketeers figured out that some people were being a little dishonest with their voting – first, when the vote count started to significantly outnumber the page views, and second, when one video went from a few hundred votes to thousands overnight. Most of the videos were only getting a few hundred votes each, so thousands is a little suspicious, especially overnight. A single computer was supposed to get a single vote – however, the site used cookies to determine whether a vote had been cast, and cookies can be deleted easily. After figuring this out, the musketeers assumed that some of the groups were either manually deleting cookies and re-voting, or had written a little script to reload the page and click the button. So the musketeers contacted the organisers.

But the votes kept going up – so the musketeers decided on a different plan of attack. Since the competition was a bit meaningless and farcical at this point, they decided to add votes to each of the videos so that they all had the same vote count; essentially leveling the playing field. There were a few key things they needed to make this work.

1 – Make a (hopefully more elegant) voting script

This was what they started with: a script that would just sit and press the vote button. Instead of loading the page over and over, one of them figured out a way to simulate the button press, and just send it to the PHP script underlying the button on the website.

import requests as r
def vote():
    # Send the same form data the vote button would submit (XXXX stands for the video's ID)
    r.post("website_address", data={"action":"vote_for_video","video_id":XXXX})

The code looks a little like this. I have no idea how they figured this out, but it certainly works. “POST” is one of the most common HTTP methods, and is used to send little packets of information (such as form data) to a web server. Evidently, since the website’s voting mechanism wasn’t protected or anything, you could just prod it in the right spot and get a vote. The advantage of this over loading the web page is that it is significantly faster.
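To make the “little packets of information” a bit more concrete, here is a minimal sketch of what a form-style POST body looks like, using only Python’s standard library. It builds the request but never sends it. The address https://example.com/vote.php and the ID 1234 are made-up placeholders, not the real site’s values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholders: the real address and video ID are not shown in this post
form_fields = {"action": "vote_for_video", "video_id": 1234}

# Form data travels as key=value pairs joined with "&"
body = urlencode(form_fields)

# Build (but do not send) a POST request carrying that body
request = Request("https://example.com/vote.php", data=body.encode("ascii"), method="POST")

print(request.get_method())  # POST
print(body)                  # action=vote_for_video&video_id=1234
```

This is roughly the payload that requests puts together when you hand it a dictionary of form fields.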

2 – Find the vote counts and video IDs

This bit was where I came in – one of the musketeers approached me during a free period, and said “hey, wanna help me with this programming thing?” – it was either that or homework, and I know which I’d prefer to do. I went online and found a tutorial on scraping data from web pages, and used it to cobble this together.

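import requests
from lxml import html
def recordvotes():
    # Download the page and parse it into a tree of HTML elements
    page = requests.get('website_address')
    tree = html.fromstring(page.content)
    # xpath() returns a list of matches; take the first one and convert it to a number
    one = int(tree.xpath('//*[@id="text-block-8"]/div[2]/div[1]/div/div[1]/div[7]/span/text()')[0])
    return one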

The code looks something like this. After importing the libraries, it pulls down the web page (with get) and parses it into a tree of HTML elements. It then looks through the HTML for the piece it wants. I know that the “tree.xpath” line looks like gibberish, so I’ll explain what it is, and what you need to do. From what I know, an XPath is a path expression that points to specific elements of the HTML page, and tree.xpath returns a list of whatever matches (which is why the code takes element [0]) – so if you want some specific piece of data that sits on the page somewhere, but constantly changes (like the vote count), you use XPaths. To get one, you go to the web page (preferably in Chrome), highlight the element you’re interested in, right click, and click “Inspect” – the code inspector will come up on the side. Right click on the highlighted code, click “Copy”, then “Copy XPath”. Then type the tree.xpath('') part, and paste the XPath between the quotes. Sometimes you may need to put an identifier on the end to get the bit you want – for example, /text() or /@onclick. Google is your friend in this department.
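If you want to play with XPaths without hitting a real site, here is a small sketch that runs lxml against a made-up HTML fragment. The ids and class names in SAMPLE_PAGE are invented for illustration and have nothing to do with the competition page.

```python
from lxml import html

# A made-up page fragment; the ids and class names are assumptions for illustration
SAMPLE_PAGE = """
<html><body>
  <div id="video-1"><span class="votes">312</span></div>
  <div id="video-2"><span class="votes">87</span></div>
</body></html>
"""

tree = html.fromstring(SAMPLE_PAGE)

# /text() asks for the text inside the matched element; xpath() always returns a list
first_count = int(tree.xpath('//*[@id="video-1"]/span/text()')[0])

# A less specific path matches every vote counter on the page at once
all_counts = [int(t) for t in tree.xpath('//span[@class="votes"]/text()')]

print(first_count)  # 312
print(all_counts)   # [312, 87]
```

The long path copied out of Chrome works the same way; it is just far more specific about where the element sits, which also means it tends to break if the page layout changes.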

The full function pulls out all ten vote counts this way and returns them; the script then looks for the video IDs. This uses a very similar process, but instead grabs the HTML element with the video in it – it uses tree.xpath('insert_xpath/@onclick'), grabbing the web address for the video. The web address contains the ID, so we parse it to get the ID back out.
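The parsing step might look something like the sketch below. The exact shape of the onclick value isn’t shown here, so the example string, the video_id parameter name, and the extract_video_id helper are all assumptions made up for illustration.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical onclick value; the real site's format is not shown in this post
onclick = "window.open('https://example.com/watch?video_id=1234')"

def extract_video_id(onclick_value):
    """Return the video_id query parameter from the first URL in an onclick string."""
    # Grab everything from http(s):// up to the closing quote, bracket or whitespace
    match = re.search(r"https?://[^'\s)]+", onclick_value)
    if match is None:
        return None
    # Split the URL's query string into a dict of parameter lists
    query = parse_qs(urlparse(match.group(0)).query)
    ids = query.get("video_id")
    return ids[0] if ids else None

print(extract_video_id(onclick))  # 1234
```

Note that the ID comes back as a string, which is fine for sending it on as form data.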

3 – Equalizing the votes

For this, we simply check the vote count of first place, and set that as the upper limit – we bump number 2, then 3, and so on, up to that vote count.

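# Only act if first place is actually ahead of second place
if votelist[0] - votelist[1] > 0:
    for i in range(0, len(votelist)):
        # How far video i trails first place (zero for first place itself)
        dif = votelist[0] - votelist[i]
        for iterator in range(0, int(dif)):
            requests.post("website_address", data={"action":"vote_for_video","video_id":vidids[i]})
            # Print a progress line every 1000 votes
            if iterator % 1000 == 0:
                print("The video ID currently is: " + str(vidids[i]) + ', Vote Count: ' + str(votelist[i] + iterator))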
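The arithmetic behind this loop can be checked on its own, without touching any website. Here is a minimal sketch; votes_needed is a helper name invented for this example, and the counts are made up.

```python
def votes_needed(votelist):
    """Return, for each video, how many votes it trails the current leader by."""
    leader = max(votelist)
    return [leader - count for count in votelist]

# Made-up counts for illustration, ordered from first place down
example_counts = [2400, 310, 275, 120]
print(votes_needed(example_counts))  # [0, 2090, 2125, 2280]
```

First place always comes out as zero, which is why the loop above does nothing for index 0.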

So, what now?

At the moment, they’re planning on setting a computer up to run the script to equalize the votes.

Unfortunately, there is one final twist in the tale.

The musketeers’ video is now number one, and keeps rising. How did this happen, when they were trying to stop other people from cheating? From what they’ve told me, one of the musketeers mentioned the competition on some large group chat, and someone in the chat (the Erdos number rises…) decided to ‘help them out’ by building a vote spam bot to make them win. This doesn’t really help the musketeers’ case of trying to be on the moral high ground; they don’t want to be winning with all these fake votes, but someone else, somewhere else, is bumping them up, and they aren’t a hundred percent sure who it is. For now, the bot seems to have stopped; the musketeers are scrambling to bring the other vote counts up to the same level, so that the competition might regain some semblance of being fair.
