Webscrape

A Web 'Screen Scraper'

To Webscrape Some Random Numbers from random.org

Quality random numbers are notoriously tricky to generate. If you are after numbers that are well behaved yet full of entropy, why not let the experts generate them for you and then simply webscrape them? The hardworking server at random.org allows us to do just that: it returns lists of random numbers in a format that is very easy to screen scrape.

The following web page allows the user to specify various parameters which control how the numbers are generated:

    http://www.random.org/nform.html

Entering some suitable values into this form and hitting the "Get Numbers" button results in a GET request like the following being submitted:

    http://www.random.org/cgi-bin/randnum?num=1&min=0&max=100&col=1

This returns one random number between 0 and 100. The data is returned in a simple text format which contains just the number followed by an end-of-line marker, so we don't even have to scrape the number out from under nasty HTML.

The random number can be scraped by running the following:

    pscrape -u"www.random.org/cgi-bin/randnum?num=1&min=0&max=100&col=1" -e"(\d+)$"

Running this should return a random number like the following:

    42

The regular expression (\d+)$ reads as one or more digits [0-9] followed by an end-of-line marker; $ represents the end-of-line marker.

If we want to retrieve a list of 100 random numbers, we just need to update the num parameter and use PageScrape's -m option like this:

    pscrape -u"www.random.org/cgi-bin/randnum?num=100&min=0&max=100&col=1" -e"(\d+)$" -m

This should return one hundred random numbers. It is possible to write these to a file by using PageScrape's -o parameter:

    pscrape -u"www.random.org/cgi-bin/randnum?num=100&min=0&max=100&col=1" -e"(\d+)$" -o"nums.txt" -m

This should result in 100 random numbers between 0 and 100 being written to nums.txt.
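
For readers who want to see how the query string and the (\d+)$ expression fit together outside of PageScrape, here is a minimal Python sketch. It is not part of PageScrape and makes no claim about how pscrape works internally. The helper names build_url and scrape_numbers are made up for this illustration, and the sample text is a hand-written stand-in for the plain-text response described above, so nothing is fetched from the network.

```python
import re
from urllib.parse import urlencode

# Endpoint taken from the article's example URL.
BASE_URL = "http://www.random.org/cgi-bin/randnum"

# Same pattern as the -e argument above. re.MULTILINE is a Python regex
# flag that lets $ match at the end of every line, not only at the end
# of the whole text. It is used here as an assumption about what is
# needed to collect one number per line; it is not PageScrape's -m.
NUMBER_AT_EOL = re.compile(r"(\d+)$", re.MULTILINE)


def build_url(num, lowest, highest, col=1):
    """Build the GET URL with the num, min, max and col parameters."""
    query = urlencode({"num": num, "min": lowest, "max": highest, "col": col})
    return f"{BASE_URL}?{query}"


def scrape_numbers(text):
    """Return every run of digits that sits at the end of a line."""
    return [int(match) for match in NUMBER_AT_EOL.findall(text)]


if __name__ == "__main__":
    # Hand-written sample standing in for a three-number response.
    sample_response = "42\n7\n100\n"
    print(build_url(100, 0, 100))
    print(scrape_numbers(sample_response))  # prints [42, 7, 100]
```

Without the re.MULTILINE flag, Python would only match the digits at the very end of the text, so the sample above would yield just the last number. To use the sketch against a live server, the response body would need to be downloaded first and passed to scrape_numbers as a string.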