Crawling a page with Blackfire Player
Let’s look at a use case for the crawling capabilities of Blackfire Player. We’ll see how to list every Symfony version tagged in the Symfony GitHub repository.
In January, we released the new version of the Blackfire Player. Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraping application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses. This blog series gives concrete examples of how the Player features can be used.
Blackfire Player integrates with Blackfire to automate performance testing, but it also covers many other use cases that don't involve any performance management at all.
With the short scenario below, we can walk through every Symfony version tagged on GitHub, following the paginated API one page at a time, and return the results as JSON.
This:
scenario
    endpoint "http://symfony.com/"
    set releases []
    set github_tags_url "https://api.github.com/repos/symfony/symfony/tags"

    # get the latest Symfony releases
    while github_tags_url
        visit github_tags_url
            wait 200
            expect status_code() == 200
            set releases merge(releases, json('[*].name'))
            set github_tags_url regex('/<([^>]+)>; rel="next"/', header('Link'))
Executed by this:
blackfire-player run github.bkf --json
Gives this:
{
    "1": {
        "releases": [
            "vPR12",
            "vPR11",
            "vPR10",
            # not showing all values for the sake of readability in this article
            "v2.0.0BETA1",
            "v2.0.0-RC6",
            "v2.0.0-RC5",
            "v2.0.0-RC4",
            "v2.0.0-RC3",
            "v2.0.0-RC2",
            "v2.0.0-RC1"
        ],
        "github_tags_url": null
    }
}
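If the while loop in the scenario feels a bit magical, here is roughly the same logic written out in Python. This is only a sketch to illustrate the pagination pattern, not how Blackfire Player works internally. The names next_url, collect_names, and fetch are made up for this example; fetch is a hypothetical callable that you would back with your HTTP client of choice and that returns the status code, the decoded JSON body, and the Link header.

```python
import re
from typing import Any, Callable, List, Optional, Tuple

# Same pattern as the scenario's regex(): capture the URL tagged rel="next".
NEXT_LINK = re.compile(r'<([^>]+)>; rel="next"')

# Hypothetical fetcher type: URL -> (status code, decoded JSON body, Link header).
Fetch = Callable[[str], Tuple[int, List[Any], Optional[str]]]


def next_url(link_header: Optional[str]) -> Optional[str]:
    """Return the rel="next" URL from a Link header, or None on the last page."""
    if not link_header:
        return None
    match = NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def collect_names(start_url: str, fetch: Fetch) -> List[str]:
    """Follow rel="next" links from start_url and gather every tag name."""
    names: List[str] = []
    url: Optional[str] = start_url
    while url:                                   # while github_tags_url
        status, body, link = fetch(url)          # visit github_tags_url
        if status != 200:                        # expect status_code() == 200
            raise RuntimeError(f"unexpected status {status} for {url}")
        names.extend(tag["name"] for tag in body)  # merge(releases, json('[*].name'))
        url = next_url(link)                     # regex(..., header('Link'))
    return names
```

The loop stops for the same reason the scenario does: the last page has no rel="next" entry in its Link header, so the URL variable ends up empty, which is why github_tags_url is null in the output above.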