Crawling a page with Blackfire Player
Let’s have a look at a use case for the crawling capabilities of Blackfire Player. We’ll see how to collect every Symfony version tag from the Symfony GitHub repository.
In January, we released the new version of the Blackfire Player. Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraping application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses. This blog series gives concrete examples of how the Player features can be used.
Blackfire Player integrates with Blackfire to automate performance testing, but it also covers many other use cases that do not involve performance management at all.
With the simple DSL below, we can crawl every Symfony version tagged on GitHub and return the results as JSON.
This:
scenario
    endpoint "http://symfony.com/"

    set releases []
    set github_tags_url "https://api.github.com/repos/symfony/symfony/tags"

    # get the latest Symfony releases
    while github_tags_url
        visit github_tags_url
            wait 200
            expect status_code() == 200
            set releases merge(releases, json('[*].name'))
            set github_tags_url regex('/<([^>]+)>; rel="next"/', header('Link'))
Executed by this:
blackfire-player run github.bkf --json
Gives this:
{
    "1": {
        "releases": [
            "vPR12",
            "vPR11",
            "vPR10",
            # not showing all values for the sake of readability in this article
            "v2.0.0BETA1",
            "v2.0.0-RC6",
            "v2.0.0-RC5",
            "v2.0.0-RC4",
            "v2.0.0-RC3",
            "v2.0.0-RC2",
            "v2.0.0-RC1"
        ],
        "github_tags_url": null
    }
}
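For readers who want to see what the loop in the scenario is doing, here is a rough Python sketch of the same logic. It is not how Blackfire Player is implemented, only an illustration of the pagination pattern: collect the tag names from each page, then read the rel="next" URL from the Link header until there is none left. The `fetch` callable, the `FAKE_PAGES` data, and the `api.example.test` URLs are all made up for the example so that it runs without touching the network.

```python
import re

# Same pattern as the scenario's regex(): capture the URL tagged rel="next".
NEXT_LINK = re.compile(r'<([^>]+)>; rel="next"')


def next_page_url(link_header):
    """Return the rel="next" URL from a Link header, or None when absent."""
    if not link_header:
        return None
    match = NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def collect_tag_names(first_url, fetch):
    """Follow rel="next" links from first_url and gather every tag name.

    `fetch` is a hypothetical callable: url -> (parsed JSON body, Link header).
    """
    releases = []
    url = first_url
    while url:  # mirrors `while github_tags_url`
        body, link_header = fetch(url)
        releases.extend(tag["name"] for tag in body)  # mirrors json('[*].name')
        url = next_page_url(link_header)  # None on the last page ends the loop
    return releases


# Two fake pages standing in for the GitHub API (illustrative data only).
FAKE_PAGES = {
    "https://api.example.test/tags": (
        [{"name": "v2.0.0-RC2"}],
        '<https://api.example.test/tags?page=2>; rel="next", '
        '<https://api.example.test/tags?page=2>; rel="last"',
    ),
    "https://api.example.test/tags?page=2": (
        [{"name": "v2.0.0-RC1"}],
        '<https://api.example.test/tags?page=1>; rel="prev", '
        '<https://api.example.test/tags?page=1>; rel="first"',
    ),
}

tags = collect_tag_names("https://api.example.test/tags", FAKE_PAGES.__getitem__)
print(tags)  # ['v2.0.0-RC2', 'v2.0.0-RC1']
```

The last page carries no rel="next" entry, so the lookup returns nothing and the loop stops. That matches the `"github_tags_url": null` value in the output above.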