Tuesday, May 14, 2013

High speed data scraping example

The situation: I wanted a list of the playable champions' names to print out and discuss.
So I went to http://na.leagueoflegends.com/champions and ran jQuery to get the names: $('td.description span a[href]').text()

This produced all the names run together. Perhaps there's a more straightforward way to aggregate them in a short single line, but I was in a hurry.
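
The names run together because .text() concatenates the text of every matched element. One possible shorter route, as a minimal sketch: collect each element's text separately and join with a separator. The helper name joinElementText is hypothetical, and the selector in the comment is the one from the post, assumed to still match the page.

```javascript
// Hypothetical helper: trims the text of each element, drops empty
// entries, and joins the rest with the given separator.
function joinElementText(elements, separator) {
  return Array.prototype.map.call(elements, function (el) {
    return (el.textContent || '').trim();
  }).filter(function (s) {
    return s.length > 0;
  }).join(separator);
}

// In the browser console (assumes the selector from the post still matches):
// joinElementText(jQuery('td.description span a[href]').get(), ', ');
```

This only needs objects with a textContent property, so it works on the plain DOM elements that .get() returns.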

Open LINQPad (mine includes the Inflector NuGet package), paste in the string, and add .Titleize() on the end. Job done.
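
For staying in the browser console instead, a rough JavaScript approximation of that step might look like the sketch below. It is not Inflector's implementation; it simply assumes each name starts with a capital letter and inserts a space wherever a lowercase letter is followed by an uppercase one. The function name and sample input are illustrative, and names with internal capitals or punctuation would need extra handling.

```javascript
// Hypothetical helper: puts a space at every lowercase-to-uppercase
// boundary. This is an approximation, not what Titleize() does internally.
function splitRunTogether(text) {
  return text.replace(/([a-z])([A-Z])/g, '$1 $2');
}

console.log(splitRunTogether('AhriAkaliAlistar')); // prints "Ahri Akali Alistar"
```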

Next up: Marvel Heroes, at https://marvelheroes.com/game-info/game-guide/heroes/grid

jQuery is loaded here, but not as $.

jQuery('div.field-content a[href*="/heroes"]').map(function(i,e){return jQuery(e).attr('href').split('/')[2];})

In this case I did more of the work in jQuery, so the Inflector wasn't needed.
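
The split('/')[2] step depends on where "heroes" sits in each href. A slightly more defensive sketch takes whichever path segment follows "heroes". The helper name heroSlug and the sample hrefs are hypothetical; the real link format on the page is an assumption.

```javascript
// Hypothetical helper: returns the path segment right after "heroes",
// or null when the href has no such segment.
function heroSlug(href) {
  var parts = href.split('/');
  var idx = parts.indexOf('heroes');
  return idx >= 0 && idx + 1 < parts.length ? parts[idx + 1] : null;
}

// Sample hrefs are made up for illustration.
console.log(heroSlug('/heroes/iron-man'));                  // prints "iron-man"
console.log(heroSlug('https://marvelheroes.com/heroes/hulk')); // prints "hulk"
```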

Next, Steam games on sale.

Click the Specials tab: jQuery('div.tabbar div[onclick]:contains(Specials)').click()

Enter the unspeakable, eval, as the first working solution I found for loading more of the items: eval(jQuery('#tab_Discounts_next a[href]').attr('href'))

Make it get all the pages:

var getNext=jQuery('#tab_Discounts_next a[href]').attr('href');
var times=Math.ceil(getNext.split(':')[1].split(', ')[2]/10);
eval(getNext);
for(var i=1;i<times-1;i++){setTimeout(function(){console.log('getting more'); eval(getNext);},1500*i);}

Get all the names:

jQuery('#tab_Discounts_items div.tab_desc.with_discount h4').map(function(i,e){return jQuery(e).text();})

Done.
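
The paging arithmetic above can be pulled apart into two small pieces, sketched here with no jQuery or eval so they can be tried on their own. Everything named here is hypothetical: pageCount, scheduleLoads, and especially the sample href, whose PageTab(...) call is invented to stand in for whatever the real "next" link contains. The only assumption carried over from the post is that the third argument of that call is the total item count and that a page holds 10 items.

```javascript
// Hypothetical helper: reads the third argument of a "javascript:fn(a, b, total, ...)"
// href and converts it to a page count. Returns null if it cannot be parsed.
function pageCount(nextHref, pageSize) {
  var call = nextHref.split(':')[1] || '';
  var total = parseInt(call.split(',')[2], 10);
  return isNaN(total) ? null : Math.ceil(total / pageSize);
}

// Hypothetical helper: runs loadNext `count` times, spaced delayMs apart.
// `schedule` defaults to setTimeout and can be swapped out for testing.
function scheduleLoads(count, delayMs, loadNext, schedule) {
  schedule = schedule || setTimeout;
  for (var i = 1; i <= count; i++) {
    schedule(loadNext, delayMs * i);
  }
}

// Made-up href for illustration only.
var pages = pageCount("javascript:PageTab('Discounts', 10, 57, {});", 10);
console.log(pages); // prints 6
```

With the first page already on screen, pages - 1 further loads would be needed, for example scheduleLoads(pages - 1, 1500, loadNext), where loadNext is whatever triggers the next page.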