Bash - Parsing Cisco Threat Outbreak Alerts
The Cisco Threat Outbreak Alerts webpage is actually pretty good intel. It gives quick, general information about malicious e-mail campaigns being seen worldwide. Cisco IronPort is extremely widespread, so the data it sees is credible. Why not use this intel about known bad message templates in some kind of security software? This script pulls out the subject lines, but it could be modified to extract whatever information you like from the page. Note that the resulting data may contain duplicates or some garbage, so you should review the output before just plopping it into something.
#!/bin/bash

# download the index pages & parse out post urls
# (truncate with ':' so the files start empty rather than with a blank line)
: > pages
: > results.txt
for j in {1..10}
do
    wget -O index "http://tools.cisco.com/security/center/threatOutbreak.x?i=77&currentPage=$j&sortType=d&recordsPerPage=100&pageSize=100&pageNo=$j"
    sed -n 's@.*viewThreatOutbreakAlert.x?alertId=\(.*\)\" style.*@\1@p' index >> pages
done

# retrieve the individual pages based on the list of post urls
while read -r line
do
    wget -O temp "http://tools.cisco.com/security/center/viewThreatOutbreakAlert.x?alertId=$line"
    # this is pulling the subject line from each entry and storing to results.txt
    sed -n 's@<blockquote>Subject:.*<strong>\(.*\)<\/strong><br />@\1@p' temp >> results.txt
done < pages

# cleanup
rm pages
rm temp
rm index
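Since the output may contain duplicates and stray whitespace, a small post-processing pass can help before reviewing it by hand. The following is only a rough sketch: the function name `clean_subjects` and the output file `subjects_clean.txt` are made up for illustration and are not part of the script above. It trims each line, drops empty lines, and removes exact duplicates. It does not try to detect garbage entries, so a manual review is still needed.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original script).
# Reads subject lines on stdin and writes trimmed, de-duplicated,
# sorted lines to stdout.
clean_subjects() {
    # 1. strip leading and trailing whitespace
    # 2. drop lines that are now empty
    # 3. sort and remove exact duplicates
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        grep -v '^$' |
        sort -u
}
```

It could be run after the main script finishes, for example `clean_subjects < results.txt > subjects_clean.txt`. Note that `sort -u` changes the order of the lines, which should not matter for a list of subject lines.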