64BIT - The Collection

Bash - Parsing CISCO Threat Outbreak Alerts

The Cisco Threat Outbreak Alerts webpage is actually pretty good intel. It gives quick, general information about malicious e-mail campaigns being seen worldwide. Cisco IronPort appliances are extremely widespread, so the data they see is credible. Why not use this intel about known bad message templates in some kind of security software? This script pulls out the subject lines, but it could be modified to extract whatever information you like from the page. Note that the resulting data may contain duplicates or some garbage, so you should review the output before just plopping it into something.

#!/bin/bash
# download the index pages and parse out the alert IDs
# start with truly empty files (echo "" would leave a blank first line)
: > pages
: > results.txt
for j in {1..10}
do
wget -O index "http://tools.cisco.com/security/center/threatOutbreak.x?i=77&currentPage=$j&sortType=d&recordsPerPage=100&pageSize=100&pageNo=$j"
sed -n 's@.*viewThreatOutbreakAlert.x?alertId=\(.*\)\" style.*@\1@p' index >> pages
done

# retrieve each alert page using the list of alert IDs
while read -r line
do
# skip blank lines so wget is never called with an empty alertId
[ -z "$line" ] && continue
# quote the URL so the shell does not treat ? as a glob character
wget -O temp "http://tools.cisco.com/security/center/viewThreatOutbreakAlert.x?alertId=$line"

# pull the subject line from each alert and append it to results.txt
sed -n 's@<blockquote>Subject:.*<strong>\(.*\)<\/strong><br />@\1@p' temp >> results.txt
done < pages

# cleanup (-f keeps rm quiet if a file was never created)
rm -f pages temp index
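
Since the output may contain duplicates or stray markup, a small post-processing pass can do some of the review work before a human looks at it. The following is only a minimal sketch, assuming results.txt holds one subject per line as produced by the script above. The function name clean_subjects and the output file results.clean.txt are hypothetical, not part of the original script.

```shell
#!/bin/bash
# clean_subjects (hypothetical helper): reads subject lines on stdin,
# strips leftover HTML tags and surrounding whitespace, drops empty
# lines, then sorts and removes exact duplicates.
clean_subjects() {
  sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -v '^$' \
    | sort -u
}

# Only run when the scraper has actually produced results.txt.
# results.clean.txt is an assumed output name.
if [ -f results.txt ]; then
  clean_subjects < results.txt > results.clean.txt
fi
```

This only removes exact duplicates and obvious tag residue. Near-duplicates, such as subjects that differ by a tracking number, still need a manual look.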