Today I had a 13 MB WSDL file and needed to figure out which HTTP(S) calls the service was making. The sheer size of the file made that difficult. Grep to the rescue!
First, I pulled out all of the HTTP(S) calls using a regular expression:
grep -o -e 'https\?://[^[:space:]"]*' file
(Quoting the pattern keeps the shell from eating the special characters, and the \? makes the "s" in https optional.)
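If you want a quick sanity check on how many matches come back before digging into the output, you can pipe through wc -l (same command as above, with "file" standing in for your input):
grep -o -e 'https\?://[^[:space:]"]*' file | wc -l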
That alone brought the 13 MB file down to about 3 MB. But looking at the output, there were a ton of repeats. So how do I limit the results to unique URLs?
grep -o -e 'https\?://[^[:space:]"]*' file | sort | uniq > portal_url_calls.txt
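As an aside, sort -u does the deduplication in one step, so this should produce the same file:
grep -o -e 'https\?://[^[:space:]"]*' file | sort -u > portal_url_calls.txt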
After running this, the file was down to 41 KB and had all of the unique URLs, sorted. Awesome!
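And if you also want to see which URLs show up the most, uniq -c prefixes each line with its count, and a numeric reverse sort floats the heaviest hitters to the top:
grep -o -e 'https\?://[^[:space:]"]*' file | sort | uniq -c | sort -rn | head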
I hope this helps someone else down the road!